Hard dataset experiment

This is the hardest of the three datasets built upon Taobao collection data and local SEA e-commerce data. Large differences are expected in image style, product distribution, and watermarks. The objective is to bring out the differences between the following three models as clearly as possible: the baseline Siamese network, Vanilla MoCo, and Enhanced MoCo (our version).

Siamese vs. Vanilla MoCo vs. Enhanced MoCo

Two metrics matter most for the actual downstream applications. Coverage reflects the total number of identical items that we can capture through vision. Precision reflects how many of these recalled items are true positive matches.
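To make the two metrics concrete, here is a minimal sketch of how they could be computed from a model's match output. The exact definitions used in this experiment are not spelled out, so the sketch assumes that coverage is the share of query items for which the model returns a match, and that precision is the share of those returned matches that agree with the labeled identical item. The function name `coverage_and_precision` and the dictionary layout are hypothetical, chosen only for illustration.

```python
from typing import Dict, Optional, Tuple


def coverage_and_precision(
    predictions: Dict[str, Optional[str]],
    ground_truth: Dict[str, str],
) -> Tuple[float, float]:
    """Compute (coverage, precision) under the assumed definitions.

    predictions:  query item id -> matched item id, or None when the model
                  returns no match (hypothetical layout).
    ground_truth: query item id -> id of the truly identical item.
    """
    total = len(predictions)

    # Recalled items: queries for which the model returned some match.
    recalled = {q: m for q, m in predictions.items() if m is not None}

    # True positives: returned matches that equal the labeled identical item.
    true_positives = sum(
        1 for q, m in recalled.items() if ground_truth.get(q) == m
    )

    # Assumed definition: coverage = recalled queries / all queries.
    coverage = len(recalled) / total if total else 0.0
    # Assumed definition: precision = true positives / recalled queries.
    precision = true_positives / len(recalled) if recalled else 0.0
    return coverage, precision


# Toy example with made-up ids, not experiment data.
example_predictions = {"a": "x", "b": "y", "c": None, "d": "z"}
example_ground_truth = {"a": "x", "b": "w", "c": "v", "d": "z"}
example_coverage, example_precision = coverage_and_precision(
    example_predictions, example_ground_truth
)
```

In the toy example, three of the four queries receive a match and two of those three are correct, which illustrates why the two numbers have to be read together: a model can raise coverage simply by returning more matches, at the cost of precision.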

  • The baseline model performs poorly on the hard dataset, mainly due to visual disturbances.
  • Vanilla MoCo shows a large improvement over the baseline. It is a good choice because it requires no additional fine-tuning.
  • Enhanced MoCo demonstrates the best performance in both coverage and precision.