Word Vector Evaluation

Step 1: Filter Your Vectors


Your vectors should be in plain text, one word per line, with each word followed by its space-separated vector components:
    london 0.102 -4.31 -0.003 ...
    paris -1.23 3.450 -0.03 ...
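
As a quick sanity check on the format above, a line can be split into a word and a list of floats. This is only a minimal sketch: the function name `parse_vector_line` is hypothetical, and it assumes fields are separated by whitespace.

```python
def parse_vector_line(line):
    """Split one line into (word, [float, ...]).

    Assumes whitespace-separated fields: the word first, then its components.
    """
    fields = line.split()
    word = fields[0]
    values = [float(x) for x in fields[1:]]
    return word, values


# Example line in the same shape as the samples above.
word, vec = parse_vector_line("london 0.102 -4.31 -0.003")
print(word, len(vec))
```

A line that raises `ValueError` here contains a non-numeric component and would probably need fixing before upload.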
    
Download the following vocabulary file and script and run:
    python filterVocab.py fullVocab.txt < yourVectors(.txt/.txt.gz) > filtVectors.txt
    
Now you can upload the filtered vectors!
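
The filtering step can be pictured roughly as follows. This is a sketch of the idea, not the contents of filterVocab.py: it assumes the vocabulary is a set of words and that a vector line is kept only when its first token is in that set. The name `filter_vectors` and the toy data are hypothetical.

```python
import io


def filter_vectors(lines, vocab):
    """Yield only the lines whose first token appears in vocab."""
    for line in lines:
        fields = line.split(None, 1)
        if fields and fields[0] in vocab:
            yield line


# Toy vocabulary and vectors, for illustration only.
vocab = {"london", "paris"}
vectors = io.StringIO("london 0.1 0.2\nzebra 0.3 0.4\nparis 0.5 0.6\n")

kept = list(filter_vectors(vectors, vocab))
print("".join(kept), end="")
```

Dropping words outside the evaluation vocabulary keeps the uploaded file small without changing the vectors that remain.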

Step 2: Upload Filtered Vectors


(in .txt or .txt.gz format)

Step 3: Plot Your Words

Word Pair Similarity Ranking


No.  Task Name    Word pairs  Reference                         Pairs found  Correlation
1    WS-353       353         Finkelstein et al., 2002
2    WS-353-SIM   203         Agirre et al., 2009
3    WS-353-REL   252         Agirre et al., 2009
4    MC-30        30          Miller and Charles, 1991
5    RG-65        65          Rubenstein and Goodenough, 1965
6    Rare-Word    2034        Luong et al., 2013
7    MEN          3000        Bruni et al., 2012
8    MTurk-287    287         Radinsky et al., 2011
9    MTurk-771    771         Halawi and Dror, 2012
10   YP-130       130         Yang and Powers, 2006
11   SimLex-999   999         Hill et al., 2014
12   Verb-143     143         Baker et al., 2014
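
One common way to score a task like these is to compute the cosine similarity of each word pair, then correlate those scores with the human ratings. The sketch below assumes Spearman's rank correlation and assumes that pairs with a missing word are skipped, which is one reading of the "Pairs found" column. The function names and the toy data are hypothetical, and the site's own computation may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def evaluate(pairs, vectors):
    """Return (pairs_found, correlation) for (word1, word2, human_score) triples."""
    model, human = [], []
    for w1, w2, score in pairs:
        if w1 in vectors and w2 in vectors:  # skip pairs with a missing word
            model.append(cosine(vectors[w1], vectors[w2]))
            human.append(score)
    rho = spearman(model, human) if len(model) > 1 else float("nan")
    return len(model), rho


# Toy vectors and ratings, invented purely for illustration.
toy_vectors = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "car": [0.0, 1.0]}
toy_pairs = [("cat", "dog", 9.0), ("cat", "car", 2.0),
             ("dog", "car", 3.0), ("cat", "unicorn", 1.0)]
found, rho = evaluate(toy_pairs, toy_vectors)
print(found, round(rho, 3))
```

Rank correlation depends only on the ordering of the scores, so the scale of the human ratings does not need to match the range of cosine similarity.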

Default Word Plots

t-SNE tool (van der Maaten and Hinton, 2008)

Antonyms and Synonyms


Male and Female

Reference


If you use this website, please cite the following paper:
@InProceedings{faruqui-2014:SystemDemo,
  author    = {Faruqui, Manaal  and  Dyer, Chris},
  title     = {Community Evaluation and Exchange of Word Vectors at wordvectors.org},
  booktitle = {Proceedings of the 52nd Annual Meeting of the 
               Association for Computational Linguistics: System Demonstrations},
  month     = {June},
  year      = {2014},
  address   = {Baltimore, USA},
  publisher = {Association for Computational Linguistics}
}