Thanks very much for all your suggestions. When I run the code from my earlier
post, what puzzles me is that as the dictionary grows, some entries go missing
from the sorted result. The pairs include 'journey':'voyage', but the sorted
result has no ('journey-voyage', 0.25), which did appear in my first post, a
small-scale experiment. I am quite puzzled...
>>> pairs = {'car':'automobile', 'gem':'jewel',
...     'journey':'voyage','boy':'lad','coast':'shore', 'asylum':'madhouse',
...     'magician':'wizard', 'midday':'noon', 'furnace':'stove', 'food':'fruit',
...     'bird':'cock', 'bird':'crane', 'tool':'implement', 'brother':'monk',
...     'lad':'brother', 'crane':'implement', 'journey':'car', 'monk':'oracle',
...     'cemetery':'woodland', 'food':'rooster', 'coast':'hill',
...     'forest':'graveyard', 'shore':'woodland', 'monk':'slave',
...     'coast':'forest','lad':'wizard', 'chord':'smile', 'glass':'magician',
...     'rooster':'voyage', 'noon':'string'}
>>> from nltk.corpus import wordnet as wn
>>> list_simi=[]
>>> for key in pairs:
...     word1 = wn.synset(str(key) + '.n.01')
...     word2 = wn.synset(str(pairs[key])+'.n.01')
...     similarity = word1.path_similarity(word2)
...     list_simi.append((key+'-'+pairs[key],similarity))
...
>>> from operator import itemgetter
>>> sorted(list_simi, key=itemgetter(1), reverse=True)
[('midday-noon', 1.0), ('car-automobile', 1.0), ('tool-implement', 0.5),
('boy-lad', 0.3333333333333333), ('lad-wizard', 0.2), ('monk-slave', 0.2),
('shore-woodland', 0.2), ('magician-wizard', 0.16666666666666666),
('brother-monk', 0.125), ('asylum-madhouse', 0.125), ('gem-jewel', 0.125),
('cemetery-woodland', 0.1111111111111111), ('bird-crane', 0.1111111111111111),
('glass-magician', 0.1111111111111111), ('crane-implement', 0.1),
('chord-smile', 0.09090909090909091), ('coast-forest', 0.09090909090909091),
('furnace-stove', 0.07692307692307693), ('forest-graveyard',
0.07142857142857142), ('food-rooster', 0.0625), ('noon-string',
0.058823529411764705), ('journey-car', 0.05), ('rooster-voyage',
0.041666666666666664)]
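
One check that might help narrow this down, offered as a minimal sketch rather
than a confirmed diagnosis: 'journey' is typed twice as a key in pairs
('journey':'voyage' and 'journey':'car'), and a Python dict keeps only one
value per key, the last one given. Comparing the number of pairs typed with the
number of entries the dict actually holds would show whether that accounts for
the missing results. The names typed_pairs, as_dict and repeated are made up
for illustration, and only a few of the pairs are included.

```python
from collections import Counter

# The pairs as typed, kept as (word1, word2) tuples so that repeated
# first words are not merged. Only a few of the pairs are shown here.
typed_pairs = [
    ('journey', 'voyage'),
    ('bird', 'cock'),
    ('bird', 'crane'),
    ('journey', 'car'),
    ('boy', 'lad'),
]

# Building a dict keeps one value per key; a later pair replaces an
# earlier one that has the same first word.
as_dict = dict(typed_pairs)

# First words that occur more than once in the typed pairs.
first_word_counts = Counter(word1 for word1, _ in typed_pairs)
repeated = sorted(w for w, n in first_word_counts.items() if n > 1)

print(len(typed_pairs), len(as_dict))  # pairs typed vs. entries kept
print(repeated)                        # first words that were repeated
print(as_dict['journey'])              # the value that survived
```

If the two counts differ, looping over a list of tuples (for word1, word2 in
typed_pairs) instead of over a dict should keep every pair in the result.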