Thanks very much for all of your tips. Take nouns as an example. First, I need to 
find all the lemma_names in all the synsets whose pos is 'n'. Second, for each 
lemma_name, I will count its senses. 

1)  The number of synsets whose pos is noun is easy to get:
>>> len([synset for synset in wn.all_synsets('n')])
82115

However, confusingly, I cannot get a flat list of the lemma names of these 
synsets with
>>> lemma_list = [synset.lemma_names for synset in wn.all_synsets('n')]
>>> lemma_list[:20]
[['entity'], ['physical_entity'], ['abstraction', 'abstract_entity'], 
['thing'], ['object', 'physical_object'], ['whole', 'unit'], ['congener'], 
['living_thing', 'animate_thing'], ['organism', 'being'], ['benthos'], 
['dwarf'], ['heterotroph'], ['parent'], ['life'], ['biont'], ['cell'], 
['causal_agent', 'cause', 'causal_agency'], ['person', 'individual', 'someone', 
'somebody', 'mortal', 'soul'], ['animal', 'animate_being', 'beast', 'brute', 
'creature', 'fauna'], ['plant', 'flora', 'plant_life']]
>>> type(lemma_list)
<type 'list'>

Although lemma_list is a list in the code above, it contains many unnecessary 
[ and ]. Why is that? What I expected is a flat list without these inner 
brackets. I am really curious to know why.
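
A small sketch of what seems to be happening, using plain lists so that it runs 
without NLTK. The variable nested is a hypothetical stand-in for the result of 
the comprehension above, with its three inner lists copied from that output. 
Each synset.lemma_names is itself a list, and a comprehension keeps one result 
per iteration, so the outer list ends up holding lists rather than strings.

```python
from itertools import chain

# Stand-in for [synset.lemma_names for synset in wn.all_synsets('n')];
# the inner lists are copied from the first three entries shown above.
nested = [['entity'], ['physical_entity'], ['abstraction', 'abstract_entity']]

# One element per synset: each inner list is kept as a single item.
assert len(nested) == 3

# A second for-clause walks the inner lists too, which gives a flat list.
flat = [name for names in nested for name in names]

# itertools.chain.from_iterable performs the same flattening lazily.
flat_chain = list(chain.from_iterable(nested))

print(flat)
```

Either form should give the same flat list as a loop that calls extend.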

2)  So I have to use a loop with extend to get all the lemma_names from the synsets:
>>> synset_list = list(wn.all_synsets('n'))
>>> lemma_list = []
>>> for synset in synset_list:
        lemma_list.extend(synset.lemma_names)
>>> lemma_list[:20]
['entity', 'physical_entity', 'abstraction', 'abstract_entity', 'thing', 
'object', 'physical_object', 'whole', 'unit', 'congener', 'living_thing', 
'animate_thing', 'organism', 'being', 'benthos', 'dwarf', 'heterotroph', 
'parent', 'life', 'biont']

3) So I use a loop to collect all the lemma_names instead of 
[synset.lemma_names for synset in wn.all_synsets('n')]. The following is a 
working solution:

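>>> def average_polysemy(pos):
    synset_list = list(wn.all_synsets(pos))
    # One entry per (synset, lemma) pair, so a name shared by several
    # synsets appears several times in lemma_list.
    lemma_list = []
    for synset in synset_list:
        lemma_list.extend(synset.lemma_names)
    # Add up the number of senses of every entry in lemma_list.
    sense_number = 0
    for lemma in lemma_list:
        sense_number += len(wn.synsets(lemma, pos))
    # Divides by the number of synsets; in Python 2 this is integer division.
    return sense_number / len(synset_list)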

>>> average_polysemy('n')
3
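
The function above divides by the number of synsets and, under Python 2, 
truncates the quotient. If average polysemy is taken to mean the mean number of 
senses per distinct lemma name, a variant along the following lines may be 
closer. This is only a sketch under stated assumptions: mean_senses_per_lemma 
and the toy data are hypothetical, and a sense is counted as one synset that 
lists the name, which may not match len(wn.synsets(lemma, pos)) exactly.

```python
def mean_senses_per_lemma(synset_lemmas):
    """Mean number of synsets per distinct lemma name.

    synset_lemmas: one list of lemma names per synset (hypothetical input
    shape; with NLTK it could be built from wn.all_synsets(pos)).
    """
    sense_count = {}
    for names in synset_lemmas:
        # set() guards against a name repeated within a single synset.
        for name in set(names):
            sense_count[name] = sense_count.get(name, 0) + 1
    if not sense_count:
        return 0.0
    # float() keeps the fractional part under Python 2 as well.
    return float(sum(sense_count.values())) / len(sense_count)


# Toy data, not taken from WordNet: 'dog' has 3 senses, 'cat' has 2,
# and the remaining three names have 1 each.
toy = [['dog', 'domestic_dog'], ['dog', 'frump'], ['cat'],
       ['cat', 'true_cat'], ['dog']]
print(mean_senses_per_lemma(toy))  # 8 senses / 5 names = 1.6
```

With NLTK the input could be built as [s.lemma_names for s in wn.all_synsets('n')]; 
in NLTK 3 and later lemma_names is a method, so that would be s.lemma_names().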

Thanks again.