Thanks very much for all of your tips. Take nouns as an example. First, I need to find all the lemma names in all the synsets whose pos is 'n'. Second, for each lemma name, I will check its number of senses.
1) We can certainly get the number of synsets whose pos is noun:

    >>> from nltk.corpus import wordnet as wn
    >>> len([synset for synset in wn.all_synsets('n')])
    82115

However, confusingly, I do not get a plain list of lemma names for these synsets this way:

    >>> lemma_list = [synset.lemma_names for synset in wn.all_synsets('n')]
    >>> lemma_list[:20]
    [['entity'], ['physical_entity'], ['abstraction', 'abstract_entity'],
     ['thing'], ['object', 'physical_object'], ['whole', 'unit'],
     ['congener'], ['living_thing', 'animate_thing'], ['organism', 'being'],
     ['benthos'], ['dwarf'], ['heterotroph'], ['parent'], ['life'],
     ['biont'], ['cell'], ['causal_agent', 'cause', 'causal_agency'],
     ['person', 'individual', 'someone', 'somebody', 'mortal', 'soul'],
     ['animal', 'animate_being', 'beast', 'brute', 'creature', 'fauna'],
     ['plant', 'flora', 'plant_life']]
    >>> type(lemma_list)
    <type 'list'>

Although lemma_list is a list in the code above, it contains many unnecessary [ and ]. Why does this happen? What I want is a flat list without these inner brackets. I am really curious to know why.

2) So I have to use a loop and extend to get all the lemma names from the synsets:

    >>> synset_list = list(wn.all_synsets('n'))
    >>> lemma_list = []
    >>> for synset in synset_list:
    ...     lemma_list.extend(synset.lemma_names)
    ...
    >>> lemma_list[:20]
    ['entity', 'physical_entity', 'abstraction', 'abstract_entity', 'thing',
     'object', 'physical_object', 'whole', 'unit', 'congener', 'living_thing',
     'animate_thing', 'organism', 'being', 'benthos', 'dwarf', 'heterotroph',
     'parent', 'life', 'biont']

3) In this case, I have to use a loop to get all the lemma names instead of [synset.lemma_names for synset in wn.all_synsets('n')].
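On the brackets question, as far as I can tell: each synset.lemma_names is itself a list, so a comprehension with a single "for" collects one list per synset and the result is a list of lists. A second "for" clause in the comprehension, or itertools.chain.from_iterable, should give the flat list directly. Here is a minimal sketch on a small hand-written stand-in; per_synset_names is a hypothetical name and the data is typed in by hand, not read from WordNet:

```python
from itertools import chain

# Hypothetical stand-in for
# [synset.lemma_names for synset in wn.all_synsets('n')]:
# one inner list of lemma names per synset.
per_synset_names = [
    ['entity'],
    ['physical_entity'],
    ['abstraction', 'abstract_entity'],
]

# A single "for" keeps one element per synset, so the lists stay nested.
nested = [names for names in per_synset_names]

# A second "for" clause walks into each inner list and yields a flat list.
flat = [name for names in per_synset_names for name in names]

# itertools.chain.from_iterable performs the same flattening.
flat_chain = list(chain.from_iterable(per_synset_names))

print(flat)
```

With real data the flat version would presumably read [name for synset in wn.all_synsets('n') for name in synset.lemma_names]. Newer NLTK releases expose lemma_names as a method, so there it would be synset.lemma_names().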
The following is a working solution:

    >>> def average_polysemy(pos):
    ...     synset_list = list(wn.all_synsets(pos))
    ...     sense_number = 0
    ...     lemma_list = []
    ...     for synset in synset_list:
    ...         lemma_list.extend(synset.lemma_names)
    ...     for lemma in lemma_list:
    ...         sense_number_new = len(wn.synsets(lemma, pos))
    ...         sense_number = sense_number + sense_number_new
    ...     return sense_number/len(synset_list)
    ...
    >>> average_polysemy('n')
    3

Thanks again.
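A follow-up note on the function above, offered as a sketch rather than a verified fix. It divides by the number of synsets, the division is integer division under Python 2, and lemma_list repeats a name once for every synset it belongs to. If "average polysemy" is taken to mean the mean number of senses per distinct lemma name (that reading is my assumption), a variant could look like the following. The toy_synsets dictionary is hypothetical data standing in for WordNet, so the number it prints says nothing about the real figure for nouns:

```python
from __future__ import division  # true division on Python 2 as well

# Hypothetical toy data standing in for WordNet: synset name -> lemma names.
toy_synsets = {
    'dog.n.01': ['dog', 'domestic_dog'],
    'frank.n.02': ['frank', 'dog', 'hot_dog'],
    'cad.n.01': ['cad', 'dog'],
}

def average_polysemy(synsets):
    """Mean number of synsets per distinct lemma name."""
    senses = {}
    for lemma_names in synsets.values():
        # set() guards against a name listed twice within one synset.
        for lemma in set(lemma_names):
            senses[lemma] = senses.get(lemma, 0) + 1
    # Total senses divided by the number of distinct lemma names.
    return sum(senses.values()) / len(senses)

result = average_polysemy(toy_synsets)
print(result)
```

In the toy data 'dog' has three senses and the other four names have one each, so the mean is 7 senses over 5 distinct names.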