Hello Kent, Bob, Steve,

Thank you all for your help and suggestions. I believe I'll have a good program soon.

2008/8/17 Kent Johnson <[EMAIL PROTECTED]>

> On 8/16/08, Emad Nawfal (عماد نوفل) <[EMAIL PROTECTED]> wrote:
> > #!/usr/bin/python
> > # Chi-squared collocation discovery
> > # Important definitions first. Let's suppose that we
> > # are trying to find whether "powerful computers" is a collocation
> > # N = The number of all bigrams in the corpus
> > # O11 = how many times the bigram "powerful computers" occurs in the corpus
> > # O22 = the number of bigrams not having either word in our collocation = N - O11
> > # O12 = The number of bigrams whose second word is our second word
> > # but whose first word is not "powerful"
>
> This is just the number of occurrences of the second word - O11, isn't it?
>
> > # O21 = The number of bigrams whose first word is our first word, but whose second word
> > # is different from our second word
>
> This is the number of occurrences of the first word - O11.
>
> So one way to solve this would be to make two dictionaries - one which
> counts bigrams and one which counts words. Then you would get the
> numbers with just three dictionary lookups.
>
> Kent

--
لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه كالحقيقة.....محمد الغزالي
"No victim has ever been more repressed and alienated than the truth" (Muhammad al-Ghazali)
Emad Soliman Nawfal
Indiana University, Bloomington
http://emnawfal.googlepages.com
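
Kent's two-dictionary suggestion above could be sketched roughly as follows. This is only an illustration, not the program from the thread: the function name `contingency_counts`, the `tokens` list, and the use of `collections.Counter` are assumptions made for the example. It also assumes that O22 is taken as N - O11 - O12 - O21, so that the four cells of the table add up to N; that differs from the "N - O11" given in the quoted comment.

```python
from collections import Counter


def contingency_counts(tokens, w1, w2):
    """Return (O11, O12, O21, O22) for the candidate bigram (w1, w2).

    Hypothetical helper: builds one dictionary of bigram counts and one
    of word counts, then derives the cells from three lookups.
    """
    bigram_counts = Counter(zip(tokens, tokens[1:]))  # dictionary 1: bigrams
    word_counts = Counter(tokens)                     # dictionary 2: words
    n = max(len(tokens) - 1, 0)                       # N: number of bigrams

    # The three lookups. Counter returns 0 for keys it has never seen.
    o11 = bigram_counts[(w1, w2)]
    # Word counts stand in for "times seen as second/first word of a bigram".
    # They can be off by one if w2 is the very first token of the corpus
    # or w1 is the very last one.
    o12 = word_counts[w2] - o11   # second word is w2, first word is not w1
    o21 = word_counts[w1] - o11   # first word is w1, second word is not w2
    o22 = n - o11 - o12 - o21     # assumption: the four cells sum to N
    return o11, o12, o21, o22


if __name__ == "__main__":
    text = "powerful computers need powerful software and powerful computers"
    print(contingency_counts(text.split(), "powerful", "computers"))
```

Building the two dictionaries takes one pass over the corpus each; after that, the counts for any candidate bigram come from lookups alone, so many candidates can be scored without rescanning the text.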