Hi Abhishek,

You first need to decide on your metric for "probability", i.e. what the count is divided by. For example:

1. keyword occurrences / total word count
2. keyword occurrences / total number of sentences
3. number of files that contain the keyword / total number of files
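In case it helps, here is a minimal Python sketch of the three ratios above. It is only an illustration under some assumptions: the names `tokenize`, `split_sentences`, `count_phrase` and `phrase_metrics` are made up for this example, each file is assumed to be already loaded as one string, words and sentences are split with naive regular expressions, and a phrase is not counted if it spans a sentence boundary.

```python
import re


def tokenize(text):
    """Lowercase word tokens (naive: letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(text):
    """Naive sentence split on '.', '!' and '?' (assumption, not a real segmenter)."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def count_phrase(tokens, phrase_tokens):
    """Count positions where phrase_tokens occurs as a contiguous run in tokens."""
    n = len(phrase_tokens)
    return sum(
        1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase_tokens
    )


def phrase_metrics(documents, phrase):
    """Return the three ratios for `phrase` over a list of document strings."""
    phrase_tokens = tokenize(phrase)
    if not phrase_tokens:
        raise ValueError("phrase must contain at least one word")

    occurrences = total_words = total_sentences = docs_with_phrase = 0
    for text in documents:
        doc_hits = 0
        for sentence in split_sentences(text):
            tokens = tokenize(sentence)
            total_sentences += 1
            total_words += len(tokens)
            doc_hits += count_phrase(tokens, phrase_tokens)
        occurrences += doc_hits
        if doc_hits:
            docs_with_phrase += 1

    def ratio(numerator, denominator):
        # Avoid division by zero on an empty corpus.
        return numerator / denominator if denominator else 0.0

    return {
        "per_word": ratio(occurrences, total_words),          # metric 1
        "per_sentence": ratio(occurrences, total_sentences),  # metric 2
        "per_document": ratio(docs_with_phrase, len(documents)),  # metric 3
    }
```

Calling `phrase_metrics(list_of_file_contents, "I am")` returns all three values at once. Note that the second ratio can exceed 1 when the phrase occurs several times in one sentence, so it is better read as a rate than as a probability; the third is always between 0 and 1.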
Best Regards,
James Fang

-----Original Message-----
From: algogeeks@googlegroups.com [mailto:[EMAIL PROTECTED] On Behalf Of Abhishek
Sent: 3 December 2007 16:10
To: Algorithm Geeks
Subject: [algogeeks] Probability of a phrase in a text document?

Hi,

If I have a large corpus of text documents and I need to find the probability of occurrence of a phrase like "I am" in the given set of text documents, how do I go about finding the value? I can easily count how many times the phrase "I am" occurs in the whole set of text documents, including all the sentences, but what do I divide the count by?

Thanks.

With Regards,
Abhishek S