--- ABDOU Samir <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Are there any ideas on how to compute the "document
> frequency" and "collection frequency" of phrases?

Tokenize your input as phrases (instead of words), and you'll get these stats the same way you normally get them for single-word tokens (Terms). I did that for bigram frequency analysis.

Of course, the hard part is not getting these stats; it is deciding what constitutes a phrase. ;-)

-+ Tatu +-
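
To make the phrase-as-token idea concrete, here is a minimal sketch in plain Python. It is not Lucene code: the helper names (bigrams, phrase_frequencies), the whitespace tokenization, and the sample documents are all assumptions made for illustration. It treats every pair of adjacent words as one "phrase" token, then counts document frequency (how many documents contain the phrase) and collection frequency (how many times the phrase occurs overall).

```python
from collections import Counter


def bigrams(text):
    """Split text on whitespace and return adjacent word pairs as phrase tokens."""
    words = text.lower().split()
    return [" ".join(pair) for pair in zip(words, words[1:])]


def phrase_frequencies(docs):
    """Return (doc_freq, coll_freq) Counters keyed by bigram phrase.

    doc_freq[p]  = number of documents that contain phrase p at least once
    coll_freq[p] = total occurrences of phrase p across all documents
    """
    doc_freq = Counter()
    coll_freq = Counter()
    for doc in docs:
        phrases = bigrams(doc)
        coll_freq.update(phrases)      # every occurrence counts
        doc_freq.update(set(phrases))  # each document counts at most once
    return doc_freq, coll_freq


# Hypothetical sample collection, for illustration only.
docs = [
    "new york is big and new york is loud",
    "i flew to new york",
    "york is old",
]
doc_freq, coll_freq = phrase_frequencies(docs)
print(doc_freq["new york"], coll_freq["new york"])
```

This counts every adjacent word pair, so it sidesteps the question of which pairs are meaningful phrases. Any filtering (stop words, minimum counts, a phrase dictionary) would have to be layered on top.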