--- ABDOU Samir <[EMAIL PROTECTED]> wrote:

> Hi,
>  
> Are there any ideas on how to compute the "document
> frequency" and "collection frequency" of phrases?

Tokenize your input as phrases (instead of words), and
you'll get these stats the same way you normally get
them for single-word tokens (Terms). I did that for
bigram frequency analysis.
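
A rough sketch of the idea, in plain Python rather than against the
Lucene API, assuming whitespace tokenization and treating every word
bigram as a "phrase". The names bigrams and phrase_stats are made up
for illustration:

```python
from collections import Counter


def bigrams(text):
    """Split on whitespace and emit each adjacent word pair as one token."""
    words = text.lower().split()
    return [" ".join(pair) for pair in zip(words, words[1:])]


def phrase_stats(docs):
    """Return (document frequency, collection frequency) per phrase token."""
    df = Counter()  # number of documents containing the phrase
    cf = Counter()  # total occurrences of the phrase across all documents
    for doc in docs:
        tokens = bigrams(doc)
        cf.update(tokens)       # count every occurrence
        df.update(set(tokens))  # count each phrase once per document
    return df, cf


# Toy collection, invented for this example.
docs = [
    "new york is big and new york is loud",
    "i love new york",
    "york is old",
]
df, cf = phrase_stats(docs)
```

Here "new york" occurs three times in total but in only two documents,
so its collection frequency is 3 and its document frequency is 2. In
an index you would get the same numbers from the usual term stats once
the analyzer emits phrases as tokens.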

Of course, getting these stats is hardly the problem;
the problem is deciding what constitutes a phrase. ;-)

-+ Tatu +-

