Thanks Boris,
Jason
Boris Aleksandrovsky wrote:
Jason,
You can look here:
http://www.cs.ualberta.ca/~lindek/downloads.htm
for word frequency counts from a 1.5B word corpus (TREC disks 1-5 and the
Reuters corpus <http://about.reuters.com/researchandstandards/corpus/>).
The words are normalized as follows: ALL CAP words are prepended with a_
and Capitalized words are prepended with c_ after downcasing. Digits are
all replaced with 0.
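To look up your own tokens in a list normalized this way, you would need to apply the same transformation first. Here is a minimal sketch of the rules as described above. The function name normalize is hypothetical, and the handling of edge cases (a single uppercase letter is treated as Capitalized, mixed-case words like "McDonald" are treated as Capitalized, only ASCII digits are replaced) is an assumption, not something the download page is known to specify.

```python
import re


def normalize(word: str) -> str:
    """Apply the a_/c_ prefix and digit rules to a single token (sketch)."""
    letters = [ch for ch in word if ch.isalpha()]
    if len(letters) > 1 and all(ch.isupper() for ch in letters):
        # ALL CAP word: prefix with a_ (assumes at least two letters)
        prefix = "a_"
    elif word[:1].isupper():
        # Capitalized word: prefix with c_
        prefix = "c_"
    else:
        prefix = ""
    # Downcase, then replace every ASCII digit with 0
    return prefix + re.sub(r"[0-9]", "0", word.lower())


if __name__ == "__main__":
    for token in ["NASA", "Reuters", "corpus", "1995"]:
        print(token, "->", normalize(token))
```

Under these assumptions "NASA" becomes "a_nasa", "Reuters" becomes "c_reuters", and "1995" becomes "0000". It is worth checking a few entries in the actual file before relying on the edge cases.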
Cheers,
Boris
On 8/30/06, Jason Pump <[EMAIL PROTECTED]> wrote:
Is there a large list of words and their frequencies in the English
language? Obviously it would differ by corpus, but I would like to see
what's already available.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]