Thanks Boris,

Jason

Boris Aleksandrovsky wrote:
Jason,

You can look here:

http://www.cs.ualberta.ca/~lindek/downloads.htm

for

Word frequency counts from a 1.5B word corpus (TREC disks 1-5 and the Reuters corpus <http://about.reuters.com/researchandstandards/corpus/>). The words
are normalized as follows: all-caps words are prefixed with a_ and
capitalized words are prefixed with c_, after downcasing. All digits are
replaced with 0.
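
If you want to look up your own tokens against that list, you would need to apply the same normalization first. Here is a rough sketch in Python of how I read the description above. The function name normalize is made up for illustration, and the handling of edge cases (single uppercase letters and mixed-case words are treated as capitalized, tokens are assumed to be already split) is my assumption, not something the download page specifies.

```python
import re


def normalize(token: str) -> str:
    """Normalize one token per the scheme described above (hypothetical helper)."""
    # Replace every ASCII digit with 0.
    token = re.sub(r"[0-9]", "0", token)

    letters = [ch for ch in token if ch.isalpha()]

    # Assumption: "all caps" means at least two letters, all uppercase.
    if len(letters) > 1 and all(ch.isupper() for ch in letters):
        return "a_" + token.lower()

    # Assumption: anything else starting with an uppercase letter
    # counts as capitalized.
    if token[:1].isupper():
        return "c_" + token.lower()

    # Lowercase words (and pure digit strings) get no prefix.
    return token
```

With this reading, "IBM" would become a_ibm, "London" would become c_london, and "1995" would become 0000. It is worth checking a few entries in the actual file to confirm the edge cases match.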

Cheers,
Boris

On 8/30/06, Jason Pump <[EMAIL PROTECTED]> wrote:

Is there a large list of words and their frequencies in the English
language? Obviously it would differ by corpus but I would like to see
what's already available.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]






