On Thursday 19 August 2004 06:45 am, Todd Slater wrote:

> 2. Get unique words to count from masterwordlist.
>
>    uniq masterwordlist > uniqwords
>
> 3. Count the number of times a word in uniqwords appears in
>    masterwordlist.
>
>    for line in `cat uniqwords` ; do echo $line : `grep -c $line
>    masterwordlist` >> countedwords ; done
>
>    (that should all be one line)

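You might consider combining these steps with "uniq -c", which prefixes each
line with its count.  Note that uniq only collapses adjacent duplicates, so
the list has to be sorted first.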

> There's probably more n better ways to do it, but that should work.
> Modify to suit your needs, like if you want to distinguish between A and
> a.

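If you'd rather count A and a as the same word, toss a "tr 'A-Z' 'a-z'" into
the mix to fold everything to lower case.  Quote the ranges so the shell can't
expand them as filename patterns.  You end up with something like this:

tr ' \011' '\012\012' < text.txt | tr 'A-Z' 'a-z' | sort | uniq -c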

There's still punctuation in the mix, and runs of spaces leave blank lines
that get counted as well.
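
One way to deal with the punctuation, as a rough sketch: instead of splitting
on spaces and tabs, turn every run of non-letters into a single newline with
tr's -c (complement) and -s (squeeze) options.  The function name wordfreq is
made up for illustration, and this assumes plain ASCII text.  It also splits
words like "don't" into "don" and "t", which may or may not be what you want.

```shell
# wordfreq: hypothetical helper; reads text on stdin, prints "count word" lines.
wordfreq() {
    tr 'A-Z' 'a-z' |          # fold upper case to lower case
    tr -cs 'a-z' '\012' |     # replace each run of non-letters with one newline
    sed '/^$/d' |             # drop the empty line left by a leading non-letter
    sort |                    # uniq needs identical words to be adjacent
    uniq -c                   # prefix each distinct word with its count
}
```

Use it as "wordfreq < text.txt", and tack on "| sort -rn" if you want the most
frequent words first.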
