Re: Handy word list program for indexing

2007-03-12 Thread Alan G Isaac
Did Steve's question about order ever get answered? I think he wanted something like this:
fmt -1 file_name | sort | uniq -c | sort -dk2 | sort -srnk1
Cheers,
Alan Isaac
PS Here's another Python implementation, which adds a couple of features: minimum frequency and minimum size requirements. (Also
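The minimum-frequency and minimum-size idea mentioned in the PS can also be sketched as one extra stage on `uniq -c` output. This is not the Python implementation from the message (which is not shown here), only a minimal illustration; the function name `filter_counts` and its two positional arguments are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper, not from the thread.
# Reads "count word" lines on stdin and keeps only those where
#   count >= $1 (minimum frequency) and word length >= $2 (minimum size).
filter_counts() {
    awk -v minfreq="$1" -v minlen="$2" '$1 >= minfreq && length($2) >= minlen'
}

# Usage: keep words seen at least twice and at least three characters long.
printf '3 the\n2 dog\n1 cat\n5 a\n' | filter_counts 2 3
```

With that sample input, only the `the` and `dog` lines pass both thresholds. Because awk splits on runs of blanks, the padded counts that `uniq -c` emits should be handled without extra trimming.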

Re: Handy word list program for indexing

2007-03-09 Thread Michael Wojcik
Steve Litt wrote:
fmt -1 < tsjustfacts.txt | sed -e "s/^[[:space:][:punct:]]*//" | sed -e "s/[[:space:][:punct:]]*$//" | tr [:upper:] [:lower:] | sort | uniq -c | sort -rn
The one thing this doesn't do is, upon final sort, sort by count descending but name ascending. Can you think of a w
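One possible way to get count-descending, name-ascending order with standard tools is a single `sort` with two keys. The sketch below is an illustration under stated assumptions, not something posted in this message: the function name `word_counts` is hypothetical, `tr -s` stands in for `fmt -1` to split words, and the character classes passed to `tr` are quoted so the shell cannot expand them as glob patterns.

```shell
#!/bin/sh
# Hypothetical helper, not from the thread.
# Reads text on stdin and prints "count word" lines:
# highest count first, ties broken alphabetically by word.
word_counts() {
    tr -s '[:space:]' '\n' |                              # one word per line
    sed -e 's/^[[:punct:]]*//' -e 's/[[:punct:]]*$//' |   # trim punctuation at both ends
    tr '[:upper:]' '[:lower:]' |                          # fold case
    grep -v '^$' |                                        # drop lines left empty
    LC_ALL=C sort |                                       # group identical words
    uniq -c |                                             # prefix each word with its count
    LC_ALL=C sort -k1,1nr -k2b,2                          # key 1: count, numeric, descending; key 2: word, ascending
}

# Usage
printf 'The cat saw the dog. The dog ran.\n' | word_counts
```

For that sample sentence the order is `the` (3), `dog` (2), then `cat`, `ran`, `saw` (1 each); the width of the padding in front of the counts depends on the local `uniq`. `LC_ALL=C` is set only to make the ordering predictable across locales.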

Re: Handy word list program for indexing

2007-03-06 Thread William Adams
On Mar 6, 2007, at 8:57 AM, Steve Litt wrote:
> There's budget for a human indexer, as long as the indexer is me (the author).
Got it.
> So as the human indexer, how do I make this thing an index instead of a concordance?
A concordance is just a list of words in a document w/ referenc

Re: Handy word list program for indexing

2007-03-06 Thread Alan G Isaac
On Mon, 5 Mar 2007, Steve Litt apparently wrote:
> In preparation to create my index for my book, I created a Ruby program to list every word in a file (in this case the .lyx file).
...
> My program, which is written in Ruby, is licensed GNU GPL version 2, and is included as the remainder

Re: Handy word list program for indexing

2007-03-06 Thread Steve Litt
On Tuesday 06 March 2007 08:25, William Adams wrote:
> While such utilities can be useful for the naïve user, they don't result in an index, so much as a concordance, and the difference between the two should be kept in mind.
>
> Rather than relying on such, if the project and budget warrant it

Re: Handy word list program for indexing

2007-03-06 Thread Steve Litt
On Tuesday 06 March 2007 02:58, [EMAIL PROTECTED] wrote:
> On Tue, 6 Mar 2007, Steve Litt wrote:
> > Indexing is the most distasteful, boring, and tedious part of writing a book. Making word lists like this at least makes it a brainless activity.
>
> I've linked to this thread from the foll

Re: Handy word list program for indexing

2007-03-06 Thread William Adams
While such utilities can be useful for the naïve user, they don't result in an index, so much as a concordance, and the difference between the two should be kept in mind. Rather than relying on such, if the project and budget warrant it, far better to employ a human indexer (who is _not_ als

Re: Handy word list program for indexing

2007-03-06 Thread Jeremy C. Reed
> The one thing this doesn't do is, upon final sort, sort by count descending but name ascending. Can you think of a way to do that with standard Linux commands?
I am not sure I understand (or maybe I should read this again when I wake up :) Can you give a short example?

Re: Handy word list program for indexing

2007-03-05 Thread christian . ridderstrom
On Tue, 6 Mar 2007, Steve Litt wrote:
> Indexing is the most distasteful, boring, and tedious part of writing a book. Making word lists like this at least makes it a brainless activity.
I've linked to this thread from the following page
http://wiki.lyx.org/Tips/Indexing
Maybe you coul

Re: Handy word list program for indexing

2007-03-05 Thread Steve Litt
Hi Jeremy,
On Monday 05 March 2007 21:05, Jeremy C. Reed wrote:
> On Mon, 5 Mar 2007, Steve Litt wrote:
> > In preparation to create my index for my book, I created a Ruby program to list every word in a file (in this case the .lyx file).
> >
> > Now of course this could be done with a simple

Re: Handy word list program for indexing

2007-03-05 Thread Jeremy C. Reed
On Mon, 5 Mar 2007, Steve Litt wrote:
> In preparation to create my index for my book, I created a Ruby program to list every word in a file (in this case the .lyx file).
>
> Now of course this could be done with a simple one-liner using sed and sort -u, but my program lists the words in 2