On Tue, 17 Feb 2004, Jari Aalto+mail.linux wrote:

> * Mon 2004-02-16 Igor Pechtchanski
> | On Tue, 17 Feb 2004, Jari Aalto+mail.linux wrote:
> |
> | > Extract text from MS-Word files, trying to preserve as many special
> | > printable characters as possible. Catdoc doesn't attempt to analyze
> | > Word file formatting, it just extracts readable text. Known to
> | > support up to Word-97 format.
> | >
> | > http://freshmeat.net/projects/catdoc/
> |
> | Question: how is this different from 'antiword'?
> | Igor
>
> catdoc is the "original". Essentially these two are the same.
> I ran a simple test with these two and it looked like catdoc
> preserved paragraph boundaries better than antiword
> (which stuck lines together).
>
> Why not have both!
>
> Jari

No contest from me here. Just axin'... ;-)
	Igor
P.S. This has my vote.
--
				http://cs.nyu.edu/~pechtcha/
      |\      _,,,---,,_
ZZZzz /,`.-'`'    -.  ;-;;,_
     |,4-  ) )-,_. ,\ (  `'-'		Igor Pechtchanski, Ph.D.
    '---''(_/--'  `-'\_) fL	a.k.a JaguaR-R-R-r-r-r-.-.-.  Meow!

"I have since come to realize that being between your mentor and his route
to the bathroom is a major career booster."  -- Patrick Naughton