Re: Problem using Lucene on Ubuntu

Grant Ingersoll Mon, 18 Feb 2008 06:22:54 -0800

Good point, Jan!

On Feb 18, 2008, at 9:00 AM, Jan Peter Stotz wrote:

Grant Ingersoll wrote:
Note: ENCODING is whatever encoding the file is in, as in "UTF-8", if that is what your files are in.
I think there is a misunderstanding: the WordExtractor extracts text from MS Word (.doc) files. Those files are binary and therefore do not have an encoding.
I would print the extracted text to plain text files and compare whether the files generated on Windows and on Linux/Ubuntu differ. This lets you determine whether this is a WordExtractor problem or a Lucene problem.
Jan
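
The comparison suggested above could be sketched roughly as follows. This is only an illustration, not code from the thread: ExtractDump, dump and firstDifference are hypothetical names, and the extracted text is assumed to be already available as a String (for example from WordExtractor's getText()). The text is written with an explicit UTF-8 encoding so that the platform default encoding, which commonly differs between Windows and Ubuntu, cannot influence the dump itself.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of Lucene or POI.
final class ExtractDump {

    // Writes the extracted text as UTF-8, independent of the
    // platform default encoding.
    static void dump(String extractedText, Path out) throws IOException {
        Files.write(out, extractedText.getBytes(StandardCharsets.UTF_8));
    }

    // Returns the offset of the first differing byte, or -1 if the
    // two files are byte-for-byte identical. If one file is a prefix
    // of the other, the length of the shorter file is returned.
    static long firstDifference(Path a, Path b) throws IOException {
        byte[] x = Files.readAllBytes(a);
        byte[] y = Files.readAllBytes(b);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            if (x[i] != y[i]) {
                return i;
            }
        }
        return x.length == y.length ? -1 : n;
    }
}
```

Running dump on each machine and then calling firstDifference on the two resulting files would show whether the extracted text already differs before Lucene is involved. If the dumps are identical, the indexing or search side is the more likely place to look.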

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




