subject:"Lucene Unicode Usage"

Re: Lucene Unicode Usage

2005-02-11 Thread Owen Densmore

Bingo! I used the InputStreamReader and that fixed the index. Boy, tough to catch all the holes through which Unicode leaks occur!
Owen
From: aurora <[EMAIL PROTECTED]>
Date: February 9, 2005 11:04:35 PM MST
To: lucene-user@jakarta.apache.org
Subject: Re: Lucene Unicode Usage
So you got

Re: Lucene Unicode Usage

2005-02-10 Thread Andrzej Bialecki

Owen Densmore wrote: I'm building an index from a FileMaker database by dumping the data to a tab-separated file. Because the FileMaker output is encoded in MacRoman, and uses Mac line separators, I run a script across the tab file to clean it up: tr '\r\v' '\n ' | iconv -f MAC -t UTF-8 Thi

Re: Lucene Unicode Usage

2005-02-09 Thread aurora

So you got a UTF-8 encoded text file. But how do you read the file into Java? Java's default encoding is likely to be something other than UTF-8. Make sure you specify the encoding explicitly, like: new InputStreamReader(new FileInputStream(filename), "UTF-8"); On Wed, 9 Feb 2005 22:32:38 -0700, Owen De
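A minimal sketch of the advice above, assuming the goal is just to get the file's text into a Java String before handing it to the indexer. The class name Utf8ReadSketch and the helper readUtf8 are hypothetical names for illustration, not part of Lucene or of the original poster's code. It uses StandardCharsets.UTF_8 (Java 7 and later) in place of the "UTF-8" string from the message; on older JDKs the string form does the same job.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, for illustration only.
class Utf8ReadSketch {

    // Reads the whole file as text, decoding its bytes as UTF-8
    // instead of relying on the platform default encoding.
    static String readUtf8(String filename) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(filename), StandardCharsets.UTF_8))) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // Write a small UTF-8 tab-separated sample, then read it back.
        Path tmp = Files.createTempFile("utf8-sketch", ".tsv");
        Files.write(tmp, "caf\u00e9\tna\u00efve\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readUtf8(tmp.toString()));
        Files.delete(tmp);
    }
}
```

The point of passing the charset explicitly is that a plain FileReader (on older JDKs) decodes with the platform default, so the same UTF-8 file can index correctly on one machine and produce garbled terms on another.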

Lucene Unicode Usage

2005-02-09 Thread Owen Densmore

I'm building an index from a FileMaker database by dumping the data to a tab-separated file. Because the FileMaker output is encoded in MacRoman, and uses Mac line separators, I run a script across the tab file to clean it up: tr '\r\v' '\n ' | iconv -f MAC -t UTF-8 This basically converts the


4 matches
