Re: setting encoding

2002-05-21 Thread Dario Novakovic
ucene takes care of the rest. Thanks, everybody, for the suggestions. Dario

Re: setting encoding

2002-05-20 Thread redpineseed
> The biggest problem is that some cp1252 characters are "private" in the Unicode > byte set. Those characters may not be in the Unicode byte (char) set at all, and that is the major trouble with processing Chinese; convert your native code to Unicode (UTF-16) with the following lines: File

Re: setting encoding

2002-05-19 Thread Peter Carlson
I don't know how to have Lucene store text in cp1252 (Windows Latin-1), but I don't think you have to. I'm pretty sure it will take whatever information you have in a Java String and save it as Unicode, then recreate it as a Java String. So I think the issue you have is converting from cp1252 into a J
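
For illustration only, here is a minimal sketch of the cp1252-to-String conversion this message describes. It is not code from the thread. The class name Cp1252Demo and its helper methods are hypothetical, and it assumes a modern JDK where the "windows-1252" charset is available (it is not one of the charsets every JVM is required to support).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Lucene or of the original thread.
class Cp1252Demo {
    // Assumption: the JVM provides "windows-1252", Java's canonical name for cp1252.
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Decode raw cp1252 bytes into a Java String (UTF-16 internally).
    static String decode(byte[] raw) {
        return new String(raw, CP1252);
    }

    // Encode a Java String back into cp1252 bytes.
    static byte[] encode(String text) {
        return text.getBytes(CP1252);
    }

    public static void main(String[] args) {
        // 0x80 is the euro sign and 0xE9 is e-acute in cp1252.
        byte[] raw = { (byte) 0x80, (byte) 0xE9 };
        String text = decode(raw);

        // Once decoded, the String no longer carries any trace of cp1252;
        // it can be re-encoded in any charset, for example UTF-8.
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        System.out.println("chars: " + text.length() + ", UTF-8 bytes: " + utf8.length);
    }
}
```

The point of the sketch is that the source encoding only matters at the boundary where bytes become a String. After decoding, the String can be handed to any API that accepts Java text.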