RE: StandardTokenizer and split tokens

2012-06-24 Thread Uwe Schindler
> -----Original Message-----
> From: Mansour Al Akeel [mailto:mansour.alak...@gmail.com]
> Sent: Saturday, June 23, 2012 11:21 PM
> To: java-user@lucene.apache.org
> Subject: Re: StandardTokenizer and split tokens
>
> Uwe,
> thank you for the advice. I updated my code.

Re: StandardTokenizer and split tokens

2012-06-23 Thread Mansour Al Akeel
Uwe, thank you for the advice. I updated my code.

On Sat, Jun 23, 2012 at 3:15 AM, Uwe Schindler wrote:
>> I found the main issue.
>> I was using BytesRef without the length. This fixed the problem.
>>
>>     String word = new String(ref.bytes, ref.offset, ref.length);
>
> Please …

RE: StandardTokenizer and split tokens

2012-06-23 Thread Uwe Schindler
> I found the main issue.
> I was using BytesRef without the length. This fixed the problem.
>
>     String word = new String(ref.bytes, ref.offset, ref.length);

Please see my other mail: using no character set here is the second problem of your code. This is the correct way to do it: …
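The archive preview cuts off before the code Uwe points to. A minimal sketch of a charset-safe conversion, assuming a Lucene 3.x-era BytesRef named ref, could look like this (both forms decode the term bytes as UTF-8, which is how Lucene stores them):

    import java.nio.charset.Charset;

    import org.apache.lucene.util.BytesRef;

    final class BytesRefDecoding {
        private static final Charset UTF8 = Charset.forName("UTF-8");

        // Explicit charset plus offset/length: never rely on the platform default.
        static String decodeExplicitly(BytesRef ref) {
            return new String(ref.bytes, ref.offset, ref.length, UTF8);
        }

        // Lucene's own helper performs the same UTF-8 decoding of the slice.
        static String decodeViaHelper(BytesRef ref) {
            return ref.utf8ToString();
        }
    }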

RE: StandardTokenizer and split tokens

2012-06-23 Thread Uwe Schindler
Don't ever do this:

    String word = new String(ref.bytes);

This has the following problems:
- It ignores the character set!!! (In general: never ever use new String(byte[]) without specifying the 2nd charset parameter!) byte[] != String. Depending on the default charset on your computer this would return …
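To make the charset pitfall concrete, here is a small standalone sketch (not from the thread): the same UTF-8 bytes decode differently depending on the JVM's default charset, so any non-ASCII term can come back corrupted.

    import java.nio.charset.Charset;

    public class DefaultCharsetPitfall {
        public static void main(String[] args) {
            Charset utf8 = Charset.forName("UTF-8");
            byte[] bytes = "Müller".getBytes(utf8);

            // Platform-dependent: if the JVM default charset is not UTF-8
            // (e.g. windows-1252), the umlaut decodes into two wrong characters.
            System.out.println(new String(bytes));

            // Deterministic: always name the charset the bytes were encoded with.
            System.out.println(new String(bytes, utf8));
        }
    }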

Re: StandardTokenizer and split tokens

2012-06-22 Thread Mansour Al Akeel
I found the main issue. I was using BytesRef without the length. This fixed the problem.

    String word = new String(ref.bytes, ref.offset, ref.length);

Thank you.

On Fri, Jun 22, 2012 at 6:26 PM, Mansour Al Akeel wrote:
> Hello all,
>
> I am trying to write a simple autosuggest …
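The original question is truncated in the archive, so the exact autosuggest code is not shown. As a rough sketch of the kind of loop the thread is discussing, assuming Lucene 3.x (current when this thread was written), one way to pull tokens out of a StandardTokenizer is via CharTermAttribute, which hands back characters directly and sidesteps the byte-decoding issue above:

    import java.io.StringReader;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.lucene.analysis.standard.StandardTokenizer;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
    import org.apache.lucene.util.Version;

    public class TokenizeForSuggest {
        public static void main(String[] args) throws Exception {
            List<String> words = new ArrayList<String>();
            StandardTokenizer tokenizer =
                new StandardTokenizer(Version.LUCENE_36, new StringReader("Hello Lucene users"));
            CharTermAttribute termAtt = tokenizer.addAttribute(CharTermAttribute.class);
            tokenizer.reset();
            while (tokenizer.incrementToken()) {
                // toString() copies the current term; no byte[] decoding involved.
                words.add(termAtt.toString());
            }
            tokenizer.end();
            tokenizer.close();
            System.out.println(words);
        }
    }

If the BytesRef route is kept instead, the decoding discussed earlier in the thread applies: use offset and length together with an explicit UTF-8 charset, or ref.utf8ToString().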