> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> -----Original Message-----
> From: jm [mailto:jmugur...@gmail.com]
> Sent: Wednesday, April 21, 2010 3:59 PM
> To: java-user@lucene.apache.org
> Subject: Re: are long words split into up to 256 long tokens?
>
Oh, yes, it does extend CharTokenizer... thanks Ahmet. I had searched the
Lucene source code for 256 and found nothing suspicious, and that was
itself suspicious because it looked clearly like an internal limit. Of
course I should have searched for 255...
I'll see how I proceed because I don't want to use a cus
> Is 256 some inner maximum too in some lucene internal that causes
> this? What is happening is that the long word is split into smaller
> words up to 256 and then the min and max limit applied. Is that
> correct? I have removed LengthFilter and still see the splitting at
> 256 happen. I w
I am analyzing this with my custom analyzer:
String s = "mail77 mail8 tc ro45mine durante
jjkk
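To the question of whether LengthFilter is involved: the min/max length check runs on tokens the tokenizer has already emitted, so removing it cannot undo the 255-char split. A stand-alone sketch of what a length filter does (again an illustration, not Lucene's code):

```java
import java.util.List;
import java.util.stream.Collectors;

public class LengthFilterDemo {
    // Keep only tokens whose length lies within [min, max]. The 255-char
    // split happens earlier, inside the tokenizer, so this filter only
    // sees the already-split pieces.
    static List<String> lengthFilter(List<String> tokens, int min, int max) {
        return tokens.stream()
                     .filter(t -> t.length() >= min && t.length() <= max)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A 300-char word already arrives as two tokens (255 + 45 chars).
        List<String> fromTokenizer = List.of("x".repeat(255), "x".repeat(45), "tc");
        List<String> kept = lengthFilter(fromTokenizer, 3, 100);
        System.out.println(kept.size());          // 1 (only the 45-char piece)
        System.out.println(kept.get(0).length()); // 45
    }
}
```

With min=3 and max=100, the 255-char piece and the 2-char "tc" are dropped but the 45-char remainder survives, which is why the splitting is visible with or without the filter.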