Thanks, Jack. I haven't added myself to the contributor list yet; I'll do that, then log in and comment on that ticket. One quick comment: while relaxing the limit is still under debate, wouldn't it be more reasonable to throw an exception if a token's length is more than 255? That way the user would know immediately that something is wrong.
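To make the suggestion concrete, here is a minimal standalone sketch of the fail-fast behavior I have in mind. It is not Lucene's code and does not use the Lucene API: the class name `StrictWhitespaceSplitter`, the method `split`, and the constant `MAX_TOKEN_LENGTH` are all hypothetical, and the choice of `IllegalArgumentException` is just an assumption for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not how WhitespaceTokenizer is written.
class StrictWhitespaceSplitter {
    // Assumed limit, mirroring the 255 discussed in this thread.
    static final int MAX_TOKEN_LENGTH = 255;

    // Splits text on whitespace and throws as soon as a token grows past the limit.
    static List<String> split(String text) {
        List<String> tokens = new ArrayList<>();
        int start = -1; // index where the current token began, or -1 if not inside a token
        for (int i = 0; i <= text.length(); i++) {
            boolean boundary = i == text.length() || Character.isWhitespace(text.charAt(i));
            if (!boundary) {
                if (start < 0) {
                    start = i; // first character of a new token
                }
                if (i - start + 1 > MAX_TOKEN_LENGTH) {
                    // Fail fast so the caller learns about the limit immediately.
                    throw new IllegalArgumentException(
                        "Token starting at offset " + start
                            + " is longer than " + MAX_TOKEN_LENGTH + " characters");
                }
            } else if (start >= 0) {
                tokens.add(text.substring(start, i)); // close the current token
                start = -1;
            }
        }
        return tokens;
    }
}
```

With this sketch, a token of exactly 255 characters is accepted and a token of 256 characters raises the exception at the point where the limit is crossed, so nothing is silently altered.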

On Friday, August 15, 2014, Jack Krupansky <[email protected]> wrote:

> Yeah, it should be documented better, and configurable.
>
> Some discussion of related issues here:
> https://issues.apache.org/jira/browse/LUCENE-1118
> https://issues.apache.org/jira/browse/SOLR-4148
>
> I actually filed a Jira for this already. No action so far, but PLEASE
> feel free to comment on it:
> https://issues.apache.org/jira/browse/LUCENE-5785
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Sheng
> Sent: Thursday, August 14, 2014 11:38 PM
> To: [email protected]
> Subject: WhiteSpaceTokenizer
>
> The length of token has to be shorter than 255, otherwise there will
> be unpredictable behaviors for this tokenizer. I see 255 is set as a
> private final in the src code, but there is no documentation to explicitly
> address that. Can we either make that number configurable (if not an
> option, I'd like to know why), or put some notes to its java doc? I had a
> hard time to figure that out...
