Ryan Ernst created LUCENE-5927:
----------------------------------
Summary: 4.9 -> 4.10 change in StandardTokenizer behavior on \u1aa2
Key: LUCENE-5927
URL: https://issues.apache.org/jira/browse/LUCENE-5927
Project: Lucene - Core
Issue Type: Bug
Reporter: Ryan Ernst
In 4.9, this string was broken into 2 tokens by StandardTokenizer:
"\u1aa2\u1a7f\u1a6f\u1a6f\u1a61\u1a72" = "\u1aa2", "\u1a7f\u1a6f\u1a6f\u1a61\u1a72"
However, in 4.10, the same string is returned as a single token.
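As a rough first look at why the leading character might be treated differently from the rest, the code points in the string can be dumped with plain JDK calls. This is only a sketch: the class name CodePointDump and the method describe are made up for illustration, and it prints the general category reported by the running JDK, not the word-break data that StandardTokenizer itself uses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene.
class CodePointDump {
    // One row per code point: the code point in hex, the JDK's general
    // category constant (see java.lang.Character), and whether the JDK
    // considers it a letter.
    static List<String> describe(String s) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            rows.add(String.format("U+%04X type=%d letter=%b",
                    cp, Character.getType(cp), Character.isLetter(cp)));
            i += Character.charCount(cp); // step over surrogate pairs safely
        }
        return rows;
    }

    public static void main(String[] args) {
        // The string from this report.
        for (String row : describe("\u1aa2\u1a7f\u1a6f\u1a6f\u1a61\u1a72")) {
            System.out.println(row);
        }
    }
}
```

The values printed depend on the Unicode version bundled with the JDK, so they may not match the tables either Lucene release was built against.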