[jira] [Commented] (LUCENE-5927) 4.9 -> 4.10 change in StandardTokenizer behavior on \u1aa2

Steve Rowe (JIRA) Mon, 08 Sep 2014 13:42:57 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126064#comment-14126064 ]


Steve Rowe commented on LUCENE-5927:
------------------------------------

[~rjernst] mentioned to me offline that this behavior change should have 
triggered a version-specific implementation, which did not happen.

I agree, it should have.  

But now that it's been released, should we include a version-specific 
implementation in a bugfix 4.10.1 release?  Or wait till 4.11?  Or just stop 
doing version-specific implementations (as will be the case in 5.x)?

Thoughts?

> 4.9 -> 4.10 change in StandardTokenizer behavior on \u1aa2
> ----------------------------------------------------------
>
>                 Key: LUCENE-5927
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5927
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Ryan Ernst
>
> In 4.9, StandardTokenizer broke this string into 2 tokens:
> "\u1aa2\u1a7f\u1a6f\u1a6f\u1a61\u1a72" = "\u1aa2", "\u1a7f\u1a6f\u1a6f\u1a61\u1a72"
> However, in 4.10 this has changed, and a single token is now returned.
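
To see which characters are involved, the following JDK-only sketch lists each code point of the reported string with its Unicode name and general-category id. It does not call StandardTokenizer and does not reproduce the behavior change; the class name CodePointInspector and the method describe are hypothetical helpers introduced here for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: inspects the characters in the
// reported string using only java.lang.Character.
class CodePointInspector {

    /** Returns one line per code point: U+XXXX, general-category id, Unicode name. */
    static List<String> describe(String s) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            // Character.getType returns the general category as an int constant
            // (for example Character.NON_SPACING_MARK); getName may return null
            // for code points unassigned in the JDK's Unicode version.
            lines.add(String.format("U+%04X type=%d %s",
                    cp, Character.getType(cp), Character.getName(cp)));
            i += Character.charCount(cp);
        }
        return lines;
    }

    public static void main(String[] args) {
        // The string from the issue description.
        String input = "\u1aa2\u1a7f\u1a6f\u1a6f\u1a61\u1a72";
        for (String line : describe(input)) {
            System.out.println(line);
        }
    }
}
```

Comparing the category of the first code point with the categories of the remaining five may help when reasoning about where a word-break rule could place a boundary, but the actual tokenization has to be checked against each Lucene release.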



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
