[
https://issues.apache.org/jira/browse/LUCENE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284070#comment-13284070
]
Dawid Weiss commented on LUCENE-4078:
-------------------------------------
This failure is triggered by Robert's evil random Pattern instances. I think it
qualifies as a bug in the JDK's regex implementation, but I'm not sure. Here is
the offending Pattern:
{code}
Pattern p = Pattern.compile("]|");
{code}
For those of you wondering, this is also a valid pattern:
{code}
Pattern p = Pattern.compile("|");
{code}
What should these match? I have no clue. Are they even valid? I have no clue.
Anyway, what happens is that surrogate pairs (and everything else) are split
apart, so:
{code}
String s1 = "AB\uD840\uDC00C";
String s2 = s1.replaceAll("]|", "xyz");
System.out.println(s1);
System.out.println(s2);
{code}
results in:
{noformat}
AB𠀀C
xyzAxyzBxyz?xyz?xyzCxyz
{noformat}
(each ? is one half of the split surrogate pair) and an assertion is rightly
thrown, saying that a surrogate pair is broken.
I don't know what to do with this. Robert?
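A tentative answer to "what should these match?": the empty alternative seems to let the pattern match the empty string at every char (UTF-16 code unit) offset, which would be consistent with the output above. The sketch below is only an illustration; the class and helper names (EmptyAlternationDemo, matchStarts, hasBrokenSurrogates) are made up for this comment and are not part of Lucene or the JDK. It lists the match offsets and checks whether the replaced string still has intact surrogate pairs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Lucene.
class EmptyAlternationDemo {

    // Start offsets (in UTF-16 code units) of every match of regex in s.
    static List<Integer> matchStarts(String regex, String s) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(s);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    // True if s contains a surrogate that is not part of a high+low pair.
    static boolean hasBrokenSurrogates(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                if (i + 1 < s.length() && Character.isLowSurrogate(s.charAt(i + 1))) {
                    i++; // skip the low half of a valid pair
                } else {
                    return true; // high surrogate without a following low one
                }
            } else if (Character.isLowSurrogate(c)) {
                return true; // low surrogate without a preceding high one
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String s1 = "AB\uD840\uDC00C";
        String s2 = s1.replaceAll("]|", "xyz");
        System.out.println("match offsets:   " + matchStarts("]|", s1));
        System.out.println("input broken?    " + hasBrokenSurrogates(s1));
        System.out.println("replaced broken? " + hasBrokenSurrogates(s2));
    }
}
```

If the offsets printed for s1 include the index between \uD840 and \uDC00, the replacement text lands inside the pair, which is the condition the assertion complains about.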
> PatternReplaceCharFilter assertion error
> ----------------------------------------
>
> Key: LUCENE-4078
> URL: https://issues.apache.org/jira/browse/LUCENE-4078
> Project: Lucene - Java
> Issue Type: Bug
> Reporter: Dawid Weiss
> Assignee: Dawid Weiss
> Priority: Minor
> Fix For: 4.0
>
>
> Build: https://builds.apache.org/job/Lucene-trunk/1942/
> 1 tests failed.
> REGRESSION: org.apache.lucene.analysis.pattern.TestPatternReplaceCharFilter.testRandomStrings
> Error Message:
> Stack Trace:
> java.lang.AssertionError
> at __randomizedtesting.SeedInfo.seed([8E91A6AC395FEED9:618A6129A5BB9EC]:0)
> at org.apache.lucene.analysis.MockTokenizer.readCodePoint(MockTokenizer.java:153)
> at org.apache.lucene.analysis.MockTokenizer.incrementToken(MockTokenizer.java:123)
> at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:558)
> at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:488)
> at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:430)
> at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:424)
> at org.apache.lucene.analysis.pattern.TestPatternReplaceCharFilter.testRandomStrings(TestPatternReplaceCharFilter.java:323)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:616)
> at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
> at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
> at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814)
> at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875)
> at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889)
> at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
> at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32)
> at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
> at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
> at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
> at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
> at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
> at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(Randomized