[ https://issues.apache.org/jira/browse/LUCENE-3897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234096#comment-13234096 ]
Christian Moen commented on LUCENE-3897:
----------------------------------------
Robert, your change to LUCENE-3895 is very useful. Thanks again for this.
I can reproduce a failing case on {{trunk}} on my system using
{noformat}
ant test -Dtestcase=TestKuromojiTokenizer -Dtestmethod=testRandomHugeStrings -Dtests.seed=-42f0565412819c1e:75f7606c1595bc3f:-31754ca508d64340 -Dargs="-Dfile.encoding=MacRoman"
{noformat}
and the output is as follows:
{noformat}
[junit] Testsuite: org.apache.lucene.analysis.kuromoji.TestKuromojiTokenizer
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 16.122 sec
[junit]
[junit] ------------- Standard Error -----------------
[junit] NOTE: Ignoring @nightly test method 'testBocchanBig'
[junit]
[junit] ===>
[junit] Uncaught exception by thread: Thread[Thread-4,5,main]
[junit] java.lang.AssertionError: backPos=3076 vs lastBackTracePos=4096
[junit] at org.apache.lucene.analysis.kuromoji.KuromojiTokenizer.backtrace(KuromojiTokenizer.java:907)
[junit] at org.apache.lucene.analysis.kuromoji.KuromojiTokenizer.parse(KuromojiTokenizer.java:756)
[junit] at org.apache.lucene.analysis.kuromoji.KuromojiTokenizer.incrementToken(KuromojiTokenizer.java:403)
[junit] at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:404)
[junit] at org.apache.lucene.analysis.BaseTokenStreamTestCase.access$000(BaseTokenStreamTestCase.java:49)
[junit] at org.apache.lucene.analysis.BaseTokenStreamTestCase$AnalysisThread.run(BaseTokenStreamTestCase.java:334)
[junit] <===
[junit]
[junit] NOTE: reproduce with: ant test -Dtestcase=TestKuromojiTokenizer -Dtestmethod=null -Dtests.seed=-42f0565412819c1e:75f7606c1595bc3f:-31754ca508d64340 -Dargs="-Dfile.encoding=MacRoman"
[junit] ------------- ---------------- ---------------
[junit] Testcase: testRandomHugeStrings(org.apache.lucene.analysis.kuromoji.TestKuromojiTokenizer): Caused an ERROR
[junit] Uncaught exception by thread: Thread[Thread-4,5,]
[junit] org.apache.lucene.util.UncaughtExceptionsRule$UncaughtExceptionsInBackgroundThread: Uncaught exception by thread: Thread[Thread-4,5,]
[junit] at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:66)
[junit] at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:618)
[junit] at org.junit.rules.RunRules.evaluate(RunRules.java:18)
[junit] at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
[junit] at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
[junit] at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164)
[junit] at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
[junit] at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
[junit] at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
[junit] at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
[junit] at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
[junit] at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
[junit] at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
[junit] at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
[junit] at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:57)
[junit] at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:21)
[junit] at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22)
[junit] at org.junit.rules.RunRules.evaluate(RunRules.java:18)
[junit] at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
[junit] at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
[junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
[junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
[junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)
[junit] Caused by: java.lang.AssertionError: backPos=3076 vs lastBackTracePos=4096
[junit] at org.apache.lucene.analysis.kuromoji.KuromojiTokenizer.backtrace(KuromojiTokenizer.java:907)
[junit] at org.apache.lucene.analysis.kuromoji.KuromojiTokenizer.parse(KuromojiTokenizer.java:756)
[junit] at org.apache.lucene.analysis.kuromoji.KuromojiTokenizer.incrementToken(KuromojiTokenizer.java:403)
[junit] at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:404)
[junit] at org.apache.lucene.analysis.BaseTokenStreamTestCase.access$000(BaseTokenStreamTestCase.java:49)
[junit] at org.apache.lucene.analysis.BaseTokenStreamTestCase$AnalysisThread.run(BaseTokenStreamTestCase.java:334)
[junit]
[junit]
[junit] Test org.apache.lucene.analysis.kuromoji.TestKuromojiTokenizer FAILED
{noformat}
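A side note on why the same AssertionError shows up twice in the output above: the assert fires on a background analysis thread (Thread-4), and the test framework then reports it a second time on the main thread as the cause of a wrapper exception. The following is a minimal sketch of that general pattern in plain Java. It is not Lucene's UncaughtExceptionsRule, and the names BackgroundFailureDemo and runAndCapture are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

public class BackgroundFailureDemo {

    /**
     * Runs the task on a new thread and returns the Throwable that escaped it,
     * or null if the task finished normally. (Hypothetical helper.)
     */
    static Throwable runAndCapture(Runnable task) throws InterruptedException {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread worker = new Thread(task, "Thread-demo");
        // Record the failure instead of letting it disappear with the thread.
        worker.setUncaughtExceptionHandler((thread, error) -> failure.set(error));
        worker.start();
        worker.join(); // wait until the worker, including its handler, is done
        return failure.get();
    }

    public static void main(String[] args) throws InterruptedException {
        Throwable cause = runAndCapture(() -> {
            // Stand-in for the failing assert inside the tokenizer.
            throw new AssertionError("backPos=3076 vs lastBackTracePos=4096");
        });
        if (cause != null) {
            // Wrap the background failure so the main thread can report it
            // with the original error attached as the cause.
            RuntimeException wrapped = new RuntimeException(
                    "Uncaught exception by thread: Thread-demo", cause);
            System.out.println(wrapped.getMessage());
            System.out.println("Caused by: " + wrapped.getCause());
        }
    }
}
```

With this kind of wrapping, the first trace in a report comes from the thread that failed and the second from the thread that noticed the failure, which matches the shape of the log above.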
> KuromojiTokenizer fails with large docs
> ---------------------------------------
>
> Key: LUCENE-3897
> URL: https://issues.apache.org/jira/browse/LUCENE-3897
> Project: Lucene - Java
> Issue Type: Bug
> Components: modules/analysis
> Reporter: Robert Muir
> Fix For: 3.6, 4.0
>
>
> just shoving largeish random docs triggers asserts like:
> {noformat}
> [junit] Caused by: java.lang.AssertionError: backPos=4100 vs lastBackTracePos=5120
> [junit] at org.apache.lucene.analysis.kuromoji.KuromojiTokenizer.backtrace(KuromojiTokenizer.java:907)
> [junit] at org.apache.lucene.analysis.kuromoji.KuromojiTokenizer.parse(KuromojiTokenizer.java:756)
> [junit] at org.apache.lucene.analysis.kuromoji.KuromojiTokenizer.incrementToken(KuromojiTokenizer.java:403)
> [junit] at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:404)
> {noformat}
> But, you get no seed...
> I'll commit the test case and @Ignore it.