[
https://issues.apache.org/jira/browse/LUCENE-5400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136911#comment-14136911
]
Michael McCandless commented on LUCENE-5400:
--------------------------------------------
Thanks for backporting [[email protected]]!
Hmm, now I'm hitting this test failure on 4.9.x:
{noformat}
ant test -Dtestcase=TestStandardAnalyzer -Dtests.method=testRandomHugeStringsGraphAfter -Dtests.seed=65FB3AF41D805AF9 -Dtests.locale=mk_MK -Dtests.timezone=Etc/GMT+5 -Dtests.file.encoding=UTF-8
[junit4] FAILURE 0.41s | TestStandardAnalyzer.testRandomHugeStringsGraphAfter <<<
[junit4] > Throwable #1: java.lang.AssertionError
[junit4] > at __randomizedtesting.SeedInfo.seed([65FB3AF41D805AF9:CA1B98C5DDF4A2CB]:0)
[junit4] > at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:751)
[junit4] > at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:614)
[junit4] > at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:513)
[junit4] > at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:437)
[junit4] > at org.apache.lucene.analysis.core.TestStandardAnalyzer.testRandomHugeStringsGraphAfter(TestStandardAnalyzer.java:402)
[junit4] > at java.lang.Thread.run(Thread.java:745)
[junit4] 2> NOTE: test params are: codec=Lucene46, sim=RandomSimilarityProvider(queryNorm=false,coord=no): {}, locale=mk_MK, timezone=Etc/GMT+5
[junit4] 2> NOTE: Linux 3.13.0-32-generic amd64/Oracle Corporation 1.7.0_55 (64-bit)/cpus=8,threads=1,free=378278472,total=503316480
[junit4] 2> NOTE: All tests run in this JVM: [TestStandardAnalyzer]
{noformat}
I dug just a bit: it looks like we are passing len=0 to
MockReaderWrapper.read(char[], int, int), which it can't handle (it calls
{{realLen = TestUtil.nextInt(random, 1, len);}}). I'm not sure why we don't
hit this on 4.x/trunk.
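
To illustrate the len=0 case, here is a minimal, self-contained sketch. It is not the actual MockReaderWrapper code: PartialReadReader and its nextInt helper are hypothetical stand-ins that only mimic the "pick a random length between 1 and len" pattern quoted above, and the len == 0 guard is just one possible way to handle it, not necessarily the fix that should go in.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Random;

/** Hypothetical stand-in for a test reader that returns random-length partial reads. */
class PartialReadReader extends Reader {
  private final Reader in;
  private final Random random;

  PartialReadReader(Reader in, Random random) {
    this.in = in;
    this.random = random;
  }

  /** Picks a value in [min, max]; rejects max < min, which is what a (1, 0) request amounts to. */
  static int nextInt(Random r, int min, int max) {
    if (max < min) {
      throw new IllegalArgumentException("max < min: " + max + " < " + min);
    }
    return min + r.nextInt(max - min + 1);
  }

  @Override
  public int read(char[] cbuf, int off, int len) throws IOException {
    if (len == 0) {
      // Assumed guard: nothing was requested, so return 0 without drawing a random length.
      return 0;
    }
    // Same shape as the quoted call: a random length between 1 and len.
    int realLen = nextInt(random, 1, len);
    return in.read(cbuf, off, realLen);
  }

  @Override
  public void close() throws IOException {
    in.close();
  }

  public static void main(String[] args) throws IOException {
    PartialReadReader r = new PartialReadReader(new StringReader("hello"), new Random(42));
    char[] buf = new char[8];
    System.out.println("zero-length read returned " + r.read(buf, 0, 0));
    System.out.println("partial read returned " + r.read(buf, 0, 5) + " chars");
  }
}
```

Without the guard, a len=0 call would reach nextInt(random, 1, 0) and fail the range check, which matches the symptom described above.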
> Long text matching email local-part rule in UAX29URLEmailTokenizer causes
> extremely slow tokenization
> -----------------------------------------------------------------------------------------------------
>
> Key: LUCENE-5400
> URL: https://issues.apache.org/jira/browse/LUCENE-5400
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 4.5
> Reporter: Chris Geeringh
> Assignee: Steve Rowe
> Fix For: 4.10, 5.0, 4.9.1
>
>
> This is a pretty nasty bug, and it causes the cluster to stop accepting updates.
> I'm not sure how to reproduce it consistently, but I have hit it numerous
> times. Switching to a whitespace tokenizer improved indexing speed, and I
> never hit the issue again.
> I'm running a 4.6 snapshot. I had issues with deadlocks on numerous
> versions of Solr, and have finally narrowed the problem down to this code,
> which affects many/all(?) versions of Solr.
> When a thread hits this issue it uses 100% CPU. Restarting the node that
> has the error allows indexing to continue until the issue is hit again. Here
> is the thread dump:
> http-bio-8080-exec-45 (201)
> org.apache.lucene.analysis.standard.UAX29URLEmailTokenizerImpl.getNextToken(UAX29URLEmailTokenizerImpl.java:4343)
> org.apache.lucene.analysis.standard.UAX29URLEmailTokenizer.incrementToken(UAX29URLEmailTokenizer.java:147)
> org.apache.lucene.analysis.util.FilteringTokenFilter.incrementToken(FilteringTokenFilter.java:82)
> org.apache.lucene.analysis.core.LowerCaseFilter.incrementToken(LowerCaseFilter.java:54)
> org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:174)
> org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:248)
> org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253)
> org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:453)
> org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1517)
> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:217)
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
> org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:583)
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:719)
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:449)
> org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:89)
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:151)
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:131)
> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:221)
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:116)
> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:186)
> org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:112)
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:158)
> org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:99)
> org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:58)
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
> org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953)
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
> org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1023)
> org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
> org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:312)
> java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> java.lang.Thread.run(Unknown Source)