[
https://issues.apache.org/jira/browse/LUCENE-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193937#comment-13193937
]
Robert Muir commented on LUCENE-3720:
-------------------------------------
I'm still working on a testcase for this.
I think the underlying commons-codec algorithm "blows up" on some inputs,
especially those from randomHtmlishString.
Then, given multiple threads, it's easy for it to OOM.
So far the best I have is:
ant test -Dtestcase=TestBeiderMorseFilter
-Dtestmethod=testOOM
-Dtests.seed=320e23c1f5dbf9c5:-766b86e7dc5e81df:148a8a4955f89b5e
-Dargs="-Dfile.encoding=UTF-8" -Dtests.heapsize=64M
This blows up, so I think I just have to get it to log the string that
blows up, and then we have a start to a testcase.
{code}
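// Lives in TestBeiderMorseFilter; relies on that class's analyzer and random.
public void testOOM() throws Exception {
  int numIter = 100000;
  int numTokens = 0;
  for (int i = 0; i < numIter; i++) {
    // Random HTML-like input; these strings seem to trigger the blowup.
    String s = _TestUtil.randomHtmlishString(random, 30);
    TokenStream ts = analyzer.tokenStream("bogus", new StringReader(s));
    ts.reset();
    // Consume every token so the encoder actually runs on the input.
    while (ts.incrementToken()) {
      numTokens++;
    }
    ts.end();
    ts.close();
  }
  System.out.println(numTokens);
}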
{code}
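One possible way to capture the string that blows up is sketched below. It is not part of the test above and does not use any Lucene API: OffendingInputFinder and findOffendingInput are hypothetical names, and the Consumer stands in for the tokenStream loop. It also assumes the JVM can still unwind and return after an OutOfMemoryError, which is not guaranteed.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper, not Lucene code: feeds generated strings to a consumer
// and reports the first string on which the consumer fails.
class OffendingInputFinder {

    /**
     * Pulls up to numIter strings from source and passes each to consumer.
     * Returns the first string for which consumer throws anything
     * (including OutOfMemoryError), or null if every call succeeds.
     */
    static String findOffendingInput(Supplier<String> source,
                                     Consumer<String> consumer,
                                     int numIter) {
        for (int i = 0; i < numIter; i++) {
            // Keep a reference to the input before processing it.
            String s = source.get();
            try {
                consumer.accept(s);
            } catch (Throwable t) {
                // Throwable rather than Exception, so that Errors such as
                // OutOfMemoryError are caught as well.
                return s;
            }
        }
        return null;
    }
}
```

In the test, the source would wrap _TestUtil.randomHtmlishString(random, 30) and the consumer would wrap the reset/incrementToken/end/close loop; the returned string could then be logged and hard-coded into a small standalone testcase.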
> OOM in TestBeiderMorseFilter.testRandom
> ---------------------------------------
>
> Key: LUCENE-3720
> URL: https://issues.apache.org/jira/browse/LUCENE-3720
> Project: Lucene - Java
> Issue Type: Test
> Components: modules/analysis
> Affects Versions: 3.6, 4.0
> Reporter: Robert Muir
>
> This has been OOM'ing a lot... we should see why, it's likely a real bug.
> ant test -Dtestcase=TestBeiderMorseFilter -Dtestmethod=testRandom
> -Dtests.seed=2e18f456e714be89:310bba5e8404100d:-3bd11277c22f4591
> -Dtests.multiplier=3 -Dargs="-Dfile.encoding=ISO8859-1"