On Wed, Jun 13, 2012 at 4:38 PM, Benson Margulies bimargul...@gmail.com wrote:
> Does this suggest anything to anyone? Other than that we've
> misanalyzed the logic in the tokenizer and there's a way to make it
> burp on one thread?
It might suggest that the different TokenStream instances refer to some
shared object that is not thread-safe. We have had bugs like this before
(e.g. sharing a JDK collator is ok, but ICU ones are not thread-safe,
so you must clone them).
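As a rough sketch of the clone-per-thread pattern, here is a version that uses only the JDK. It uses java.text.Collator purely as a stand-in so the example runs without extra jars (as noted, the JDK one is fine to share; the point is the pattern you would apply to an ICU collator). The class and method names are made up for illustration.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: keeps one prototype collator and hands each thread its own clone.
class PerThreadCollator {
    // Configured once; only ever cloned, never used for comparisons directly.
    private static final Collator PROTOTYPE = Collator.getInstance(Locale.ENGLISH);

    // Each thread lazily receives a private clone of the prototype.
    private static final ThreadLocal<Collator> LOCAL =
        ThreadLocal.withInitial(() -> (Collator) PROTOTYPE.clone());

    static int compare(String a, String b) {
        return LOCAL.get().compare(a, b);
    }

    // Submits the same comparison `tasks` times to a thread pool and collects every result.
    static List<Integer> compareOnThreads(String a, String b, int tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                Callable<Integer> task = () -> compare(a, b);
                futures.add(pool.submit(task));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

The shared state is limited to the prototype, which is only read by clone(); all comparisons go through a thread-private copy.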
Because of this we beefed up our base analysis class
(http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/lucene/test-framework/src/java/org/apache/lucene/analysis/BaseTokenStreamTestCase.java)
to find thread safety bugs like this.
I recommend just grabbing the test-framework.jar (we release it as an
artifact), extending that class, and writing a test like:
  public void testRandomStrings() throws Exception {
    checkRandomData(random, analyzer, 10);
  }
(or use the one in the branch; it's even been improved since 3.6)
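If you want to see the general idea behind this kind of check without pulling in Lucene, here is a minimal sketch: compute a single-threaded baseline, then re-run the same inputs from several threads and compare. This is not the test-framework's actual implementation; the class, the method names, and the whitespace "tokenizer" are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical harness: flags analysis functions whose output changes under concurrency.
class ConcurrentConsistencyCheck {
    // Returns the total number of mismatches against the baseline, summed over all threads.
    static int countMismatches(Function<String, List<String>> analyze,
                               List<String> inputs, int threads) throws Exception {
        // Single-threaded baseline: the output we expect for every input.
        List<List<String>> expected = new ArrayList<>();
        for (String in : inputs) {
            expected.add(analyze.apply(in));
        }

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                // Every task re-analyzes all inputs and counts disagreements.
                Callable<Integer> task = () -> {
                    int bad = 0;
                    for (int i = 0; i < inputs.size(); i++) {
                        if (!expected.get(i).equals(analyze.apply(inputs.get(i)))) {
                            bad++;
                        }
                    }
                    return bad;
                };
                futures.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    // Seeded random inputs, so a failing run can be reproduced.
    static List<String> randomStrings(long seed, int count) {
        Random r = new Random(seed);
        String alphabet = "ab \t";
        List<String> out = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            StringBuilder sb = new StringBuilder();
            int len = r.nextInt(20);
            for (int j = 0; j < len; j++) {
                sb.append(alphabet.charAt(r.nextInt(alphabet.length())));
            }
            out.add(sb.toString());
        }
        return out;
    }

    // Stand-in tokenizer: stateless whitespace split, so it should never mismatch.
    static List<String> whitespaceTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String tok : text.trim().split("\\s+")) {
            if (!tok.isEmpty()) {
                out.add(tok);
            }
        }
        return out;
    }
}
```

A nonzero count from something like countMismatches(MyCode::tokenize, inputs, 4) would point at shared mutable state, though a zero count does not prove thread safety, since races may simply not have fired on that run.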
--
lucidimagination.com