On Wed, Jun 13, 2012 at 4:38 PM, Benson Margulies <bimargul...@gmail.com> wrote:
>
> Does this suggest anything to anyone? Other than that we've
> misanalyzed the logic in the tokenizer and there's a way to make it
> burp on one thread?

It might suggest that the different TokenStream instances refer to some
shared object that is not thread-safe. We've had bugs like this before
(e.g. sharing a JDK collator is OK, but ICU ones are not thread-safe,
so you must clone them).
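
To illustrate the "clone it per thread" idea, here is a rough, self-contained
sketch. It does not use Lucene or ICU: ScratchNormalizer is a hypothetical
stand-in for any stateful helper that is not thread-safe, and the class and
method names are made up for the example. Each thread gets its own clone of
a shared prototype through a ThreadLocal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PerThreadCloneSketch {

    /** Hypothetical stand-in for a stateful, non-thread-safe helper. */
    static final class ScratchNormalizer implements Cloneable {
        // Mutable scratch buffer: two threads using one instance would corrupt it.
        private StringBuilder scratch = new StringBuilder();

        String normalize(String s) {
            scratch.setLength(0);
            for (int i = 0; i < s.length(); i++) {
                scratch.append(Character.toLowerCase(s.charAt(i)));
            }
            return scratch.toString();
        }

        @Override
        public ScratchNormalizer clone() {
            try {
                ScratchNormalizer copy = (ScratchNormalizer) super.clone();
                copy.scratch = new StringBuilder(); // the clone must not share the buffer
                return copy;
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e);
            }
        }
    }

    // Configured once; worker threads never call it directly, they only clone it.
    private static final ScratchNormalizer PROTOTYPE = new ScratchNormalizer();

    // Each thread lazily receives its own private clone of the prototype.
    private static final ThreadLocal<ScratchNormalizer> PER_THREAD =
            ThreadLocal.withInitial(PROTOTYPE::clone);

    /** Runs the normalizer on several threads and counts wrong results. */
    static int countMismatches(int threads, int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final String input = "Token-" + t + "-ABCDEFGHIJ";
                final String expected = input.toLowerCase(Locale.ROOT);
                results.add(pool.submit(() -> {
                    int bad = 0;
                    for (int i = 0; i < iterations; i++) {
                        if (!PER_THREAD.get().normalize(input).equals(expected)) {
                            bad++;
                        }
                    }
                    return bad;
                }));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(4, 10000));
    }
}
```

If every thread called PROTOTYPE.normalize() directly instead of going through
PER_THREAD, the shared buffer could be overwritten mid-call, which is the kind
of intermittent failure described above.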

Because of this, we beefed up our base analysis test class
(http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/lucene/test-framework/src/java/org/apache/lucene/analysis/BaseTokenStreamTestCase.java)
to find thread-safety bugs like this.

I recommend just grabbing the test-framework.jar (we release it as an
artifact), extending that class, and writing a test like:
  public void testRandomStrings() throws Exception {
    checkRandomData(random, analyzer, 100000);
  }

(Or use the one in the branch; it's even been improved since 3.6.)

-- 
lucidimagination.com
