That's exactly how I do it when I have to write custom components. My problem is that I do not know how to bring an Analyzer's reuse behaviour into a unit test, to see what happens when, for example, a TokenFilter instance is reused.
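
To make concrete what I mean by a reuse problem, here is a stripped-down model in plain Java. It deliberately does not use the Lucene API: DedupFilter, analyze() and the clearOnReset flag are hypothetical stand-ins I made up for illustration. It only sketches the kind of check I would like to run against a real TokenFilter, namely feeding two inputs through the same instance and comparing the second result with what a fresh instance would produce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ReuseSketch {

    /** Hypothetical stand-in for a stateful filter: drops tokens it has already seen. */
    static class DedupFilter {
        private final Set<String> seen = new HashSet<>();
        private final boolean clearOnReset;

        DedupFilter(boolean clearOnReset) {
            this.clearOnReset = clearOnReset;
        }

        /** Called before every new input, as a reused stream would be reset. */
        void reset() {
            if (clearOnReset) {
                seen.clear(); // the buggy variant skips this and keeps old state
            }
        }

        List<String> filter(List<String> tokens) {
            List<String> out = new ArrayList<>();
            for (String token : tokens) {
                if (seen.add(token)) { // add() returns false for tokens already seen
                    out.add(token);
                }
            }
            return out;
        }
    }

    /** Models one analysis request that reuses the given filter instance. */
    static List<String> analyze(DedupFilter filter, String text) {
        filter.reset();
        return filter.filter(Arrays.asList(text.split(" ")));
    }

    public static void main(String[] args) {
        DedupFilter buggy = new DedupFilter(false);
        System.out.println(analyze(buggy, "a b a")); // [a, b]
        System.out.println(analyze(buggy, "a c"));   // [c], "a" leaked from the first input

        DedupFilter fixed = new DedupFilter(true);
        System.out.println(analyze(fixed, "a b a")); // [a, b]
        System.out.println(analyze(fixed, "a c"));   // [a, c], same as a fresh instance
    }
}
```

A test that only ever runs one input per instance passes for both variants. Only the second call on the same instance exposes the difference, and that second call is the part I do not know how to trigger through an Analyzer.
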
Some TokenFilter prototypes I've seen are stateful and do not reset their state as necessary for reuse. The problem only shows up once I deploy those filters to Solr and index or search for documents, because Solr does not always call create() on the TokenFilterFactory. However, I need some way to catch these problems in unit tests instead of noticing them after a deployment to Solr.

So my question is: how can I unit-test a TokenFilter with an Analyzer that reuses the same TokenFilter instance for more than one input TokenStream?

Kind regards,
Em

On 01.10.2012 19:43, Mikhail Khludnev wrote:
> It's not clear what you want to achieve. I don't always create custom
> TokenStreams, but if I do I use Lucene's as a prototype to start from.
>
> On Mon, Oct 1, 2012 at 6:07 PM, Em <mailformailingli...@yahoo.de> wrote:
>
>> Hi Mikhail,
>>
>> thanks for your feedback.
>>
>> If so, how can I write UnitTests which respect the Reuse strategy?
>> What's the recommended way when creating custom Tokenizers and
>> TokenFilters?
>>
>> Kind regards,
>> Em
>>
>> On 01.10.2012 10:54, Mikhail Khludnev wrote:
>>> Hello,
>>>
>>> Analyzers are reused. Analyzer is Tokenizer and several TokenFilters.
>>> Check the source org.apache.lucene.analysis.Analyzer, pay attention to
>>> reuseStrategy.
>>>
>>> Best regards
>>>
>>> On Sun, Sep 30, 2012 at 5:37 PM, Em <mailformailingli...@yahoo.de> wrote:
>>>
>>>> Hello list,
>>>>
>>>> I saw a bug in a TokenFilter that only works if there is a fresh
>>>> instance created by the TokenFilterFactory, and it seems as if
>>>> TokenFilters are reused somehow for more than one request.
>>>>
>>>> So, if your TokenFilterFactory has a Logging-Statement in its
>>>> create()-method, you see that log only now and again - but not on every
>>>> request.
>>>>
>>>> Is this a bug in Solr 4.0-BETA or is this expected behaviour?
>>>> If it is expected, what could be wrong with the TokenFilter?
>>>>
>>>> Kind regards,
>>>> Em