Analyzer thread safety; Stop words

2006-11-24 Thread Antony Bowesman
Two points about Analyzers: Does anyone have any experience with thread safety of Analyzer implementations? Apart from PerFieldAnalyzerWrapper, the analyzers seem to be thread safe, but is there a requirement that analyzers should be thread safe? Secondly, has anyone thought that it would be

Re: Analyzer thread safety; Stop words

2006-11-24 Thread Yonik Seeley
On 11/24/06, Antony Bowesman <[EMAIL PROTECTED]> wrote: Two points about Analyzers: Does anyone have any experience with thread safety of Analyzer implementations? Apart from PerFieldAnalyzerWrapper, the analyzers seem to be thread safe, but is there a requirement that analyzers should be thre

Re: Analyzer thread safety; Stop words

2006-11-29 Thread Antony Bowesman
Hi Yonik, Thanks for your comments. Secondly, has anyone thought that it would be a good idea to extend the Analyzer interface (Abstract class) to allow a standard way to set stop words? There seem to be two 'families' of stop word configuration via constructors. That belongs at the TokenF

Re: Analyzer thread safety; Stop words

2006-11-29 Thread Yonik Seeley
On 11/29/06, Antony Bowesman <[EMAIL PROTECTED]> wrote: >> seem to be two 'families' of stop word configuration via constructors. > > That belongs at the TokenFilter level (where it currently is). That's true, but all the existing Analyzers allow the stop set to be configured via the analyzer co

Re: Analyzer thread safety; Stop words

2006-11-29 Thread Antony Bowesman
Yonik Seeley wrote: On 11/29/06, Antony Bowesman <[EMAIL PROTECTED]> wrote: That's true, but all the existing Analyzers allow the stop set to be configured via the analyzer constructors, but in different ways. But you can duplicate most Analyzers (all the ones in Lucene?) with a chain of To

Re: Analyzer thread safety; Stop words

2006-11-29 Thread Yonik Seeley
On 11/29/06, Antony Bowesman <[EMAIL PROTECTED]> wrote: Yonik Seeley wrote: > On 11/29/06, Antony Bowesman <[EMAIL PROTECTED]> wrote: >> >> That's true, but all the existing Analyzers allow the stop set to be >> configured >> via the analyzer constructors, but in different ways. > > But you can d

Re: Analyzer thread safety; Stop words

2006-11-29 Thread Yonik Seeley
On 11/29/06, Yonik Seeley <[EMAIL PROTECTED]> wrote: If I were to analyze Greek text, I might do something like this: [...] Hmm, I just discovered that the Porter2 snowball stemmers don't support Greek. Here is the relevant

Re: Analyzer thread safety; Stop words

2006-11-29 Thread Antony Bowesman
Yonik Seeley wrote: On 11/29/06, Antony Bowesman <[EMAIL PROTECTED]> wrote: Yonik Seeley wrote: The GreekAnalyzer is just an example of how you can use existing Analyzers (as long as they have a default constructor), but it's not the recommended approach. TokenFilters are preferred over Analy

Re: Analyzer thread safety; Stop words

2006-11-29 Thread Chris Hostetter
: Something seems confused to me. Although stop words are used by Filters, they : are currently exposed via Analyzers which is the granularity used at the : IndexWriter/Parser levels. This is what contributors are writing, not Filters. that's not really true .. if you look at the various contri