Yonik Seeley wrote:
On 11/29/06, Antony Bowesman <[EMAIL PROTECTED]> wrote:
Yonik Seeley wrote:
The GreekAnalyzer is just an example of how you can use existing
Analyzers (as long as they have a default constructor), but it's not
the recommended approach.
TokenFilters are preferred over Analyzers: you can plug them
together in any way you see fit to solve your analysis problem. For
Solr, an added bonus of using chains of filters is that Solr can
"know" about the results after each filter and show you the results on
an analysis web page (very useful for debugging).
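To make the filter-chain idea concrete, here is a minimal sketch in plain Python. It is only a model of the concept, not Lucene or Solr code: the names whitespace_tokenize, lowercase, make_stop_filter and analyze are all hypothetical and exist only for this illustration. It shows small filters being composed in any order, and the token list being recorded after each stage, which is roughly what an analysis page would need to display.

```python
# Illustrative model only: none of these names are Lucene or Solr APIs.
from typing import Callable, Iterable, List, Tuple

Tokens = List[str]
Filter = Callable[[Tokens], Tokens]


def whitespace_tokenize(text: str) -> Tokens:
    """Split the input on runs of whitespace."""
    return text.split()


def lowercase(tokens: Tokens) -> Tokens:
    """Lowercase every token."""
    return [t.lower() for t in tokens]


def make_stop_filter(stopwords: Iterable[str]) -> Filter:
    """Build a filter that drops any token found in the stopword set."""
    stop = set(stopwords)

    def stop_filter(tokens: Tokens) -> Tokens:
        return [t for t in tokens if t not in stop]

    return stop_filter


def analyze(text: str, filters: List[Filter]) -> List[Tuple[str, Tokens]]:
    """Run the chain and keep the tokens produced after every stage."""
    stages = [("tokenizer", whitespace_tokenize(text))]
    for f in filters:
        # Each filter consumes the output of the previous stage.
        stages.append((f.__name__, f(stages[-1][1])))
    return stages


stages = analyze("The Quick brown Fox", [lowercase, make_stop_filter({"the"})])
for name, tokens in stages:
    print(name, tokens)
```

Reordering or swapping the entries in the filter list changes the analysis without writing a new class for each combination, which is the point of composing filters rather than building one Analyzer per variant.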
If I were to analyze Greek text, I might do something like this:
<fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory"
            synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
    <filter class="solr.SnowballPorterFilterFactory"
            language="Greek"/>
  </analyzer>
</fieldtype>
If you try to put everything in Analyzer constructors, you get
combinatorial explosion.
I guess you would use methods rather than, as you say, getting into constructor
hell. Anyway, I'll have a deeper look at the Solr stuff when I get to phase 2.
Right now, I've gone as far with analysis as I need to, but I would like to
get better configuration than I've currently got. I know it will come back to
bite...
Thanks for your comments, Yonik.
Antony