I think the easiest way is to use Lucene's StandardAnalyzer. If you want to use the Snowball stemmers, you have to add a language guesser that determines the language of each document before you create the analyzer.
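
To illustrate what such a language guesser could look like, here is a minimal sketch in Python rather than Java, just to keep it short. It counts stopword hits per language and picks the best match. The stopword lists, the function names and the fallback to English are all assumptions made for this example, not part of Lucene; a real guesser would use much larger word lists or character n-gram profiles.

```python
import re

# Tiny illustrative stopword lists (an assumption for this sketch; far too
# small for production use).
STOPWORDS = {
    "english": {"the", "and", "of", "to", "is", "in", "that", "it", "for", "with"},
    "french": {"le", "la", "les", "et", "de", "des", "est", "un", "une", "dans", "que", "pour"},
}


def guess_language(text, default="english"):
    """Return the language whose stopwords occur most often in `text`.

    Falls back to `default` when no stopword from any list is found.
    """
    # Split into lowercase runs of letters (accented letters included).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = {
        lang: sum(1 for token in tokens if token in words)
        for lang, words in STOPWORDS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else default


def choose_stemmer_name(text):
    """Hypothetical helper: map the guessed language to a stemmer name
    that the indexing code would then use to build the analyzer."""
    names = {"english": "English", "french": "French"}
    return names[guess_language(text)]
```

At indexing time you would call something like `choose_stemmer_name(document_text)` once per document and build the analyzer from the result. At search time the query is usually too short to guess reliably, so you would probably have to analyze it for both languages and combine the results.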

regards
Bernhard

[EMAIL PROTECTED] wrote:

Greetings everyone

I wonder whether there is a solution for analyzing both English and French documents using the same analyzer.
The reason is that we have predominantly English documents, but there are some French ones, and everything has to go into the same index
and be searchable from the same location during any particular search. Is there a way to analyze both types of documents with
the same analyzer (and if so, which one)?


I've looked around and I see there's a Snowball analyzer, but you have to specify the language of analysis. I do not know the language
ahead of time during indexing, nor do I know it most of the time during searching (users would like to search both document types).


There's also the issue of accented letters in French words and searching for them (how are they indexed in the first place?).
Has anyone dealt with this before, and how did you solve the problem?


thanks

-pedja



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


