I'm trying to build a web search tool that works for multiple
languages. I understand that Lucene ships with StandardAnalyzer plus
German and Russian analyzers, with some more in the sandbox, and that
indexing and searching should use the same analyzer.
On Thursday 20 January 2005 21:08, aurora wrote:
Now let's say I have an index with documents in multiple languages,
analyzed by an assortment of analyzers. When a user enters a query, which
analyzer should be used?
Use q1 OR q2, where q1 is the query parsed with the analyzer for language
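
A rough sketch of the "q1 OR q2" idea, in plain Java so it runs without
Lucene on the classpath. Everything here is a stand-in I made up for
illustration: MultiLanguageQuery, analyze() and the per-language stop word
sets are hypothetical and are not Lucene classes. The toy analyzer just
lowercases, splits on whitespace and drops stop words. With real Lucene you
would presumably parse the user input once per analyzer with QueryParser
and combine the results as optional clauses of a BooleanQuery.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene: builds one sub-query per
// language and joins the sub-queries with OR.
final class MultiLanguageQuery {

    // Toy stand-in for a language-specific analyzer (assumption):
    // lowercase, split on whitespace, drop that language's stop words.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Analyze the same user input once per language, then OR the results.
    // Identical sub-queries are emitted only once.
    static String orOfPerLanguageQueries(String userInput,
                                         Map<String, Set<String>> stopWordsByLanguage) {
        List<String> clauses = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : stopWordsByLanguage.entrySet()) {
            List<String> terms = analyze(userInput, entry.getValue());
            if (terms.isEmpty()) {
                continue; // this language's analyzer removed every token
            }
            String clause = "(" + String.join(" ", terms) + ")";
            if (!clauses.contains(clause)) {
                clauses.add(clause);
            }
        }
        return String.join(" OR ", clauses);
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the languages in insertion order.
        Map<String, Set<String>> stopWords = new LinkedHashMap<>();
        stopWords.put("en", Collections.singleton("the"));
        stopWords.put("de", Collections.singleton("der"));
        // prints: (der hund) OR (hund)
        System.out.println(orOfPerLanguageQueries("Der Hund", stopWords));
    }
}
```

The point of the sketch is only that the same input is analyzed once per
language and the resulting sub-queries are ORed, so a document matches if
it matches under any of the analyzers used at index time.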
Hi Aurora
I developed a tool that deals with this multiple-language issue. I found
a nuke library, language-identifier, very useful. The jar has nuke
dependencies, but I deleted all the code that was unnecessary (for me,
obviously). The language-identifier I use works fine and is very simple:
For example: