Re: English and French documents together / analysis, indexing, searching

[EMAIL PROTECTED] Thu, 20 Jan 2005 10:40:36 -0800

You could try to create a more complex query and expand it into both languages using different analyzers. Would this solve your problem?

Would that mean I would have to actually conduct two searches (one in English and one in French), then merge the results and display them to the user? That sounds like the long way around to me, so might writing an analyzer with a built-in language guesser be a better solution in the long run?
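For what it's worth, expanding the query does not have to mean two searches and a merge: each user term can be analyzed once per language and the results combined into a single OR clause, so one search covers both. Below is a minimal sketch of that idea in plain Java. It does not use the Lucene API; the field names content_en and content_fr and the two toy "analyzers" are assumptions made up for illustration, standing in for real English and French analyzers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class BilingualQueryExpander {

    // Hypothetical stand-in for an English analyzer:
    // lowercases the term and strips a trailing plural "s".
    static String analyzeEnglish(String term) {
        String t = term.toLowerCase(Locale.ROOT);
        return (t.length() > 3 && t.endsWith("s")) ? t.substring(0, t.length() - 1) : t;
    }

    // Hypothetical stand-in for a French analyzer:
    // lowercases the term and drops a one-letter elision prefix such as "l'" or "d'".
    static String analyzeFrench(String term) {
        String t = term.toLowerCase(Locale.ROOT);
        return (t.indexOf('\'') == 1) ? t.substring(2) : t;
    }

    // Builds ONE query string: every user term may match in either language field.
    static String expand(String userQuery) {
        List<String> clauses = new ArrayList<>();
        for (String term : userQuery.trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue; // an empty query yields no clauses
            }
            clauses.add("(content_en:" + analyzeEnglish(term)
                    + " OR content_fr:" + analyzeFrench(term) + ")");
        }
        return String.join(" AND ", clauses);
    }

    public static void main(String[] args) {
        System.out.println(expand("Houses l'arbre"));
    }
}
```

The point of the design is that the per-language work happens at query-construction time, so the index is searched once and there is nothing to merge afterwards. It assumes the English and French text were indexed into separate fields (or at least with the matching analyzer for each document).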

This behaviour is implemented in StandardTokenizer, which is used by StandardAnalyzer. Look at the documentation of StandardTokenizer:

Many applications have specific tokenizer needs. If this tokenizer does not suit your application, please consider copying this source code directory to your project and maintaining your own grammar-based tokenizer.

Hmm, I feel that writing my own tokenizer is beyond my abilities at the moment, without more in-depth knowledge of everything else. Perhaps I'll try taking the StandardTokenizer and expanding or changing it based on other tokenizers available in Lucene, such as WhitespaceTokenizer.
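To give a feel for how little a whitespace-style tokenizer does compared with a grammar-based one, here is a rough sketch in plain Java. It is not Lucene's WhitespaceTokenizer and does not extend any Lucene class; the class and method names are made up for illustration. It only shows the core rule: break on whitespace and leave everything else inside a token untouched.

```java
import java.util.ArrayList;
import java.util.List;

class SimpleWhitespaceTokenizer {

    // Splits on whitespace only; apostrophes, ampersands, hyphens and other
    // punctuation stay inside the token exactly as they were typed.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                // Whitespace ends the current token, if there is one.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        // Flush the last token when the text does not end in whitespace.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("l'arbre  AT&T x-ray"));
    }
}
```

Starting from something this simple and adding only the splitting rules you actually need may be easier than trimming down a full grammar, though which approach fits depends on the documents involved.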


thanks

-pedja


