Dear all,

We are using a Unified Analyzer as our Lucene analyzer so that we can index and search both Arabic and English documents.

Here is the code:

    public TokenStream tokenStream(String FieldName, Reader reader) {
        switch (analysisMode) {
            case UNIFIED:
                // Standard tokens -> exact-token marking -> Arabic stemming
                // -> Snowball (English) stemming -> exact-token construction
                return new ExactTokensContructorFilter(
                    new SnowballFilter(
                        new ArabicStemmer(
                            new ExactTokensSpecifierFilter(
                                getStandardAnalyzerStream(reader)),
                            false, false),
                        latinLanguage));
            case EXACT:
                // Same chain without any stemming
                return new ExactTokensContructorFilter(
                    new ExactTokensSpecifierFilter(
                        getStandardAnalyzerStream(reader)));
        }
        return null;
    }

The problem is that the results of morphological search in English and Arabic are poor. For example, the data I search contains "test", "testing" and "tested". When I search for "testing", "test" does not appear in the search results, although when I traced the analysis I found that the tokens produced for "testing" include "test". However, when I search for "manage", I do get "management" in the search results, which is correct. So what is the difference between the two cases?

In addition, I tried using only the Snowball Analyzer instead of the Unified Analyzer and ran the same test, and this time the results were correct. Can anyone explain why using the Unified Analyzer affects the results?

Note: latinLanguage in the above code = "English"

Thanks & Best Regards,
------------------------------------
Shaimaa Mohamed
Team Leader
ICT Department
Bibliotheca Alexandrina
P.O. Box 138, Chatby
Alexandria 21526, Egypt
Tel: +(203) 483 9999, Ext: 1418
Fax: +(203) 482 0405
Email: [EMAIL PROTECTED]
Web Site: www.bibalex.org