You certainly can - just create your own Analyzer starting with a copy of the French one you are using.

Then you just plug in the filter in the order you want it applied:

result = new ISOLatin1AccentFilter(result);

You have to decide for yourself where it goes. If you put it before the stopword step, more stop words might be removed than if it came after; that kind of choice usually comes down to individual requirements and filter limitations. For example, if your stopword list contains diacritics and you run the accent filter before applying the stopword list, some expected stopwords will never be removed.
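To make the ordering trade-off concrete, here is a minimal plain-Java sketch. It does not use the Lucene API: `AccentOrderDemo`, `fold` and `analyze` are hypothetical names, and `java.text.Normalizer` stands in for the accent filter, so treat it as an illustration of the effect rather than as analyzer code.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java stand-in for a filter chain; this is NOT the Lucene API.
class AccentOrderDemo {

    // Strip combining marks after NFD decomposition, e.g. "\u00e0" (a-grave) -> "a".
    static String fold(String token) {
        return Normalizer.normalize(token, Normalizer.Form.NFD)
                .replaceAll("\\p{M}", "");
    }

    // Remove stopwords and fold accents. If foldFirst is true, the stopword
    // lookup sees the folded token; otherwise it sees the original token.
    static List<String> analyze(List<String> tokens, Set<String> stopwords, boolean foldFirst) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            String candidate = foldFirst ? fold(token) : token;
            if (stopwords.contains(candidate)) {
                continue; // dropped as a stopword
            }
            out.add(fold(token));
        }
        return out;
    }

    public static void main(String[] args) {
        // "la lumi\u00e8re \u00e0 paris", written with escapes to avoid source-encoding issues
        List<String> tokens = Arrays.asList("la", "lumi\u00e8re", "\u00e0", "paris");
        Set<String> accentedStops = new HashSet<>(Arrays.asList("la", "\u00e0"));
        Set<String> plainStops = new HashSet<>(Arrays.asList("la", "a"));

        // Accented stop list: folding first hides the accented stopword from the lookup.
        System.out.println(analyze(tokens, accentedStops, false)); // [lumiere, paris]
        System.out.println(analyze(tokens, accentedStops, true));  // [lumiere, a, paris]

        // Unaccented stop list: folding first removes more stopwords.
        System.out.println(analyze(tokens, plainStops, false));    // [lumiere, a, paris]
        System.out.println(analyze(tokens, plainStops, true));     // [lumiere, paris]
    }
}
```

With an accented stop list the filter belongs after the stopword step; with an unaccented list it belongs before it.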


Christophe from paris wrote:
Actually, in my FrenchAnalyzer I have:

    TokenStream result = new StandardTokenizer(reader);
    result = new StandardFilter(result);
    result = new StopFilter(result, stoptable);
    result = new FrenchStemFilter(result, excltable);
    result = new LowerCaseFilter(result);


Can I use ISOLatin1AccentFilter in this class for both indexing and search?
And if so, where should it go?


markrmiller wrote:
Check out org.apache.lucene.analysis.ISOLatin1AccentFilter

It will strip diacritics; just be sure to use it at both index time and query time to get what you want. Also, you will no longer be able to differentiate between accented and unaccented forms in your searching (rarely that important in my opinion, but others certainly disagree).
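As a rough illustration of why the folding has to happen on both sides, here is a plain-Java sketch. It is not Lucene code: `FoldBothSidesDemo`, `index` and `matches` are hypothetical names, a `Set` of terms stands in for the index, and `java.text.Normalizer` stands in for the accent filter.

```java
import java.text.Normalizer;
import java.util.HashSet;
import java.util.Set;

// Toy model of term matching; this is NOT the Lucene API.
class FoldBothSidesDemo {

    // Strip combining marks after NFD decomposition.
    static String fold(String term) {
        return Normalizer.normalize(term, Normalizer.Form.NFD)
                .replaceAll("\\p{M}", "");
    }

    // Build a toy "index": the set of terms, optionally folded at index time.
    static Set<String> index(String[] words, boolean foldAtIndexTime) {
        Set<String> terms = new HashSet<>();
        for (String word : words) {
            terms.add(foldAtIndexTime ? fold(word) : word);
        }
        return terms;
    }

    // A query matches only if its (optionally folded) term is in the index.
    static boolean matches(Set<String> terms, String query, boolean foldAtQueryTime) {
        return terms.contains(foldAtQueryTime ? fold(query) : query);
    }

    public static void main(String[] args) {
        String accented = "lumi\u00e8re"; // "lumiere" with e-grave
        Set<String> folded = index(new String[] {"lumiere", accented}, true);

        System.out.println(matches(folded, accented, true));  // true
        System.out.println(matches(folded, accented, false)); // false: query not folded
        System.out.println(folded.size());                    // 1: both spellings share one term
    }
}
```

The last line shows the cost mentioned above: once both spellings collapse to one term, a search can no longer tell them apart.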

- Mark

Christophe from paris wrote:
Hello

I use FrenchAnalyzer for indexing:

IndexWriter writer = new IndexWriter(pathOfIndex, new FrenchAnalyzer(), true);
Document doc = new Document();
doc.add(new Field("TXT_CHARACT_VALUE", word.toLowerCase(), Field.Store.YES, Field.Index.TOKENIZED));
writer.addDocument(doc);

And search:

IndexReader reader = IndexReader.open(pathOfIndex);
Searcher searcher = new IndexSearcher(reader);
Analyzer analyzer = new FrenchAnalyzer();

QueryParser parser = new QueryParser(field, analyzer);

Query query = parser.parse(motRecherche);
Hits hits = searcher.search(query);

In my documents I have the words "lumiere" and "lumière".

When I search for "lumière", only documents containing "lumière" match; "lumiere" is not returned.

If I search for "lumiere", the results are lumiere, lumieres, lumiére, lumiéres, but not lumière.

For a complete match I must search for "lumiere OR lumière", but that is not the best solution.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





