On Jan 26, 2006, at 7:26 PM, arnaudbuffet wrote:
I cannot find the ISOLatin1AccentFilter class in my Lucene jar, but I found one via Google (attached to this mail). Could you tell me if it is the right one?

This used to be in contrib/analyzers but has been moved into the core (Subversion only for now):

http://svn.apache.org/repos/asf/lucene/java/trunk/src/java/org/apache/lucene/analysis/

I do not see anything in this class which can help me. This program will replace some accent characters but my problem is:

If I try to index a text file encoded in Western 1252, for example with the Turkish text "düzenlediğimiz kampanyamıza", the Lucene index will contain re-encoded data such as �k�� ....

Reading encoded files is your application's responsibility. You need to read the files using the proper encoding. Once the text has been decoded correctly into Java strings, Lucene will index the characters without trouble.
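
As a rough illustration of reading with an explicit encoding, here is a minimal sketch in modern Java. The class name EncodedFileReader and the method readFile are hypothetical names made up for this example, not part of Lucene, and the charset you pass in (for instance "windows-1254" for Turkish text) is an assumption you have to verify against your own files. The only point is that the charset is named explicitly instead of falling back to the platform default.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not a Lucene class.
public class EncodedFileReader {

    // Decodes the whole file with the given charset rather than the platform default.
    public static String readFile(Path path, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(Files.newInputStream(path), charset))) {
            char[] buffer = new char[4096];
            int count;
            while ((count = reader.read(buffer)) != -1) {
                text.append(buffer, 0, count);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: EncodedFileReader <file> <charset-name>");
            return;
        }
        // args[1] is the charset the file was actually written in (an assumption
        // about your data); Charset.forName throws if the JVM does not support it.
        String text = readFile(Paths.get(args[0]), Charset.forName(args[1]));
        System.out.println(text);
    }
}
```

The decoded String (or a Reader built the same way) is what should be handed to Lucene when the document is built. If the charset named here does not match the one the file was written in, the text is already garbled before Lucene sees it.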

        Erik

