On Jan 26, 2006, at 7:26 PM, arnaudbuffet wrote:
> I cannot find the ISOLatin1AccentFilter class in my Lucene jar, but
> I found one via Google, attached to this mail. Could you tell me if it
> is the right one?
This used to be in contrib/analyzers but has been moved into the core
(Subversion only for now):
http://svn.apache.org/repos/asf/lucene/java/trunk/src/java/org/apache/lucene/analysis/
> I do not see anything in this class that can help me. This program
> replaces some accented characters, but my problem is this:
> if I try to index a text file encoded in Western 1252, for example
> with the Turkish text "düzenlediğimiz kampanyamıza", the Lucene
> index will contain re-encoded data such as �k�� ....
Reading encoded files is your application's responsibility. You need
to be sure to read the files in using the proper encoding. Once the
text is read properly into Java, Lucene will index the characters
correctly.
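
A minimal sketch of what "reading with the proper encoding" might look
like, using only standard java.io and java.nio.charset classes. The
class name EncodingAwareReader and the method readAll are hypothetical,
made up for this illustration. It also assumes the file is really
windows-1254 (the Turkish code page), since Windows-1252 has no byte
for "ğ" or "ı", and that your JVM ships that charset:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical helper, not part of Lucene.
public class EncodingAwareReader {

    // Decodes the whole stream with the charset the caller names,
    // instead of relying on the platform default encoding.
    public static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "düzenlediğimiz kampanyamıza", written with Unicode escapes so the
        // example does not depend on the encoding of this source file.
        String text = "d\u00FCzenledi\u011Fimiz kampanyam\u0131za";

        // Assumption: the file on disk was saved as windows-1254.
        Charset turkish = Charset.forName("windows-1254");
        byte[] fileBytes = text.getBytes(turkish);

        // Same bytes, decoded with the right and with the wrong charset.
        String right = readAll(new ByteArrayInputStream(fileBytes), turkish);
        String wrong = readAll(new ByteArrayInputStream(fileBytes),
                               Charset.forName("windows-1252"));

        System.out.println("decoded as windows-1254 matches: " + right.equals(text));
        System.out.println("decoded as windows-1252 matches: " + wrong.equals(text));
    }
}
```

For a real file you would pass a FileInputStream instead of the
in-memory bytes. The point is only that the charset is chosen
explicitly before any text reaches the analyzer.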
Erik