I added "è" => "e" to mapping-ISOLatin1Accent.txt
and added the following fieldType:

<fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

But I still have the same problem! It only works when I store an ISO string in the UTF-8 database (e.g. storing solène, not solÃ¨ne)... :,(

aerox7 wrote:
>
> ==> where are you seeing it as "SolÃ¨ne" as opposed to the
> correct way of solène?
>
> I have "SolÃ¨ne" in my MySQL database, so I don't know whether this is
> correct or not. I guess that "SolÃ¨ne" is solène in UTF-8?
>
> I tried the analysis page at http://localhost:8983/solr/admin/analysis.jsp.
> When I try with solène, everything is OK! But when I try with SolÃ¨ne
> (which is what I have in the DB), the analysis converts Ã to A and
> deletes ¨, so I get SolAne!
>
> I think that ISOLatin1AccentFilterFactory only accepts strings in the
> ISO-8859-1 charset.
>
> So is there any solution to transform my string to ISO-8859-1 before the
> indexing process? Maybe by creating a transformer in DataImportHandler?
> (I have never coded in Java :( )
>
> Thank you all.
>
>
> Koji Sekiguchi-2 wrote:
>>
>> aerox7 wrote:
>>> Hi,
>>> I have a MySQL database in UTF-8. I have a row with "SolÃ¨ne" (solène).
>>> I want to transform this to solene, so I use Solr's
>>> ISOLatin1AccentFilterFactory to perform this task, but it doesn't work!
>>>
>>> I guess that "SolÃ¨ne" is "solène" in UTF-8? I also set Tomcat to UTF-8,
>>> so normally ISOLatin1AccentFilterFactory should replace the accent.
>>>
>>> Any ideas?
>>>
>>> I use DataImportHandler.
>>>
>>
>> If a mapping rule "è" to "e" is always true in your field, you can try
>> to use MappingCharFilter instead of ISOLatin1AccentFilter.
>> Add the following line to mapping-ISOLatin1Accent.txt:
>>
>> "è" => "e"
>>
>> and add the following fieldType:
>>
>> <fieldType name="textCharNorm" class="solr.TextField"
>>            positionIncrementGap="100">
>>   <analyzer>
>>     <charFilter class="solr.MappingCharFilterFactory"
>>                 mapping="mapping-ISOLatin1Accent.txt"/>
>>     <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
>>   </analyzer>
>> </fieldType>
>>
>> MappingCharFilter and mapping-ISOLatin1Accent.txt are in the nightly build.
>>
>> Koji
>>
>

--
View this message in context: http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22617278.html
Sent from the Solr - User mailing list archive at Nabble.com.
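
For reference, the encoding mix-up suspected in this thread can be reproduced outside Solr. The sketch below is only an illustration, written in Python rather than as a DataImportHandler transformer, and the helper name repair is hypothetical. It assumes the stored value is the result of UTF-8 bytes being read as ISO-8859-1, which may not be what actually happened in this database.

```python
# Reproduce the suspected mix-up, then reverse it.
correct = "sol\u00e8ne"  # the intended word, with U+00E8 (e with grave accent)

# UTF-8 encodes U+00E8 as the two bytes 0xC3 0xA8. Reading those bytes as
# ISO-8859-1 yields two characters instead of one: U+00C3 and U+00A8.
garbled = correct.encode("utf-8").decode("iso-8859-1")


def repair(text):
    """Undo a UTF-8-read-as-ISO-8859-1 misdecoding (hypothetical helper).

    Returns the text unchanged when it does not look like that kind of
    mojibake, so already-correct strings pass through untouched.
    """
    try:
        return text.encode("iso-8859-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


print(len(correct), len(garbled))  # 6 7
print(repair(garbled) == correct)  # True
print(repair(correct) == correct)  # True
```

If the garbled form has one more character than the intended word, as here, the accent filter sees two separate Latin-1 characters rather than a single accented letter, which would match the SolAne result reported on the analysis page.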