Please note that mapping characters works well for a small set of characters, but if you want full Unicode normalization, take a look at the ICUFoldingFilter:
https://lucene.apache.org/solr/guide/6_6/filter-descriptions.html#FilterDescriptions-ICUFoldingFilter
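
For illustration, a field type using that filter might look like the sketch below. The name text_folded and the choice of StandardTokenizerFactory are assumptions of mine, not something taken from this thread. solr.ICUFoldingFilterFactory lives in the analysis-extras contrib, so the ICU jars have to be on Solr's classpath. A single <analyzer> element is applied at both index and query time, and existing documents have to be reindexed after the change.

```xml
<!-- Sketch only: "text_folded" is a hypothetical name; adapt the tokenizer to your needs. -->
<fieldType name="text_folded" class="solr.TextField" positionIncrementGap="100">
  <!-- One analyzer with no "type" attribute is used for both indexing and querying. -->
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Folds case and removes accents, so "avarié" and "avarie" produce the same token. -->
    <filter class="solr.ICUFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```

Since the ICU folding filter also folds case, a separate LowerCaseFilterFactory should not be needed in this chain.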
--Ere

elisabeth benoit wrote on 8.2.2019 at 22.47:
> yes you do
>
> and use the char filter at index and query time
>
> On Fri, Feb 8, 2019 at 19:20, SAUNIER Maxence <msaun...@q1c1.fr> wrote:
>
>> For the charFilter, do I need to reindex all documents?
>>
>> -----Original Message-----
>> From: Erick Erickson <erickerick...@gmail.com>
>> Sent: Friday, February 8, 2019 18:03
>> To: solr-user <solr-user@lucene.apache.org>
>> Subject: Re: Ignore accent in a request
>>
>> Elisabeth's suggestion is spot on for the accent.
>>
>> One other thing I noticed. You are using KeywordTokenizerFactory combined
>> with EdgeNGramFilterFactory. This implies that you can't search for
>> individual _words_, only prefix queries, i.e.
>> je
>> je s
>> je su
>> je sui
>> je suis
>>
>> You can't search for "suis" for instance.
>>
>> basically this is an efficient way to search anything starting with
>> three-or-more letter prefixes at the expense of index size. You might be
>> better off just using wildcards (restrict to three letters at the prefix
>> though).
>>
>> This is perfectly valid, I'm mostly asking if it's your intent.
>>
>> Best,
>> Erick
>>
>> On Fri, Feb 8, 2019 at 9:35 AM SAUNIER Maxence <msaun...@q1c1.fr> wrote:
>>>
>>> Thank you!
>>>
>>> -----Original Message-----
>>> From: elisabeth benoit <elisaelisael...@gmail.com>
>>> Sent: Friday, February 8, 2019 14:12
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Ignore accent in a request
>>>
>>> Hello,
>>>
>>> We use solr 7 and use
>>>
>>> <charFilter class="solr.MappingCharFilterFactory"
>>> mapping="mapping-ISOLatin1Accent.txt"/>
>>>
>>> with mapping-ISOLatin1Accent.txt
>>>
>>> containing lines like
>>>
>>> # À => A
>>> "\u00C0" => "A"
>>>
>>> # Á => A
>>> "\u00C1" => "A"
>>>
>>> # Â => A
>>> "\u00C2" => "A"
>>>
>>> # Ã => A
>>> "\u00C3" => "A"
>>>
>>> # Ä => A
>>> "\u00C4" => "A"
>>>
>>> # Å => A
>>> "\u00C5" => "A"
>>>
>>> # Ā Ă Ą =>
>>> "\u0100" => "A"
>>> "\u0102" => "A"
>>> "\u0104" => "A"
>>>
>>> # Æ => AE
>>> "\u00C6" => "AE"
>>>
>>> # Ç => C
>>> "\u00C7" => "C"
>>>
>>> # é => e
>>> "\u00E9" => "e"
>>>
>>> Best regards,
>>> Elisabeth
>>>
>>> On Fri, Feb 8, 2019 at 11:18, Gopesh Sharma <gopesh_sha...@gensler.com> wrote:
>>>
>>>> We have fixed this type of issue by using Synonyms by adding
>>>> SynonymFilterFactory (before Solr 7).
>>>>
>>>> -----Original Message-----
>>>> From: SAUNIER Maxence <msaun...@q1c1.fr>
>>>> Sent: Friday, February 8, 2019 3:36 PM
>>>> To: solr-user@lucene.apache.org
>>>> Subject: RE: Ignore accent in a request
>>>>
>>>> Hello,
>>>>
>>>> Thanks for your answer.
>>>>
>>>> I have tested:
>>>>
>>>> select?defType=dismax&q=je suis avarié&qf=content
>>>> 90.000 results
>>>>
>>>> select?defType=dismax&q=je suis avarie&qf=content
>>>> 60.000 results
>>>>
>>>> With avarié, I don't find documents with avarie, and with avarie, I
>>>> don't find documents with avarié.
>>>>
>>>> I want to find the 150.000 documents with avarié or avarie.
>>>>
>>>> Thanks
>>>>
>>>> -----Original Message-----
>>>> From: Erick Erickson <erickerick...@gmail.com>
>>>> Sent: Thursday, February 7, 2019 19:37
>>>> To: solr-user <solr-user@lucene.apache.org>
>>>> Subject: Re: Ignore accent in a request
>>>>
>>>> exactly _how_ is it "not working"?
>>>>
>>>> Try building your parameters _up_ rather than starting with a lot, e.g.
>>>> select?defType=dismax&q=je suis avarié&qf=title
>>>> ^^ assumes you expect a match on title. Then:
>>>> select?defType=dismax&q=je suis avarié&qf=title subject
>>>>
>>>> etc.
>>>>
>>>> Because mm=757 looks really wrong. From the docs:
>>>> Defines the minimum number of clauses that must match, regardless of
>>>> how many clauses there are in total.
>>>>
>>>> edismax is used much more than dismax as it's more flexible, but
>>>> that's not germane here.
>>>>
>>>> finally, try adding &debug=query to the url to see exactly how the
>>>> query is parsed.
>>>>
>>>> Best,
>>>> Erick
>>>>
>>>> On Mon, Feb 4, 2019 at 9:09 AM SAUNIER Maxence <msaun...@q1c1.fr> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> How can I ignore accents in the query results?
>>>>>
>>>>> Request:
>>>>> http://*****:8983/solr/***/select?defType=dismax&q=je+suis+avarié&qf=title%5e20+subject%5e15+category%5e1+content%5e0.5&mm=757
>>>>>
>>>>> I want to get docs with both avarié and avarie.
>>>>>
>>>>> I have added this to my schema:
>>>>>
>>>>> {
>>>>>   "name": "string",
>>>>>   "positionIncrementGap": "100",
>>>>>   "analyzer": {
>>>>>     "filters": [
>>>>>       {
>>>>>         "class": "solr.LowerCaseFilterFactory"
>>>>>       },
>>>>>       {
>>>>>         "class": "solr.ASCIIFoldingFilterFactory"
>>>>>       },
>>>>>       {
>>>>>         "class": "solr.EdgeNGramFilterFactory",
>>>>>         "minGramSize": "3",
>>>>>         "maxGramSize": "50"
>>>>>       }
>>>>>     ],
>>>>>     "tokenizer": {
>>>>>       "class": "solr.KeywordTokenizerFactory"
>>>>>     }
>>>>>   },
>>>>>   "stored": true,
>>>>>   "indexed": true,
>>>>>   "sortMissingLast": true,
>>>>>   "class": "solr.TextField"
>>>>> },
>>>>>
>>>>> But it is not working.
>>>>>
>>>>> Thanks.
>>>>
>>
>

--
Ere Maijala
Kansalliskirjasto / The National Library of Finland