Yes, you could do that. I guess numbers will give you trouble under all circumstances.
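
For what it's worth, here is a rough sketch of the two-field setup you describe. It is untested, and the type and field names (text_de, text_de_phonetic, content, content_phonetic) are made up for illustration. The analyzer chain is copied from your mail. The digit-stripping step in the phonetic type is my assumption, based on your pattern-filter idea, so check the result on the analysis page before relying on it.

```xml
<!-- Sketch only: all type and field names below are hypothetical. -->

<!-- Plain field type: stemming but no phonetic filter, so "13" stays searchable as a number. -->
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
  </analyzer>
</fieldType>

<!-- Phonetic field type: same chain, plus digit removal (assumption) and inject="false". -->
<fieldType name="text_de_phonetic" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Assumption: strip digits, then drop the tokens that became empty. -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="[0-9]+" replacement="" replace="all"/>
    <filter class="solr.LengthFilterFactory" min="1" max="255"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
    <filter class="solr.PhoneticFilterFactory"
            encoder="ColognePhonetic" inject="false"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

<field name="content" type="text_de" indexed="true" stored="true"/>
<field name="content_phonetic" type="text_de_phonetic" indexed="true" stored="false"/>

<!-- Feed the phonetic field from the same source text. -->
<copyField source="content" dest="content_phonetic"/>
```
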
You may be able to search against your non-phonetic field as well, with a
higher boost, so that those matches are preferred.

Best
Erick

On Tue, Feb 7, 2012 at 2:30 PM, Dirk Högemann <dirk.hoegem...@googlemail.com> wrote:
> Thanks Erick.
> In the first place we thought of removing numbers with a pattern filter.
> Setting inject to false will have the "same" effect.
> If we want to be able to search for numbers in the content this solution
> will not work, but another field without phonetic filtering and searching in
> both fields would be ok, right?
>
> Dirk
> On 07.02.2012 14:01, "Erick Erickson" <erickerick...@gmail.com> wrote:
>
>> What happens if you do NOT inject? Setting inject="false"
>> stores only the phonetic reduction, not the original text. In that
>> case your false match on "13" would go away....
>>
>> Not sure what that means for the rest of your app though.
>>
>> Best
>> Erick
>>
>> On Mon, Feb 6, 2012 at 5:44 AM, Dirk Högemann
>> <dirk.hoegem...@googlemail.com> wrote:
>> > Hi,
>> >
>> > I have a question on phonetic search and matching in solr.
>> > In our application all the content of an article is written to a
>> > full-text search field, which provides stemming and a phonetic filter
>> > (Cologne phonetic for German).
>> > This is the relevant part of the configuration for the index analyzer
>> > (search is analogous):
>> >
>> >     <tokenizer class="solr.StandardTokenizerFactory"/>
>> >     <filter class="solr.WordDelimiterFilterFactory"
>> >             generateWordParts="1" generateNumberParts="1" catenateWords="0"
>> >             catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
>> >     <filter class="solr.LowerCaseFilterFactory"/>
>> >     <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
>> >     <filter class="solr.PhoneticFilterFactory"
>> >             encoder="ColognePhonetic" inject="true"/>
>> >     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>> >
>> > Unfortunately this sometimes results in strange, but also explainable,
>> > matches.
>> > For example:
>> >
>> > Content field indexes the following String: Donnerstag von 13 bis 17 Uhr.
>> >
>> > This results in a match if we search for "puf", as the result of the
>> > phonetic filter for this is 13.
>> > (As a consequence the 13 is then also highlighted.)
>> >
>> > Does anyone have an idea how to handle this in a reasonable way, so that a
>> > search for "puf" does not match 13 in the content?
>> >
>> > Thanks in advance!
>> >
>> > Dirk