Hi,
thanks for your reply. My definitions are below. If I understand
correctly, a request that searches in the field "*_txt_de" or "text_de"
or "text_general" should work with these definitions?

<dynamicField name="*_txt_de" type="text_de" indexed="true" stored="true"/>

<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="lang/stopwords_de.txt" format="snowball"/>
    <filter class="solr.GermanNormalizationFilterFactory"/>
    <filter class="solr.GermanLightStemFilterFactory"/>
    <filter class="solr.BeiderMorseFilterFactory" nameType="GENERIC"
            ruleType="APPROX" concat="true" languageSet="auto"/>
  </analyzer>
</fieldType>

<fieldType name="text_general" class="solr.TextField"
           positionIncrementGap="100" multiValued="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.BeiderMorseFilterFactory" nameType="GENERIC"
            ruleType="APPROX" concat="true" languageSet="auto"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.BeiderMorseFilterFactory" nameType="GENERIC"
            ruleType="APPROX" concat="true" languageSet="auto"/>
  </analyzer>
</fieldType>

Thank you,
Christian
On Tue, 6 Jul 2021 at 18:39, Alexandre Rafalovitch <[email protected]> wrote:
> Your fields are indexed with a particular field definition. That
> field definition has an index-time and a query-time analysis pipeline
> (they can be the same). When you search against that field (either by
> default or explicitly), the query goes through the associated pipeline.
> So that is where the matching happens.
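>
> A minimal sketch of "by default or explicitly", assuming a hypothetical
> field called name_txt_de (it would be picked up by the *_txt_de dynamic
> field; the name is not from this thread). No special keyword is needed:
> q=mueller searches the default field, q=name_txt_de:mueller names the
> field explicitly, and in both cases the field's own analyzer is applied.
>
> ```xml
> <!-- solrconfig.xml fragment; illustrative only, name_txt_de is an assumed field name -->
> <requestHandler name="/select" class="solr.SearchHandler">
>   <lst name="defaults">
>     <!-- Field searched when the query does not name one, e.g. q=mueller -->
>     <str name="df">name_txt_de</str>
>   </lst>
> </requestHandler>
> ```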
>
> I don't know if that's directly helpful, but I did a demo a while ago
> with searching in English against Thai text using phonetic matching.
> It is at:
> https://github.com/arafalov/solr-thai-test/blob/master/collection1/conf/schema.xml#L34-L55
>
> Regards,
> Alex.
> P.S. Remember that you can double-index the same text (with copyField)
> and the second (indexed/not-stored) copy can be processed much more
> strictly or just differently; then you can search both fields but put
> different weights on the stricter one. So, "jones" will match/rank
> "Jones" first, and "johns" second.
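>
> A rough sketch of that double-indexing idea. The names text_strict,
> name_txt_de and name_strict and the boost of 5 are all assumptions made
> up for illustration; text_de is the phonetic type from this thread.
>
> ```xml
> <!-- schema.xml fragment: a stricter, non-phonetic type (hypothetical name) -->
> <fieldType name="text_strict" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> <!-- name_txt_de is covered by the *_txt_de dynamic field; the copy is indexed but not stored -->
> <field name="name_strict" type="text_strict" indexed="true" stored="false"/>
> <copyField source="name_txt_de" dest="name_strict"/>
>
> <!-- solrconfig.xml fragment: search both fields, weighting the strict copy higher -->
> <requestHandler name="/select" class="solr.SearchHandler">
>   <lst name="defaults">
>     <str name="defType">edismax</str>
>     <str name="qf">name_strict^5 name_txt_de</str>
>   </lst>
> </requestHandler>
> ```
>
> With something like this, an exact hit matches in both fields and should
> score above a hit that matches only in the phonetic field.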
>
> On Tue, 6 Jul 2021 at 10:59, Christian Havel <[email protected]>
> wrote:
> >
> > Hi,
> >
> > thanks a lot. What should my request look like? Is the phonetic search
> > "activated" by a special "keyword" in the request?
> >
> >
> > On Tue, 29 Jun 2021 at 06:04, TK Solr <[email protected]> wrote:
> >
> > > According to the javadoc
> > > https://lucene.apache.org/core/8_9_0/analyzers-phonetic/org/apache/lucene/analysis/phonetic/BeiderMorseFilterFactory.html
> > > BeiderMorseFilterFactory is supposed to be used after the
> > > StandardTokenizer.
> > >
> > > Most likely GermanNormalizationFilterFactory and
> > > GermanLightStemFilterFactory shouldn't be used with
> > > BeiderMorseFilterFactory. After the words are stemmed, the
> > > pronunciation of the resulting stems can't be matched.
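> > >
> > > An untested sketch of what that would mean for the schema: the same
> > > filters as the text_de type quoted below, minus the normalization
> > > and stemming filters. The type name text_de_phonetic is made up
> > > for illustration.
> > >
> > > ```xml
> > > <!-- Hypothetical type: BeiderMorse applied to unstemmed tokens from StandardTokenizer -->
> > > <fieldType name="text_de_phonetic" class="solr.TextField" positionIncrementGap="100">
> > >   <analyzer>
> > >     <tokenizer class="solr.StandardTokenizerFactory"/>
> > >     <filter class="solr.LowerCaseFilterFactory"/>
> > >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> > >             words="lang/stopwords_de.txt" format="snowball"/>
> > >     <filter class="solr.BeiderMorseFilterFactory" nameType="GENERIC"
> > >             ruleType="APPROX" concat="true" languageSet="auto"/>
> > >   </analyzer>
> > > </fieldType>
> > > ```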
> > >
> > > On the other hand, if you just want to match German words spelled using
> > > different standards (ß <-> ss), GermanNormalizationFilterFactory should be
> > > enough. You don't need BeiderMorseFilterFactory.
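> > >
> > > A minimal, untested sketch of that normalization-only alternative;
> > > the type name text_de_norm is hypothetical. As far as I understand
> > > the Lucene javadoc for GermanNormalizationFilter, it also folds
> > > "ü" and most occurrences of "ue" to "u", which would cover the
> > > mueller/müller case, but that is worth verifying on the Analysis
> > > screen of the Solr admin UI.
> > >
> > > ```xml
> > > <!-- Hypothetical type: spelling normalization only, no stemming, no phonetic encoding -->
> > > <fieldType name="text_de_norm" class="solr.TextField" positionIncrementGap="100">
> > >   <analyzer>
> > >     <tokenizer class="solr.StandardTokenizerFactory"/>
> > >     <filter class="solr.LowerCaseFilterFactory"/>
> > >     <filter class="solr.GermanNormalizationFilterFactory"/>
> > >   </analyzer>
> > > </fieldType>
> > > ```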
> > >
> > > P.S. I'm not a German speaker and I haven't actually tested the above
> > > claim. I'm just speculating.
> > >
> > >
> > > On 6/28/21 7:25 AM, Christian Havel wrote:
> > > > Hi,
> > > >
> > > > I am using Solr 8.8.1 and want to use the phonetic search option. To do
> > > > this, I modified my schema.xml file and rebuilt the index.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > <!-- German -->
> > > > <dynamicField name="*_txt_de" type="text_de" indexed="true" stored="true"/>
> > > > <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
> > > >   <analyzer>
> > > >     <tokenizer class="solr.StandardTokenizerFactory"/>
> > > >     <filter class="solr.LowerCaseFilterFactory"/>
> > > >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> > > >             words="lang/stopwords_de.txt" format="snowball"/>
> > > >     <filter class="solr.GermanNormalizationFilterFactory"/>
> > > >     <filter class="solr.GermanLightStemFilterFactory"/>
> > > >     <filter class="solr.BeiderMorseFilterFactory" nameType="GENERIC"
> > > >             ruleType="APPROX" concat="true" languageSet="auto"/>
> > > >     <!-- less aggressive: <filter class="solr.GermanMinimalStemFilterFactory"/> -->
> > > >     <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="German2"/> -->
> > > >   </analyzer>
> > > > </fieldType>
> > > >
> > > > I hope that searching for "mueller" also finds contacts with "müller",
> > > > but it seems to have no effect.
> > > > Do you have any idea what could be missing?
> > > >
> > > > Thanks,
> > > > Christian
> > > >
> > >
> > >
>