Hi Annika,

Could you please share a sample query and show how it is currently being expanded?
Please also share how you expect it to be expanded.
That would help us replicate your scenario and understand the problem better.
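
One way to capture that expansion is the debug output of the query parser.
Below is a minimal sketch of a request handler for solrconfig.xml, assuming a
hypothetical handler name /select-debug and a catch-all field called _text_
(both are placeholders, not taken from your setup). With debugQuery enabled,
the parsedquery entry in the debug section of the response shows how the
query was expanded.

```xml
<!-- Hypothetical debugging handler; the name and the df field are assumptions -->
<requestHandler name="/select-debug" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Use the edismax parser and do not split the query on whitespace -->
    <str name="defType">edismax</str>
    <str name="sow">false</str>
    <!-- Placeholder default field; replace with your own catch-all field -->
    <str name="df">_text_</str>
    <!-- Return the parsed query and scoring explanations with every response -->
    <str name="debugQuery">true</str>
  </lst>
</requestHandler>
```

A request such as /select-debug?q="anti-tigit antibody" should then return the
parsed query alongside the results.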

Best Regards,
Atin Janki


On Tue, Mar 5, 2024 at 4:21 PM elisabeth benoit <[email protected]>
wrote:

> Hello Annika,
>
> For multi-word synonyms, we have been using the
> https://github.com/healthonnet/hon-lucene-synonyms jar, which we just
> rebuilt against Solr 9.2.1 (a modification is needed; let me know if you
> need details).
>
> It overrides the edismax query parser and expands multi-word synonyms at
> query time.
>
> We didn't want to expand synonyms at index time because we had this problem:
>
> in the index: mairie
> synonym: hotel de ville
>
> and then at query time, with query 'hotel', mairie would match.
>
> With hon-lucene, when the user asks for "hotel de ville", we match mairie,
> but "hotel" alone doesn't match mairie.
>
> You might have performance issues with hon-lucene if you have hundreds of
> synonyms, but it's worth testing.
>
> Best regards,
> Elisabeth
>
> On Mon, Mar 4, 2024 at 17:16, Mikhail Khludnev <[email protected]> wrote:
>
> > Hello Annika,
> > You can use the Solr Admin Analysis page, together with the debugQuery and
> > explainOther params, to dig into a particular case. It's usually tough.
> > I've found one clue in the ref guide:
> >   "To get fully correct positional queries when your synonym replacements
> >   are multiple tokens, you should instead apply synonyms using this filter
> >   at query time."
> > You could probably start with something simple.
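> >
> > A minimal sketch of such a simple starting point, assuming a hypothetical
> > field type name text_syn_query and reusing the country-synonyms.txt file
> > from your message. Synonyms are applied only in the query analyzer, so no
> > FlattenGraphFilter is needed on the index side:
> >
> > ```xml
> > <!-- Hypothetical field type; the name text_syn_query is an assumption -->
> > <fieldType name="text_syn_query" class="solr.TextField"
> >            positionIncrementGap="100">
> >   <!-- Index side: tokenize and lowercase only, no synonym expansion -->
> >   <analyzer type="index">
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> >   <!-- Query side: same chain, plus synonym expansion as the last step -->
> >   <analyzer type="query">
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.SynonymGraphFilterFactory"
> >             synonyms="country-synonyms.txt" ignoreCase="true" expand="true"/>
> >   </analyzer>
> > </fieldType>
> > ```
> >
> > Once that works for a single multi-word entry, stemming and other filters
> > can be added back one at a time.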
> >
> > On Mon, Mar 4, 2024 at 5:23 PM Annika Gable
> > <[email protected]> wrote:
> >
> > > Hello,
> > >
> > > I'm using Solr 9.1, and I'm trying to set up synonyms. I managed to get
> > > synonyms to work for single-word synonyms, but not for multiword and
> > > hyphenated synonyms.
> > >
> > > In the final state, I am planning on having a very extensive synonym file
> > > (hundreds, if not thousands of lines) because I want to always find results
> > > for all child terms and other synonyms of a given search term. This is why
> > > I thought it may make sense to list all synonyms in the index. But getting
> > > it to work with query-time synonym expansion would also be great already.
> > >
> > > For now, I am testing with equivalent synonyms. I am always querying using
> > > quotation marks around the multi-word query.
> > >
> > > What I have tried:
> > > 1. I included sow=false in the query as recommended here:
> > > https://lucidworks.com/post/multi-word-synonyms-solr-adds-query-time-support/
> > >
> > > 2. I used the SynonymGraphFilter either only at query time, or at index
> > > time, or both -> I got the same number of results when querying single-word
> > > synonyms, as expected (e.g. TIGIT, domvanalimab), but querying multi-word
> > > synonyms did not find the other synonyms correctly.
> > > 3. I made all text fields into a text_field (which uses the
> > > KeywordTokenizer) instead of text_general (which uses the
> > > StandardTokenizer), in order to prevent splitting up multi-word queries. ->
> > > This still did not make multi-word synonyms work.
> > >
> > >
> > > My country-synonyms.txt file looks like this:
> > >
> > > TIGIT, domvanalimab, COM902, BMS-986207, Anti-TIGIT Antibody
> > > immuno-oncology, immunooncology
> > > Afghanistan, AF, AFG
> > > Albania, AL, ALB
> > >
> > >
> > > And the relevant query fields from my schema.xml look like this, with
> > > text_general being the fieldtype of the catchall field
> > >
> > > <fieldType name="text_field" class="solr.TextField"
> > > positionIncrementGap="100">
> > >     <analyzer type="index">
> > >        <tokenizer class="solr.KeywordTokenizerFactory" />
> > >        <filter class="solr.LowerCaseFilterFactory" />
> > >        <filter class="solr.SynonymGraphFilterFactory"
> > > synonyms="country-synonyms.txt" ignoreCase="true" expand="true"/>
> > >        <filter class="solr.FlattenGraphFilterFactory"/>
> > >     </analyzer>
> > >     <analyzer type="query">
> > >        <tokenizer class="solr.KeywordTokenizerFactory" />
> > >        <filter class="solr.LowerCaseFilterFactory" />
> > >        <filter class="solr.SynonymGraphFilterFactory"
> > > synonyms="country-synonyms.txt" ignoreCase="true" expand="true"/>
> > >     </analyzer>
> > > </fieldType>
> > > <fieldType name="text_general" class="solr.TextField"
> > > positionIncrementGap="100">
> > >     <analyzer type="index">
> > >        <tokenizer class="solr.StandardTokenizerFactory" />
> > >        <filter class="solr.LowerCaseFilterFactory" />
> > >        <filter class="solr.SnowballPorterFilterFactory"
> > > language="English" />
> > >        <filter class="solr.SynonymGraphFilterFactory"
> > > synonyms="country-synonyms.txt" ignoreCase="true" expand="true"/>
> > >        <filter class="solr.FlattenGraphFilterFactory"/>
> > >     </analyzer>
> > >     <analyzer type="query">
> > >        <tokenizer class="solr.StandardTokenizerFactory" />
> > >        <filter class="solr.LowerCaseFilterFactory" />
> > >        <filter class="solr.SnowballPorterFilterFactory"
> > > language="English" />
> > >        <filter class="solr.SynonymGraphFilterFactory"
> > > synonyms="country-synonyms.txt" ignoreCase="true" expand="true"/>
> > >     </analyzer>
> > > </fieldType>
> > >
> > >
> > > Any hints would be appreciated!
> > >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> >
>
