[ 
https://issues.apache.org/jira/browse/SOLR-15977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17537852#comment-17537852
 ] 

Edward Ribeiro commented on SOLR-15977:
---------------------------------------

There are two analyzers defined for the Portuguese language in your schema.

The analyzer below (in the *text_br* fieldType) applies the stemming filter 
*before* the synonym filter. By the time the synonym filter runs, the tokens 
are already stems, so they will rarely match the unstemmed entries in the 
synonyms file.

Switch the positions of *BrazilianStemFilterFactory* and 
*SynonymGraphFilterFactory* in the analyzer. Also, it uses a file 
(*synonyms.txt*) that is used by many other field types in the schema. Is 
this right? Does it contain Portuguese words? And if the entries in the file 
are all lower case, wouldn't it make sense to apply a lower case filter first?

 
{code:java}
<analyzer>
   <tokenizer class="solr.StandardTokenizerFactory"/>
   <filter class="solr.BrazilianStemFilterFactory"/>
   <filter class="solr.SynonymGraphFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
</analyzer>{code}
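
As a rough sketch only (not tested against your schema), the reordered 
*text_br* analyzer could look like the one below. It assumes the entries in 
*synonyms.txt* are Portuguese and lower case; the added 
*LowerCaseFilterFactory* is a suggestion and is not part of your current chain.
{code:xml}
<analyzer>
   <tokenizer class="solr.StandardTokenizerFactory"/>
   <!-- Assumption: lower-case the tokens first so they match lower-case synonym entries -->
   <filter class="solr.LowerCaseFilterFactory"/>
   <!-- Synonyms are now matched against the unstemmed tokens -->
   <filter class="solr.SynonymGraphFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
   <!-- Stemming runs last, on both the original tokens and the injected synonyms -->
   <filter class="solr.BrazilianStemFilterFactory"/>
</analyzer>{code}
Because this analyzer has no *type* attribute, it applies at both index and 
query time, so you would need to reload the core and reindex after changing it.
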
The other analyzer (in the *text_pt* fieldType) uses a different synonyms file 
(*synonyms_pt.txt*) and also applies a couple of transformations, like lower 
casing the text and removing stop words. You should make sure the synonyms in 
the file are lower case here. OTOH, this one applies the stemming filter 
*after* the synonym filter, like most analyzers in the schema do.

{code:java}
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.StopFilterFactory" format="snowball" words="lang/stopwords_pt.txt" ignoreCase="true"/>
<filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms_pt.txt"/>
<filter class="solr.SnowballPorterFilterFactory" language="Portuguese"/>
</analyzer>{code}
 

> Solr - portuguese synonyms are not working
> ------------------------------------------
>
>                 Key: SOLR-15977
>                 URL: https://issues.apache.org/jira/browse/SOLR-15977
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 8.11.1
>            Reporter: Nagesh
>            Priority: Blocker
>         Attachments: managed-schema
>
>
> Hi All,
> I have problem with portuguese synonyms. The queries with english synonyms 
> work fine. I have problem only with portuguese synonyms, they are not 
> working, solr ignores them. I cannot manage the problem myself. I have 
> attached schema file. Please check.
> Thanks



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org
