[ 
https://issues.apache.org/jira/browse/SOLR-17726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SOLR-17726:
----------------------------------
    Labels: morelikethis pull-request-available  (was: morelikethis)

> CloudMLTQParser fails to use copyFields due to RealTime Get
> -----------------------------------------------------------
>
>                 Key: SOLR-17726
>                 URL: https://issues.apache.org/jira/browse/SOLR-17726
>             Project: Solr
>          Issue Type: Bug
>          Components: MoreLikeThis
>    Affects Versions: 9.8.1
>            Reporter: ilariapet
>            Priority: Major
>              Labels: morelikethis, pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When using CloudMLTQParser (the default MLT parser in SolrCloud), fields that 
> are populated exclusively via copyField are not taken into account when 
> constructing the MoreLikeThis query.
> This happens because CloudMLTQParser relies on a RealTime Get (`/get`) 
> request to retrieve the source document by ID, and the document returned by 
> RealTime Get does not include fields generated via copyField (i.e., fields 
> that are not part of the original SolrInputDocument).
> As a result, even if the copyField target is stored and has termVectors 
> enabled, CloudMLTQParser silently skips the field, and the MLT query ends 
> up empty.
>  
> This behavior differs from SimpleMLTQParser (used in standalone Solr), which 
> does not rely on RealTime Get but instead extracts the stored field content 
> and re-applies the analysis chain dynamically.
>  
> *STEPS TO REPRODUCE*
> 1. Define these fields in schema.xml:
> {code:java}
> <field name="description" type="text_general" indexed="true" stored="true"/>
> <field name="descriptionMLT" type="text_general_mlt" indexed="true" stored="true" termVectors="true"/>
> <copyField source="description" dest="descriptionMLT"/>
> {code}
> 2. Index a document that sets only the {{description}} field. The 
> {{descriptionMLT}} field is expected to be populated automatically via the 
> configured copyField directive.
> 3. Run an MLT query:
> {code:java}
> /select?q={!mlt qf=descriptionMLT}doc_id
> {code}
> 4. The resulting parsed query is empty:
> {code:java}
> "parsedquery": "+() -documentId:32000"
> {code}
> If the same document is reindexed with {{descriptionMLT}} set explicitly, 
> the MLT query works.
>  
>  
>  
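For anyone checking steps 3 and 4 by script, here is a minimal sketch in Python (standard library only). It only builds the request URL and inspects a parsed-query string; it does not contact Solr. The base URL, the collection name `mycollection`, and both helper names are assumptions made for illustration. The document ID `32000` is taken from the parsed query shown in step 4, and `debugQuery=true` is added so that the response includes `parsedquery`.

```python
from urllib.parse import urlencode


def build_mlt_select_url(base_url, collection, doc_id, qf):
    """Build a /select URL that runs {!mlt qf=<qf>}<doc_id> with debug output.

    base_url and collection are placeholders for a local setup (assumed).
    """
    params = {
        "q": "{!mlt qf=%s}%s" % (qf, doc_id),
        "debugQuery": "true",  # ask Solr to return the parsed query
        "rows": "0",           # only the debug section matters here
    }
    return "%s/%s/select?%s" % (base_url.rstrip("/"), collection, urlencode(params))


def mlt_query_is_empty(parsed_query):
    """Return True when the parsed query has an empty required clause, '+()'."""
    return "+()" in parsed_query.replace(" ", "")


if __name__ == "__main__":
    # Hypothetical local instance and collection name.
    print(build_mlt_select_url("http://localhost:8983/solr", "mycollection",
                               "32000", "descriptionMLT"))
    print(mlt_query_is_empty("+() -documentId:32000"))
```

Passing the `parsedquery` value from the debug section of the response to `mlt_query_is_empty` gives a quick yes/no check on whether the bug reproduces for a given document and `qf` field.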



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
