[
https://issues.apache.org/jira/browse/SOLR-17726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17986869#comment-17986869
]
ASF subversion and git services commented on SOLR-17726:
--------------------------------------------------------
Commit c790b54104ab38fb5b297bc43ffcd223e7eaf5f9 in solr's branch
refs/heads/branch_9x from Ilaria Petreti
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=c790b54104a ]
[SOLR-17726] Fix CloudMLTQParser to support copyField in qf (#3328)
using copy field source for more like this + tests
---------
Co-authored-by: Alessandro Benedetti <[email protected]>
(cherry picked from commit d249593e5affaa9795bc7c9c6e2218e31203eee4)
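
The idea named in the commit message (using the copyField source when the requested qf field is missing from the document returned by RealTime Get) can be sketched roughly as follows. This is a toy model in plain Java, not the actual patch. The class name `CopyFieldFallbackSketch`, the method `valuesFor`, and the `copySources` map standing in for a schema lookup are all hypothetical. The sketch also assumes the fallback applies only when the qf field has no value of its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy model only; none of these names come from Solr. */
public class CopyFieldFallbackSketch {

    /**
     * Returns the text values to analyze for one qf field.
     *
     * @param qfField     field requested in the MLT qf parameter
     * @param doc         field name -> values as submitted by the client,
     *                    i.e. without any copyField targets filled in
     * @param copySources copyField dest -> source field names
     *                    (hypothetical stand-in for a schema lookup)
     */
    public static List<String> valuesFor(String qfField,
                                         Map<String, List<String>> doc,
                                         Map<String, List<String>> copySources) {
        // Start with whatever the document carries for the field itself.
        List<String> values = new ArrayList<>(doc.getOrDefault(qfField, List.of()));
        if (values.isEmpty()) {
            // Assumed simplification: fall back to the copyField sources
            // only when the field has no value of its own.
            for (String source : copySources.getOrDefault(qfField, List.of())) {
                values.addAll(doc.getOrDefault(source, List.of()));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, List<String>> doc =
                Map.of("description", List.of("quick brown fox"));
        Map<String, List<String>> copySources =
                Map.of("descriptionMLT", List.of("description"));

        // With the fallback, the copyField source supplies the text.
        System.out.println(valuesFor("descriptionMLT", doc, copySources));
        // Without any copyField mapping, nothing is found for the field.
        System.out.println(valuesFor("descriptionMLT", doc, Map.of()));
    }
}
```

Run with the mapping, the sketch finds the `description` text for `descriptionMLT`. Run without it, the value list is empty, which mirrors the empty parsed query described in the issue below.
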
> CloudMLTQParser fails to use copyFields due to RealTime Get
> -----------------------------------------------------------
>
> Key: SOLR-17726
> URL: https://issues.apache.org/jira/browse/SOLR-17726
> Project: Solr
> Issue Type: Bug
> Components: MoreLikeThis
> Affects Versions: 9.8.1
> Reporter: ilariapet
> Priority: Major
> Labels: morelikethis, pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
>
> When using CloudMLTQParser (the default MLT parser in SolrCloud), fields that
> are populated exclusively via copyField are not taken into account when
> constructing the MoreLikeThis query.
> This happens because CloudMLTQParser relies on a RealTime Get (`/get`)
> request to retrieve the source document by ID, and the document returned by
> RealTime Get does not include fields generated via copyField (i.e. fields
> that are not part of the original SolrInputDocument).
> As a result, even if the copyField target is stored and has proper
> termVectors configured, CloudMLTQParser skips the field silently, and the MLT
> query ends up empty.
>
> This behavior differs from SimpleMLTQParser (used in standalone Solr), which
> does not rely on RealTime Get but instead extracts the stored field content
> and re-applies the analysis chain dynamically.
>
> *STEPS TO REPRODUCE*
> 1. Define these fields in schema.xml:
> {code:xml}
> <field name="description" type="text_general" indexed="true" stored="true"/>
> <field name="descriptionMLT" type="text_general_mlt" indexed="true"
> stored="true" termVectors="true"/>
> <copyField source="description" dest="descriptionMLT"/> {code}
> 2. Index a document that sets only the {{description}} field. The
> {{descriptionMLT}} field is expected to be populated automatically via the
> configured copyField directive.
> 3. Run an MLT query:
> {code:java}
> /select?q={!mlt qf=descriptionMLT}doc_id {code}
> 4. The resulting parsed query will be empty:
> {code:java}
> "parsedquery": "+() -documentId:32000"{code}
> If the same document is reindexed with {{descriptionMLT}} set explicitly,
> the MLT query works.
>