[ https://issues.apache.org/jira/browse/SOLR-11391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16191976#comment-16191976 ]
Varun Thacker commented on SOLR-11391:
--------------------------------------
Here is another data point, from a test set used to tune NUM_DOCS_THRESHOLD.
I indexed 33M documents in a single-shard collection.
{{doc_type_s:X AND year:2007}} matches 240397 documents.
The following query matches 673000 documents:
{code}
{!join to=join_key from=join_key cache=false}(doc_type_s:X AND year:2007)
{code}
With method=enum the query executes in 8 seconds; with method=dv it executes in 2 seconds.
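For anyone who wants to repeat the comparison, here is a minimal sketch of how the two request URLs could be built. It assumes that method is passed as a local param of the join parser, as in the patch under discussion (it may not exist in a stock Solr build); the host, the collection name test_collection, and the helper names are hypothetical.

```python
from urllib.parse import urlencode


def build_join_query(inner_query, join_field="join_key", method=None, cache=False):
    """Build a {!join} query string shaped like the one in the comment above."""
    local_params = [
        f"to={join_field}",
        f"from={join_field}",
        f"cache={'true' if cache else 'false'}",
    ]
    if method is not None:
        # Assumption: 'method' (enum / dv) is a local param added by the patch.
        local_params.append(f"method={method}")
    return "{!join " + " ".join(local_params) + "}(" + inner_query + ")"


def build_select_url(base_url, query, rows=0):
    """Return a /select URL for the query; rows=0 because only timing matters."""
    return f"{base_url}/select?" + urlencode({"q": query, "rows": rows})


if __name__ == "__main__":
    inner = "doc_type_s:X AND year:2007"
    # Hypothetical host and collection name.
    base = "http://localhost:8983/solr/test_collection"
    for method in ("enum", "dv"):
        print(build_select_url(base, build_join_query(inner, method=method)))
```

Each printed URL can be requested with any HTTP client and the QTime values in the responses compared; cache=false keeps the filter cache from hiding the difference between the two methods.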
> JoinQParser for non point fields should use the GraphTermsCollector
> --------------------------------------------------------------------
>
> Key: SOLR-11391
> URL: https://issues.apache.org/jira/browse/SOLR-11391
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Varun Thacker
> Attachments: SOLR-11391.patch, SOLR-11391.patch, SOLR-11391.patch,
> SOLR-11391.patch, SOLR-11391.patch, SOLR-11391.patch, SOLR-11391.patch,
> SOLR-11391.patch, SOLR-11391.patch
>
>
> The Join Query Parser uses the GraphPointsCollector for point fields.
> For non-point fields, if we use the GraphTermsCollector instead of the current
> algorithm, I am seeing quite a bit of performance gain.
> I'm going to attach a quick patch which I cooked up, making sure TestJoin
> and TestCloudJSONFacetJoinDomain passed.
> More tests, benchmarking and code cleanup to follow.