[ https://issues.apache.org/jira/browse/SOLR-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271577#comment-15271577 ]
Shikha Somani commented on SOLR-8297:
-------------------------------------

To avoid confusion about this fix, below is a comparison of how joins worked in different versions:

||Solr 4.x||Solr 5.x||
|Secondary collection can have multiple shards|Secondary collection must have a single shard|
|A shard/replica of the secondary collection should be present on each node that hosts primary collection shards|Secondary collection should be replicated on all nodes where the primary is present|
|Join query should contain the core names of both collections|Join query should contain only the collection name, not the core name. Specifying a core name throws an exception|

Because of the differences above, Solr 5.x lost backward compatibility for join queries, making it *nearly impossible* to upgrade from Solr 4.x to Solr 5.x.

The provided solution adds backward compatibility for join queries under the following conditions:
* A single shard of both the primary and the secondary collection is present on the same node
* Both the primary and the secondary collection have the same numShards and replicationFactor

This fix provides *backward compatibility* and is *not an enhancement* for the above requirements. If required, another defect can be opened for backward compatibility support, and this fix can become part of the new defect.

> Allow join query over 2 sharded collections: enhance functionality and exception handling
> -----------------------------------------------------------------------------------------
>
>                 Key: SOLR-8297
>                 URL: https://issues.apache.org/jira/browse/SOLR-8297
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>    Affects Versions: 5.3
>            Reporter: Paul Blanchaert
>
> Enhancement based on SOLR-4905. New Jira issue raised as suggested by Mikhail Khludnev.
> A) exception handling:
> The exception "SolrCloud join: multiple shards not yet supported" thrown in
> the function findLocalReplicaForFromIndex of JoinQParserPlugin is not
> triggered correctly: in my use case, I have a join on a facet.query, and when
> my results are only found in 1 shard and the facet.query with the join is
> querying the last replica of the last slice, the exception is not thrown.
> I believe it's better to check the number of slices when we want to raise the
> "multiple shards not yet supported" exception (so the exception is thrown when
> zkController.getClusterState().getSlices(fromIndex).size()>1).
>
> B) functional enhancement:
> I would expect that there is no problem performing a cross-core join over
> sharded collections when the following conditions are met:
> 1) both collections are sharded with the same replicationFactor and numShards
> 2) router.field of the collections is set to the same "key-field" (the
> collection of "fromIndex" has router.field = "from" field and the collection
> joined to has router.field = "to" field)
> The router.field setup ensures that documents with the same "key-field" are
> routed to the same node.
> So the combination based on the "key-field" should always be available within
> the same node.
> From a user perspective, I believe these assumptions describe a "normal"
> use case for the cross-core join in SolrCloud.
> Hope this helps
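
For illustration, the slice-count check proposed in (A) and the two conditions listed in (B) could be sketched roughly as follows. This is standalone Java, not Solr code: `ShardedJoinCheck`, `CollectionInfo`, `requireSingleSlice` and `isColocatedJoin` are hypothetical names, and the `numShards` field is an assumed stand-in for the value that `zkController.getClusterState().getSlices(fromIndex).size()` would return in JoinQParserPlugin.

```java
/** Standalone sketch; none of these types are Solr classes. */
class ShardedJoinCheck {

    /** Hypothetical holder for the collection settings named in the issue. */
    static final class CollectionInfo {
        final String name;
        final int numShards;          // assumed equal to the number of slices
        final int replicationFactor;
        final String routerField;     // value of router.field for the collection

        CollectionInfo(String name, int numShards, int replicationFactor, String routerField) {
            this.name = name;
            this.numShards = numShards;
            this.replicationFactor = replicationFactor;
            this.routerField = routerField;
        }
    }

    /**
     * Part A: fail based on the slice count alone, so the outcome does not
     * depend on which replica happens to serve the request.
     */
    static void requireSingleSlice(CollectionInfo fromIndex) {
        if (fromIndex.numShards > 1) {
            throw new IllegalStateException(
                "SolrCloud join: multiple shards not yet supported " + fromIndex.name);
        }
    }

    /**
     * Part B: true when both collections have the same numShards and
     * replicationFactor, and each one routes on its own side of the join key.
     */
    static boolean isColocatedJoin(CollectionInfo from, String fromField,
                                   CollectionInfo to, String toField) {
        return from.numShards == to.numShards
                && from.replicationFactor == to.replicationFactor
                && fromField.equals(from.routerField)
                && toField.equals(to.routerField);
    }

    public static void main(String[] args) {
        // Made-up collection and field names, purely for demonstration.
        CollectionInfo from = new CollectionInfo("fromCollection", 2, 2, "from_id");
        CollectionInfo to = new CollectionInfo("toCollection", 2, 2, "to_id");
        System.out.println(isColocatedJoin(from, "from_id", to, "to_id"));
    }
}
```

The sketch only checks configuration. Whether matching documents really end up on the same node also depends on both collections hashing the key the same way, which is the assumption stated in condition 2 of the issue.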