[ https://issues.apache.org/jira/browse/SOLR-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456280#comment-13456280 ]
Michael Garski commented on SOLR-2592:
--------------------------------------
Dan, perhaps you could reverse the portions of the composite id and use a
filter query to restrict your query to a subset of the data in the shard. Using
your example, you would have unique ids doc=Sep2012_1234 and doc=Oct2012_4567,
with each document also containing a field with the value Sep2012 or Oct2012,
respectively. That way, if you start with a small number of shards and both
documents wind up on the same shard, the search is restricted to just the
shard where they reside, and the filter ensures only the docs you want are
included. In the patch I submitted, if you only wanted to query the documents
from Sep2012, you would add the parameter shard.keys=Sep2012 and the
appropriate filter query. I take that exact approach with user-specific data
to ensure that all docs for a specific user reside in the same shard, that the
query is executed against only that shard, and that it returns docs for only
that user.
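As a rough illustration of the idea above (not the patch's actual code), the
sketch below routes a document by hashing only the leading portion of its
composite id and builds the matching query parameters. The CRC32 hash, the
shard count of 4, the helper names, and the field name "month" are all
assumptions made for the example; only the shard.keys parameter comes from
the patch as described here.

```python
import zlib

NUM_SHARDS = 4  # assumption: a small, fixed number of shards


def shard_key(doc_id: str) -> str:
    """Return the part of the composite id before the first underscore."""
    return doc_id.split("_", 1)[0]


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key to a shard index.

    CRC32 is a stand-in hash chosen for the example; the hash used by the
    patch may well differ.
    """
    return zlib.crc32(key.encode("utf-8")) % num_shards


def query_params(key: str, user_query: str = "*:*") -> dict:
    """Build request parameters that target one shard key.

    "month" is a hypothetical field holding the same value as the key, so
    the filter query drops other keys that hash to the same shard.
    """
    return {"q": user_query, "shard.keys": key, "fq": f"month:{key}"}


if __name__ == "__main__":
    for doc_id in ("Sep2012_1234", "Oct2012_4567"):
        key = shard_key(doc_id)
        print(doc_id, "-> shard", shard_for(key), query_params(key))
```

Because only the key portion is hashed, every id sharing that key lands on
the same shard, and the filter query is what separates keys that happen to
share a shard.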
The downside of that approach for your use case is that, with a potentially
low number of unique values being hashed, you could wind up with an uneven
distribution of data across the shards.
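A small sketch of why that happens, assuming the same stand-in CRC32 hash on
the key portion of the id (the hash, the shard count of 8, and the synthetic
ids are assumptions for illustration only): with three distinct monthly keys,
at most three shards can ever receive documents, no matter how many shards
exist.

```python
import zlib
from collections import Counter


def docs_per_shard(doc_ids, num_shards):
    """Count documents per shard when only the id's key portion is hashed."""
    counts = Counter()
    for doc_id in doc_ids:
        key = doc_id.split("_", 1)[0]
        # Stand-in hash; the real routing hash may differ.
        counts[zlib.crc32(key.encode("utf-8")) % num_shards] += 1
    return counts


# Synthetic data: 1000 documents for each of three monthly keys.
doc_ids = [
    f"{month}_{n}"
    for month in ("Sep2012", "Oct2012", "Nov2012")
    for n in range(1000)
]
counts = docs_per_shard(doc_ids, num_shards=8)

if __name__ == "__main__":
    # At least five of the eight shards stay empty.
    print(sorted(counts.items()))
```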
It sounds like what you really want is a date-based distribution policy that
will add new shards to the collection each month. Does that sound about right?
> Pluggable shard lookup mechanism for SolrCloud
> ----------------------------------------------
>
> Key: SOLR-2592
> URL: https://issues.apache.org/jira/browse/SOLR-2592
> Project: Solr
> Issue Type: New Feature
> Components: SolrCloud
> Affects Versions: 4.0-ALPHA
> Reporter: Noble Paul
> Assignee: Mark Miller
> Attachments: dbq_fix.patch, pluggable_sharding.patch,
> pluggable_sharding_V2.patch, SOLR-2592.patch, SOLR-2592_r1373086.patch,
> SOLR-2592_r1384367.patch, SOLR-2592_rev_2.patch,
> SOLR_2592_solr_4_0_0_BETA_ShardPartitioner.patch
>
>
> If the data in a cloud can be partitioned on some criteria (say range, hash,
> attribute value, etc.), it will be easy to narrow down the search to a
> smaller subset of shards and, in effect, achieve more efficient search.