[ 
https://issues.apache.org/jira/browse/SOLR-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456280#comment-13456280
 ] 

Michael Garski commented on SOLR-2592:
--------------------------------------

Dan, perhaps you could just reverse the portions of the composite id and use a 
filter query to restrict your query to a subset of the data in the shard. Using 
your example, you would have unique ids doc=Sep2012_1234 and doc=Oct2012_4567, 
with each document also containing a field holding the value Sep2012 or Oct2012 
respectively. That way, if you started with a small number of shards and both 
documents wound up on the same shard, the search would be restricted to just 
that shard, and the filter would then narrow the results to only the docs you 
want. With the patch I submitted, to query only the documents from Sep2012 you 
would add the parameter shard.keys=Sep2012 along with the appropriate filter 
query. I take that exact approach with user-specific data: it ensures all docs 
for a specific user reside in the same shard, the query is executed against 
only that shard, and it returns docs for only that user.
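
As a rough illustration of the request described above, here is a minimal 
Python sketch that assembles the query parameters. The shard.keys parameter 
comes from the patch discussed in this issue; the field name "month" and the 
helper names are hypothetical and chosen only for the example.

```python
from urllib.parse import urlencode, parse_qs


def month_query_params(user_query, month_key, month_field="month"):
    """Build request parameters that route to one shard and filter within it.

    `month_field` is a hypothetical field name; use whatever field holds the
    Sep2012 / Oct2012 value in your schema.
    """
    return {
        "q": user_query,
        # Restricts the search to the shard that the key hashes to
        # (parameter name as described for the patch in this issue).
        "shard.keys": month_key,
        # Several keys can share a shard, so the filter query is still needed
        # to drop docs from other months that live on the same shard.
        "fq": f"{month_field}:{month_key}",
    }


def to_query_string(params):
    """URL-encode the parameters for use in a select request."""
    return urlencode(params)


if __name__ == "__main__":
    print(to_query_string(month_query_params("*:*", "Sep2012")))
```

The filter query is kept separate from q so that routing (shard.keys) and 
result narrowing (fq) stay independent of whatever the user searches for.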

The downside of that approach for your use case is that, with a potentially 
low number of unique values being hashed, you could wind up with an uneven 
distribution of data across the shards. 
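
To make the skew concrete, here is a small Python sketch. It assumes a simple 
"hash modulo number of shards" placement with CRC32 as a stand-in hash; that 
is not necessarily how the patch or SolrCloud assigns documents, and the 
function names are hypothetical. The point is only that a handful of distinct 
keys cannot spread evenly over a larger number of shards.

```python
import zlib


def shard_for(key, num_shards):
    """Map a shard key to a shard index.

    CRC32 modulo the shard count is an assumption made for illustration,
    not the hashing scheme used by the patch.
    """
    return zlib.crc32(key.encode("utf-8")) % num_shards


def docs_per_shard(doc_counts_by_key, num_shards):
    """Total the documents that land on each shard, including empty shards."""
    totals = {shard: 0 for shard in range(num_shards)}
    for key, count in doc_counts_by_key.items():
        totals[shard_for(key, num_shards)] += count
    return totals


if __name__ == "__main__":
    # Three monthly keys spread over four shards: at least one shard must
    # stay empty, and two keys may collide on the same shard.
    monthly_docs = {"Sep2012": 1000, "Oct2012": 1000, "Nov2012": 1000}
    for shard, total in docs_per_shard(monthly_docs, 4).items():
        print(f"shard {shard}: {total} docs")
```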

It sounds like what you really want is a date-based distribution policy that 
will add new shards to the collection each month. Does that sound about right?
                
> Pluggable shard lookup mechanism for SolrCloud
> ----------------------------------------------
>
>                 Key: SOLR-2592
>                 URL: https://issues.apache.org/jira/browse/SOLR-2592
>             Project: Solr
>          Issue Type: New Feature
>          Components: SolrCloud
>    Affects Versions: 4.0-ALPHA
>            Reporter: Noble Paul
>            Assignee: Mark Miller
>         Attachments: dbq_fix.patch, pluggable_sharding.patch, 
> pluggable_sharding_V2.patch, SOLR-2592.patch, SOLR-2592_r1373086.patch, 
> SOLR-2592_r1384367.patch, SOLR-2592_rev_2.patch, 
> SOLR_2592_solr_4_0_0_BETA_ShardPartitioner.patch
>
>
> If the data in a cloud can be partitioned on some criteria (say range, hash, 
> attribute value, etc.), it will be easy to narrow down the search to a smaller 
> subset of shards and, in effect, achieve more efficient search.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
