[ 
https://issues.apache.org/jira/browse/SOLR-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456587#comment-13456587
 ] 

Yonik Seeley commented on SOLR-2592:
------------------------------------

This issue should be about "custom hashing".  "Custom sharding" is different, 
and covers many other things such as time-based sharding.
I haven't had a chance to look at the patches here, but it seems like simpler 
may be better.
I like the approach of just hashing on the id field.

For example, if you have an email search application and want to co-locate all 
of a user's data on the same shard, then simply use ids of the form
userid:emailid  (or whatever separator we choose - we should choose one by 
default that is less likely to accidentally clash with normal ids, and also one 
that works well in URLs and hopefully in query parser syntax w/o the need for 
escaping).

And then when you hash, you simply use the upper bits of the userid and the 
lower bits of the emailid to construct the hash that selects the node 
placement.  The only real configurable part you need is where the split is 
(i.e. how many bits for each side).
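
A minimal sketch of the idea, not taken from any of the attached patches. It
assumes ':' as the separator, CRC32 as a stand-in 32-bit hash, and that each
part of the id is hashed separately before the bits are combined; the class
name CompositeIdHash, its methods, and the prefixBits parameter are all
hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class CompositeIdHash {
    // Assumed separator; the comment above leaves the actual choice open.
    static final char SEPARATOR = ':';

    // Stand-in 32-bit hash (CRC32). Any well-distributed 32-bit hash would do.
    static int hash32(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return (int) crc.getValue();
    }

    /**
     * Builds a 32-bit hash whose top prefixBits bits come from the part of
     * the id before the separator and whose remaining low bits come from the
     * part after it. Ids without a separator are hashed as a whole.
     */
    static int compositeHash(String id, int prefixBits) {
        if (prefixBits < 1 || prefixBits > 31) {
            throw new IllegalArgumentException("prefixBits must be in 1..31");
        }
        int sep = id.indexOf(SEPARATOR);
        if (sep < 0) {
            return hash32(id);
        }
        int upperMask = -1 << (32 - prefixBits); // top prefixBits bits set
        int prefixHash = hash32(id.substring(0, sep));
        int suffixHash = hash32(id.substring(sep + 1));
        return (prefixHash & upperMask) | (suffixHash & ~upperMask);
    }
}
```

With a 16/16 split, compositeHash("alice:msg1", 16) and
compositeHash("alice:msg2", 16) share their upper 16 bits, so if shards own
contiguous hash ranges, both ids fall within one narrow slice of the hash
space, while the lower bits still spread a single user's documents within
that slice.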

                
> Pluggable shard lookup mechanism for SolrCloud
> ----------------------------------------------
>
>                 Key: SOLR-2592
>                 URL: https://issues.apache.org/jira/browse/SOLR-2592
>             Project: Solr
>          Issue Type: New Feature
>          Components: SolrCloud
>    Affects Versions: 4.0-ALPHA
>            Reporter: Noble Paul
>            Assignee: Mark Miller
>         Attachments: dbq_fix.patch, pluggable_sharding.patch, 
> pluggable_sharding_V2.patch, SOLR-2592.patch, SOLR-2592_r1373086.patch, 
> SOLR-2592_r1384367.patch, SOLR-2592_rev_2.patch, 
> SOLR_2592_solr_4_0_0_BETA_ShardPartitioner.patch
>
>
> If the data in a cloud can be partitioned on some criteria (say range, hash, 
> attribute value, etc.), it will be easy to narrow down the search to a smaller 
> subset of shards and in effect achieve more efficient search.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
