[ https://issues.apache.org/jira/browse/SOLR-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503807#comment-13503807 ]
Yonik Seeley edited comment on SOLR-2592 at 11/26/12 2:44 PM:
--------------------------------------------------------------

Hi Michael, while I was reviewing your patch, I realized that we want something that clients (like cloud aware SolrJ) can easily get to. So instead of config in solrconfig.xml that defines a parser for the core, it seems like this should be a property of the collection?

The HashPartitioner object is currently on the ClusterState object, but it really seems like this needs to be per-collection (or per config-set?).

Currently, there is a /collections/collection1 node with
{code}
{"configName":"myconf"}
{code}
We could add a "partitioner" or "shardPartitioner" attribute.

And of course there is /clusterstate.json with
{code}
{"collection1":{
    "shard1":{
      "range":"80000000-ffffffff",
      "replicas":{"192.168.1.109:8983_solr_collection1":{
          "shard":"shard1",
          "roles":null,
          "state":"active",
          "core":"collection1",
          "collection":"collection1",
          "node_name":"192.168.1.109:8983_solr",
          "base_url":"http://192.168.1.109:8983/solr",
          "leader":"true"}}},
    "shard2":{
      "range":"0-7fffffff",
      "replicas":{}}}}
{code}

Now, currently the ClusterState object is created by reading clusterstate.json, but I don't believe it reads /collections/<collection_name>, but perhaps we should change this?

The other issue is that there is currently no java Collection object (it's just a map of the contained shards) to expose info like what the hash partitioner used is. I think I'll start by trying to refactor some of the ClusterState code to introduce that.

> Custom Hashing
> --------------
>
>                 Key: SOLR-2592
>                 URL: https://issues.apache.org/jira/browse/SOLR-2592
>             Project: Solr
>          Issue Type: New Feature
>          Components: SolrCloud
>    Affects Versions: 4.0-ALPHA
>            Reporter: Noble Paul
>         Attachments: dbq_fix.patch, pluggable_sharding.patch,
> pluggable_sharding_V2.patch, SOLR-2592.patch, SOLR-2592_r1373086.patch,
> SOLR-2592_r1384367.patch, SOLR-2592_rev_2.patch,
> SOLR_2592_solr_4_0_0_BETA_ShardPartitioner.patch
>
>
> If the data in a cloud can be partitioned on some criteria (say range, hash,
> attribute value etc) It will be easy to narrow down the search to a smaller
> subset of shards and in effect can achieve more efficient search.
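
For readers following the comment above, here is a rough sketch of what a per-collection object exposing the partitioner and the shard ranges might look like. It is not the Solr implementation: the class name CollectionSketch, its methods, and the partitioner value "hash" are all hypothetical, and it assumes the hex ranges from clusterstate.json are interpreted as inclusive signed 32-bit intervals.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical per-collection view; none of these names come from Solr itself.
class CollectionSketch {

    /** Inclusive hash range parsed from a string such as "80000000-ffffffff". */
    static final class Range {
        final int min;
        final int max;

        Range(int min, int max) {
            this.min = min;
            this.max = max;
        }

        /** Assumption: both bounds are hex and wrap into signed 32-bit ints. */
        static Range parse(String s) {
            String[] parts = s.split("-");
            return new Range(Integer.parseUnsignedInt(parts[0], 16),
                             Integer.parseUnsignedInt(parts[1], 16));
        }

        boolean includes(int hash) {
            return hash >= min && hash <= max;
        }
    }

    private final String name;
    private final String partitioner; // e.g. the proposed "partitioner" attribute
    private final Map<String, Range> shardRanges = new LinkedHashMap<>();

    CollectionSketch(String name, String partitioner) {
        this.name = name;
        this.partitioner = partitioner;
    }

    void addShard(String shardName, String hexRange) {
        shardRanges.put(shardName, Range.parse(hexRange));
    }

    String getName() {
        return name;
    }

    String getPartitioner() {
        return partitioner;
    }

    /** Returns the shard whose range contains the hash, or null if none does. */
    String shardFor(int hash) {
        for (Map.Entry<String, Range> e : shardRanges.entrySet()) {
            if (e.getValue().includes(hash)) {
                return e.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        CollectionSketch c = new CollectionSketch("collection1", "hash");
        c.addShard("shard1", "80000000-ffffffff");
        c.addShard("shard2", "0-7fffffff");
        System.out.println(c.getPartitioner() + ": " + c.shardFor(-1) + ", " + c.shardFor(42));
    }
}
```

With something along these lines, a cloud-aware client could ask the collection which partitioner applies and which shard owns a given hash, instead of reaching into a bare map of shards.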