Folks: A question regarding SolrCloud shard numbers (e.g. shard<x>) and their associated hash ranges. We are trying to identify the best strategy for merging shards that belong to chronologically older collections, which see a very low volume of searches compared to the collections holding the most recent data.
What we ran into is that shard numbers and hash ranges often don't correlate:

shard1: 80000000-aaa9ffff
shard2: aaaa0000-d554ffff
shard3: d5550000-ffffffff (holds the last range)
shard4: 0-2aa9ffff (holds the starting range)
shard5: 2aaa0000-5554ffff
shard6: 55550000-7fffffff

The same goes for core_node<x>: it does not follow any order, nor does it correlate with shard<x>. That is, core_node<1> does not contain the keys starting from 0, nor does it map to shard<1>.

{"shard1"=> {"range"=>"80000000-aaa9ffff", {"core_node5"=> "core"=>"post_NW_201508_shard1_replica1",
"shard2"=> {"range"=>"aaaa0000-d554ffff", {"core_node6"=> "core"=>"post_NW_201508_shard2_replica1",
"shard3"=> {"range"=>"d5550000-ffffffff", {"core_node2"=> "core"=>"post_NW_201508_shard3_replica1",
"shard4"=> {"range"=>"0-2aa9ffff", {"core_node3"=> "core"=>"post_NW_201508_shard4_replica1",
"shard5"=> {"range"=>"2aaa0000-5554ffff", {"core_node4"=> "core"=>"post_NW_201508_shard5_replica1",
"shard6"=> {"range"=>"55550000-7fffffff", {"core_node1"=> "core"=>"post_NW_201508_shard6_replica1"

Why would this be a concern?

1. Suppose we merge the indexes of adjacent shards (to reduce the number of shards in the collection). In this case we would be merging "core_node3: 0-2aa9ffff" and "core_node4: 2aaa0000-5554ffff". What would the index of the new core_node directory be? core_node<?>
2. When we copy this data over to the cluster after recreating the collection with a reduced number of shards, how would the cluster infer the hash range from the index data, and how does it reconcile that with the shard metadata on the local filesystem of the cluster nodes?
3. How should we approach this problem to guarantee that Solr picks up the right key order from the merged indexes?

*Solr 4.4*
*HDFS for Index Storage*
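
To make the adjacency question in point 1 concrete, here is a minimal Python sketch that sorts the six ranges listed above and reports which shards are neighbours, along with the range a merged shard would have to cover. It rests on one assumption that is not stated anywhere in this post: that Solr compares the range bounds as signed 32-bit integers, so 80000000 is the lowest value and 7fffffff the highest. The helper names (to_signed, parse_range, format_range, adjacent_pairs) are hypothetical and not part of any Solr API; only the range strings come from our cluster state.

```python
# Ranges copied from the shard listing in this post.
RANGES = {
    "shard1": "80000000-aaa9ffff",
    "shard2": "aaaa0000-d554ffff",
    "shard3": "d5550000-ffffffff",
    "shard4": "0-2aa9ffff",
    "shard5": "2aaa0000-5554ffff",
    "shard6": "55550000-7fffffff",
}


def to_signed(hex_str):
    """Interpret a hex string as a signed 32-bit integer (assumption)."""
    value = int(hex_str, 16)
    return value - (1 << 32) if value >= (1 << 31) else value


def parse_range(text):
    """Split 'lo-hi' into a (lo, hi) pair of signed integers."""
    lo, hi = text.split("-")
    return to_signed(lo), to_signed(hi)


def format_range(lo, hi):
    """Render a signed (lo, hi) pair back into the 'lo-hi' hex form."""
    return f"{lo & 0xffffffff:x}-{hi & 0xffffffff:x}"


def adjacent_pairs(ranges):
    """Return (shard_a, shard_b, merged_range) for each contiguous pair."""
    ordered = sorted(
        ((name, parse_range(text)) for name, text in ranges.items()),
        key=lambda item: item[1][0],  # sort by signed lower bound
    )
    pairs = []
    for (a, (a_lo, a_hi)), (b, (b_lo, b_hi)) in zip(ordered, ordered[1:]):
        if a_hi + 1 == b_lo:  # b starts exactly where a ends
            pairs.append((a, b, format_range(a_lo, b_hi)))
    return pairs


if __name__ == "__main__":
    for a, b, merged in adjacent_pairs(RANGES):
        print(f"{a} + {b} -> {merged}")
```

If the signed-integer assumption holds, the candidate pairs for a merge can be read straight off this output. It says nothing about how the core_node<x> names are assigned or how the cluster picks up the range after the copy, which are still the open questions above.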