Re: Shard splitting and replica placement strategy
Opened https://issues.apache.org/jira/browse/SOLR-8728 with a test which reproduces the exception.

On Wed, Feb 24, 2016 at 3:49 PM Shai Erera wrote:
> Thanks Noble, I'll try to reproduce in a test then. Does the rule I've set sound right to you though?
Re: Shard splitting and replica placement strategy
Thanks Noble, I'll try to reproduce in a test then. Does the rule I've set sound right to you though?

On Wed, Feb 24, 2016, 15:19 Noble Paul wrote:
> Whatever it is, there should be no NPE. Could be a bug.
Re: Shard splitting and replica placement strategy
Whatever it is, there should be no NPE. Could be a bug.

On Wed, Feb 24, 2016 at 6:23 PM, Shai Erera wrote:
> Hi,
>
> I wanted to try out the (relatively) new replica placement strategy and how it plays with shard splitting. So I set up a 4-node cluster and created a collection with 1 shard and 2 replicas (each created on a different node).
>
> When I issue a SPLITSHARD command (without any rules set on the collection), the split finishes successfully and the state of the cluster is:
>
> n1: s1_r1 (INACTIVE), s1_0_r1, s1_1_r1
> n2: s1_r2 (INACTIVE), s1_0_r2
> n3: s1_1_r2
> n4: empty
>
> So far this is as expected: since the shard split occurred on n1, the two sub-shards were created there, and Solr then filled the missing replicas on nodes 2 and 3. The source shard s1 was set to INACTIVE, and I did not delete it (in the test).
>
> Then I tried the same thing, curious whether, with the right rule, one of the sub-shards' replicas would move to the 4th node so that I end up with a "balanced" cluster. So I created the collection with the rule "shard:**,replica:<2,node:*", which per the ref guide should leave me with no more than one replica per shard on any node. Per my understanding, I should end up with either 2 nodes each holding one replica of each shard, 3 nodes holding a mixture of replicas, or 4 nodes each holding exactly one replica.
>
> However, while observing the cluster status I noticed that the two created sub-shards are marked as ACTIVE and leader, while the two others are marked DOWN. Turning on INFO logging I found this:
>
> Caused by: java.lang.NullPointerException
>   at org.apache.solr.cloud.rule.Rule.getNumberOfNodesWithSameTagVal(Rule.java:168)
>   at org.apache.solr.cloud.rule.Rule.tryAssignNodeToShard(Rule.java:130)
>   at org.apache.solr.cloud.rule.ReplicaAssigner.tryAPermutationOfRules(ReplicaAssigner.java:252)
>   at org.apache.solr.cloud.rule.ReplicaAssigner.tryAllPermutations(ReplicaAssigner.java:203)
>   at org.apache.solr.cloud.rule.ReplicaAssigner.getNodeMappings0(ReplicaAssigner.java:174)
>   at org.apache.solr.cloud.rule.ReplicaAssigner.getNodeMappings(ReplicaAssigner.java:135)
>   at org.apache.solr.cloud.Assign.getNodesViaRules(Assign.java:211)
>   at org.apache.solr.cloud.Assign.getNodesForNewReplicas(Assign.java:179)
>   at org.apache.solr.cloud.OverseerCollectionMessageHandler.addReplica(OverseerCollectionMessageHandler.java:2204)
>   at org.apache.solr.cloud.OverseerCollectionMessageHandler.splitShard(OverseerCollectionMessageHandler.java:1212)
>
> I also tried the rule "replica:<2,node:*", which yielded the same NPE. I run 5.4.1 and couldn't find whether this is something that was already fixed in 5.5.0/master. So the question is: is this a bug, or did I misconfigure the rule?
>
> And as a side question, is there any rule I can configure so that the split shards are distributed evenly in the cluster? Or will SPLITSHARD currently always result in the created shards existing on the origin node, leaving it as my responsibility to move them elsewhere?
>
> Shai

--
Noble Paul
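For readers who want to reproduce the scenario above, here is a minimal sketch of the Collections API calls involved, assuming a SolrCloud node reachable at localhost:8983, a collection named "ruletest", and a configset named "conf1" (these names are illustrative, not taken from the thread; the '<' in the rule must be URL-encoded as %3C when sent over HTTP):

  # Create a 1-shard, 2-replica collection with the placement rule from the thread
  curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=ruletest&numShards=1&replicationFactor=2&collection.configName=conf1&rule=shard:**,replica:%3C2,node:*'

  # Split the single shard; the sub-shard replicas are then placed subject to the same rule
  curl 'http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=ruletest&shard=shard1'

With no rule set on the collection, this sequence gives the unrestricted placement described in the original message (both sub-shards created on the leader's node); with the rule set, the reporter observed the NPE in ReplicaAssigner during the addReplica step of SPLITSHARD.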