[
https://issues.apache.org/jira/browse/SOLR-12866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16666238#comment-16666238
]
Varun Thacker commented on SOLR-12866:
--------------------------------------
This is the snippet of RestoreCmd that does the replica placement:
{code:java}
Assign.AssignRequest assignRequest = new Assign.AssignRequestBuilder()
    .forCollection(restoreCollectionName)
    .forShard(sliceNames)
    .assignNrtReplicas(numNrtReplicas)
    .assignTlogReplicas(numTlogReplicas)
    .assignPullReplicas(numPullReplicas)
    .onNodes(nodeList)
    .build();
Assign.AssignStrategyFactory assignStrategyFactory =
    new Assign.AssignStrategyFactory(ocmh.cloudManager);
Assign.AssignStrategy assignStrategy =
    assignStrategyFactory.create(clusterState, restoreCollection);
List<ReplicaPosition> replicaPositions =
    assignStrategy.assign(ocmh.cloudManager, assignRequest);
sessionWrapper = PolicyHelper.getLastSessionWrapper(true);{code}
We have two nodes and three shards to create, but this ends up assigning all
three replicas to one node, so the test fails.
I'm trying to isolate this in a mock and test further. For reference, here is
what I saw when I stepped through this code in debug mode:
{code:java}
result = {Assign$AssignRequest@6811}
    collectionName = "backuprestore_restored"
    shardNames = {ArrayList@6741} size = 3
    nodes = {ArrayList@6719} size = 2
    numNrtReplicas = 1
    numTlogReplicas = 0
    numPullReplicas = 0
replicaPositions = {ArrayList@6785} size = 3
    0 = {ReplicaPosition@6793} "shard2:1[NRT] @127.0.0.1:61176_solr"
    1 = {ReplicaPosition@6794} "shard1_1:1[NRT] @127.0.0.1:61176_solr"
    2 = {ReplicaPosition@6795} "shard1_0:1[NRT] @127.0.0.1:61176_solr"{code}
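To make the mismatch concrete, here is a minimal standalone sketch (not Solr code) that counts replicas per node from the position strings above and compares them with an even-spread limit. The class PlacementCheck and both helper methods are hypothetical, and the limit of ceil(totalReplicas / numNodes) is my assumption about what the test expects, inferred from the "Expected num replicas : 2" message in the failure quoted below.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Solr.
public class PlacementCheck {

    // Assumed limit: replicas per node under an even spread, ceil(total / nodes).
    public static int maxReplicasPerNode(int numShards, int replicasPerShard, int numNodes) {
        int total = numShards * replicasPerShard;
        return (total + numNodes - 1) / numNodes;
    }

    // Counts replicas per node from strings shaped like
    // "shard2:1[NRT] @127.0.0.1:61176_solr" (the node name follows '@').
    public static Map<String, Integer> countByNode(List<String> positions) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String position : positions) {
            String node = position.substring(position.indexOf('@') + 1).trim();
            counts.merge(node, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // The three positions seen in the debugger.
        List<String> observed = Arrays.asList(
            "shard2:1[NRT] @127.0.0.1:61176_solr",
            "shard1_1:1[NRT] @127.0.0.1:61176_solr",
            "shard1_0:1[NRT] @127.0.0.1:61176_solr");

        // 3 shards, 1 NRT replica each, 2 nodes.
        int limit = maxReplicasPerNode(3, 1, 2);

        countByNode(observed).forEach((node, count) ->
            System.out.println(node + " has " + count + " replicas, limit " + limit
                + (count > limit ? " (over)" : " (ok)")));
    }
}
```

With the values above, the single node that received all three replicas exceeds the assumed limit of 2, which is the same condition the test assertion reports.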
> Reproducing TestLocalFSCloudBackupRestore and TestHdfsCloudBackupRestore
> failures
> ---------------------------------------------------------------------------------
>
> Key: SOLR-12866
> URL: https://issues.apache.org/jira/browse/SOLR-12866
> Project: Solr
> Issue Type: Task
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Steve Rowe
> Assignee: Varun Thacker
> Priority: Major
>
> From [https://builds.apache.org/job/Lucene-Solr-BadApples-Tests-7.x/185/],
> both tests failed 10/10 iterations for me on branch_7x with the seed:
> {noformat}
> Checking out Revision 37fdcb02d87ec44293ec4942c75a3cb709c45418
> (refs/remotes/origin/branch_7x)
> [...]
> [junit4] 2> NOTE: reproduce with: ant test
> -Dtestcase=TestLocalFSCloudBackupRestore -Dtests.method=test
> -Dtests.seed=3CD4284489C09DB4 -Dtests.multiplier=2 -Dtests.slow=true
> -Dtests.badapples=true -Dtests.locale=mk-MK
> -Dtests.timezone=Pacific/Kiritimati -Dtests.asserts=true
> -Dtests.file.encoding=US-ASCII
> [junit4] FAILURE 10.8s J2 | TestLocalFSCloudBackupRestore.test <<<
> [junit4] > Throwable #1: java.lang.AssertionError: Node
> 127.0.0.1:43864_solr has 3 replicas. Expected num replicas : 2. state:
> [junit4] >
> DocCollection(backuprestore_restored//collections/backuprestore_restored/state.json/9)={
> [junit4] > "pullReplicas":0,
> [junit4] > "replicationFactor":1,
> [junit4] > "shards":{
> [junit4] > "shard2":{
> [junit4] > "range":"0-7fffffff",
> [junit4] > "state":"active",
> [junit4] > "replicas":{"core_node62":{
> [junit4] > "core":"backuprestore_restored_shard2_replica_n61",
> [junit4] > "base_url":"https://127.0.0.1:43864/solr",
> [junit4] > "node_name":"127.0.0.1:43864_solr",
> [junit4] > "state":"active",
> [junit4] > "type":"NRT",
> [junit4] > "force_set_state":"false",
> [junit4] > "leader":"true"}},
> [junit4] > "stateTimestamp":"1539459703266853250"},
> [junit4] > "shard1_1":{
> [junit4] > "range":"c0000000-ffffffff",
> [junit4] > "state":"active",
> [junit4] > "replicas":{"core_node64":{
> [junit4] >
> "core":"backuprestore_restored_shard1_1_replica_n63",
> [junit4] > "base_url":"https://127.0.0.1:43864/solr",
> [junit4] > "node_name":"127.0.0.1:43864_solr",
> [junit4] > "state":"active",
> [junit4] > "type":"NRT",
> [junit4] > "force_set_state":"false",
> [junit4] > "leader":"true"}},
> [junit4] > "stateTimestamp":"1539459703266887720"},
> [junit4] > "shard1_0":{
> [junit4] > "range":"80000000-bfffffff",
> [junit4] > "state":"active",
> [junit4] > "replicas":{"core_node66":{
> [junit4] >
> "core":"backuprestore_restored_shard1_0_replica_n65",
> [junit4] > "base_url":"https://127.0.0.1:43864/solr",
> [junit4] > "node_name":"127.0.0.1:43864_solr",
> [junit4] > "state":"active",
> [junit4] > "type":"NRT",
> [junit4] > "force_set_state":"false",
> [junit4] > "leader":"true"}},
> [junit4] > "stateTimestamp":"1539459703266910800"}},
> [junit4] > "router":{
> [junit4] > "name":"compositeId",
> [junit4] > "field":"shard_s"},
> [junit4] > "maxShardsPerNode":"-1",
> [junit4] > "autoAddReplicas":"false",
> [junit4] > "nrtReplicas":1,
> [junit4] > "tlogReplicas":0}
> [junit4] > at
> __randomizedtesting.SeedInfo.seed([3CD4284489C09DB4:B480179E273CF04C]:0)
> [junit4] > at
> org.apache.solr.cloud.api.collections.AbstractCloudBackupRestoreTestCase.lambda$testBackupAndRestore$1(AbstractCloudBackupRestoreTestCase.java:339)
> [junit4] > at java.util.HashMap.forEach(HashMap.java:1289)
> [junit4] > at
> org.apache.solr.cloud.api.collections.AbstractCloudBackupRestoreTestCase.testBackupAndRestore(AbstractCloudBackupRestoreTestCase.java:338)
> [junit4] > at
> org.apache.solr.cloud.api.collections.AbstractCloudBackupRestoreTestCase.test(AbstractCloudBackupRestoreTestCase.java:144)
> [junit4] > at
> org.apache.solr.cloud.api.collections.TestLocalFSCloudBackupRestore.test(TestLocalFSCloudBackupRestore.java:64)
> [junit4] > at java.lang.Thread.run(Thread.java:748)
> {noformat}
> {noformat}
> [junit4] 2> NOTE: reproduce with: ant test
> -Dtestcase=TestHdfsCloudBackupRestore -Dtests.method=test
> -Dtests.seed=3CD4284489C09DB4 -Dtests.multiplier=2 -Dtests.slow=true
> -Dtests.badapples=true -Dtests.locale=bg -Dtests.timezone=Africa/Khartoum
> -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
> [junit4] FAILURE 13.3s J0 | TestHdfsCloudBackupRestore.test <<<
> [junit4] > Throwable #1: java.lang.AssertionError: Node
> 127.0.0.1:38450_solr has 3 replicas. Expected num replicas : 2. state:
> [junit4] >
> DocCollection(hdfsbackuprestore_restored//collections/hdfsbackuprestore_restored/state.json/10)={
> [junit4] > "pullReplicas":0,
> [junit4] > "replicationFactor":1,
> [junit4] > "shards":{
> [junit4] > "shard2":{
> [junit4] > "range":"0-7fffffff",
> [junit4] > "state":"active",
> [junit4] > "replicas":{"core_node62":{
> [junit4] >
> "core":"hdfsbackuprestore_restored_shard2_replica_n61",
> [junit4] > "base_url":"https://127.0.0.1:38450/solr",
> [junit4] > "node_name":"127.0.0.1:38450_solr",
> [junit4] > "state":"active",
> [junit4] > "type":"NRT",
> [junit4] > "force_set_state":"false",
> [junit4] > "leader":"true"}},
> [junit4] > "stateTimestamp":"1539459705812441110"},
> [junit4] > "shard1_1":{
> [junit4] > "range":"c0000000-ffffffff",
> [junit4] > "state":"active",
> [junit4] > "replicas":{"core_node64":{
> [junit4] >
> "core":"hdfsbackuprestore_restored_shard1_1_replica_n63",
> [junit4] > "base_url":"https://127.0.0.1:38450/solr",
> [junit4] > "node_name":"127.0.0.1:38450_solr",
> [junit4] > "state":"active",
> [junit4] > "type":"NRT",
> [junit4] > "force_set_state":"false",
> [junit4] > "leader":"true"}},
> [junit4] > "stateTimestamp":"1539459705812477955"},
> [junit4] > "shard1_0":{
> [junit4] > "range":"80000000-bfffffff",
> [junit4] > "state":"active",
> [junit4] > "replicas":{"core_node66":{
> [junit4] >
> "core":"hdfsbackuprestore_restored_shard1_0_replica_n65",
> [junit4] > "base_url":"https://127.0.0.1:38450/solr",
> [junit4] > "node_name":"127.0.0.1:38450_solr",
> [junit4] > "state":"active",
> [junit4] > "type":"NRT",
> [junit4] > "force_set_state":"false",
> [junit4] > "leader":"true"}},
> [junit4] > "stateTimestamp":"1539459705812506250"}},
> [junit4] > "router":{
> [junit4] > "name":"compositeId",
> [junit4] > "field":"shard_s"},
> [junit4] > "maxShardsPerNode":"-1",
> [junit4] > "autoAddReplicas":"false",
> [junit4] > "nrtReplicas":1,
> [junit4] > "tlogReplicas":0}
> [junit4] > at
> __randomizedtesting.SeedInfo.seed([3CD4284489C09DB4:B480179E273CF04C]:0)
> [junit4] > at
> org.apache.solr.cloud.api.collections.AbstractCloudBackupRestoreTestCase.lambda$testBackupAndRestore$1(AbstractCloudBackupRestoreTestCase.java:339)
> [junit4] > at java.util.HashMap.forEach(HashMap.java:1289)
> [junit4] > at
> org.apache.solr.cloud.api.collections.AbstractCloudBackupRestoreTestCase.testBackupAndRestore(AbstractCloudBackupRestoreTestCase.java:338)
> [junit4] > at
> org.apache.solr.cloud.api.collections.AbstractCloudBackupRestoreTestCase.test(AbstractCloudBackupRestoreTestCase.java:144)
> [junit4] > at
> org.apache.solr.cloud.api.collections.TestHdfsCloudBackupRestore.test(TestHdfsCloudBackupRestore.java:213)
> [junit4] > at java.lang.Thread.run(Thread.java:748)
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)