In my small test I have only one resource with 4 partitions. Every
partition has a unique name.

Looking at the log output, it seems that every state transition
message is sent twice to the same target.  Please see the complete
log at the end of this post.  I experimented with both the SEMI_AUTO
and USER_DEFINED rebalance modes.

Here is the ideal state of the only resource (Pool0).
{
  "id" : "Pool0",
  "listFields" : {
    "Pool0_0" : [ "host1_disk1", "host1_disk2", "host1_disk3" ],
    "Pool0_1" : [ "host1_disk1", "host1_disk2", "host1_disk3" ],
    "Pool0_2" : [ "host1_disk2", "host1_disk1", "host1_disk3" ],
    "Pool0_3" : [ "host1_disk3", "host1_disk1", "host1_disk2" ]
  },
  "mapFields" : {
    "Pool0_0" : {
      "host1_disk1" : "MASTER",
      "host1_disk2" : "SLAVE",
      "host1_disk3" : "SLAVE"
    },
    "Pool0_1" : {
      "host1_disk1" : "MASTER",
      "host1_disk2" : "SLAVE",
      "host1_disk3" : "SLAVE"
    },
    "Pool0_2" : {
      "host1_disk1" : "SLAVE",
      "host1_disk2" : "MASTER",
      "host1_disk3" : "SLAVE"
    },
    "Pool0_3" : {
      "host1_disk1" : "SLAVE",
      "host1_disk2" : "SLAVE",
      "host1_disk3" : "MASTER"
    }
  },
  "simpleFields" : {
    "IDEAL_STATE_MODE" : "AUTO",
    "MAX_PARTITIONS_PER_INSTANCE" : "1",
    "NUM_PARTITIONS" : "4",
    "REBALANCE_MODE" : "SEMI_AUTO",
    "REPLICAS" : "3",
    "STATE_MODEL_DEF_REF" : "HcdDiskStateModel",
    "STATE_MODEL_FACTORY_NAME" : "DEFAULT"
  }
}
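
A minimal sketch for sanity-checking the ideal state above, in case it helps rule out a configuration mismatch. It assumes (my understanding of SEMI_AUTO, not something confirmed in this thread) that the first instance in each preference list is the one expected to become MASTER. The function name check_ideal_state is hypothetical, and the dictionary is a hand-copied version of the record above.

```python
# Hand-copied from the Pool0 ideal state above.
IDEAL_STATE = {
    "listFields": {
        "Pool0_0": ["host1_disk1", "host1_disk2", "host1_disk3"],
        "Pool0_1": ["host1_disk1", "host1_disk2", "host1_disk3"],
        "Pool0_2": ["host1_disk2", "host1_disk1", "host1_disk3"],
        "Pool0_3": ["host1_disk3", "host1_disk1", "host1_disk2"],
    },
    "mapFields": {
        "Pool0_0": {"host1_disk1": "MASTER", "host1_disk2": "SLAVE", "host1_disk3": "SLAVE"},
        "Pool0_1": {"host1_disk1": "MASTER", "host1_disk2": "SLAVE", "host1_disk3": "SLAVE"},
        "Pool0_2": {"host1_disk1": "SLAVE", "host1_disk2": "MASTER", "host1_disk3": "SLAVE"},
        "Pool0_3": {"host1_disk1": "SLAVE", "host1_disk2": "SLAVE", "host1_disk3": "MASTER"},
    },
    "simpleFields": {"NUM_PARTITIONS": "4", "REPLICAS": "3"},
}


def check_ideal_state(ideal):
    """Return a list of human-readable inconsistencies (empty if none found)."""
    problems = []
    lists = ideal["listFields"]
    maps = ideal["mapFields"]
    replicas = int(ideal["simpleFields"]["REPLICAS"])
    num_partitions = int(ideal["simpleFields"]["NUM_PARTITIONS"])

    if len(lists) != num_partitions:
        problems.append(f"expected {num_partitions} partitions, found {len(lists)}")

    for partition, preference in lists.items():
        states = maps.get(partition, {})
        if len(preference) != replicas:
            problems.append(f"{partition}: {len(preference)} instances listed, REPLICAS is {replicas}")
        if set(preference) != set(states):
            problems.append(f"{partition}: listFields and mapFields name different instances")
        masters = [inst for inst, state in states.items() if state == "MASTER"]
        if len(masters) != 1:
            problems.append(f"{partition}: expected exactly one MASTER, found {len(masters)}")
        elif preference and preference[0] != masters[0]:
            # Assumption: the head of the preference list should be the MASTER.
            problems.append(f"{partition}: list starts with {preference[0]} but MASTER is {masters[0]}")
    return problems


if __name__ == "__main__":
    for line in check_ideal_state(IDEAL_STATE) or ["no inconsistencies found"]:
        print(line)
```

An empty result only says the record is internally consistent under these checks; it does not explain the duplicate messages.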

======
log output:

[INFO  2016-04-25 20:13:46,062 com.hcd.hcdadmin.HcdAdmin:502] will
create pool Pool0

[INFO  2016-04-25 20:13:46,120 com.hcd.hcdadmin.HcdAdmin:242] found
available live disks:

[INFO  2016-04-25 20:13:46,120 com.hcd.hcdadmin.HcdAdmin:244] host1_disk1

[INFO  2016-04-25 20:13:46,120 com.hcd.hcdadmin.HcdAdmin:244] host1_disk2

[INFO  2016-04-25 20:13:46,120 com.hcd.hcdadmin.HcdAdmin:244] host1_disk3

[INFO  2016-04-25 20:13:46,272 com.hcd.hcdadmin.HcdPool:227] assign
pool partition Pool0_0 to disks: [host1_disk3, host1_disk2,
host1_disk1]

[INFO  2016-04-25 20:13:46,456 com.hcd.hcdadmin.HcdPool:227] assign
pool partition Pool0_1 to disks: [host1_disk3, host1_disk2,
host1_disk1]

[INFO  2016-04-25 20:13:46,610 com.hcd.hcdadmin.HcdPool:227] assign
pool partition Pool0_2 to disks: [host1_disk3, host1_disk2,
host1_disk1]

[INFO  2016-04-25 20:13:46,769 com.hcd.hcdadmin.HcdPool:227] assign
pool partition Pool0_3 to disks: [host1_disk3, host1_disk2,
host1_disk1]

[INFO  2016-04-25 20:13:46,905 com.hcd.hcdadmin.HcdPool:265] assigned
ideal state to LUN Pool0

[INFO  2016-04-25 20:13:47,428 com.hcd.hcdadmin.HcdAdmin:261] has
created pool Pool0 in cluster TryHelixCluster1

[INFO  2016-04-25 20:13:47,429 com.hcd.hcdadmin.HcdAdmin:505]
rebalance for pool Pool0

[INFO  2016-04-25 20:13:47,881
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
Pool0/Pool0_1: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:47,881
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
Pool0/Pool0_2: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:47,887
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
Pool0/Pool0_2: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:47,912
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
Pool0/Pool0_3: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:47,913
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
Pool0/Pool0_2: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:47,920
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
Pool0/Pool0_0: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:47,954
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
Pool0/Pool0_1: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:47,954
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
Pool0/Pool0_0: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:47,972
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
Pool0/Pool0_3: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:48,009
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
Pool0/Pool0_0: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:48,010
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
Pool0/Pool0_1: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:48,015
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
Pool0/Pool0_3: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:48,265
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
Pool0/Pool0_1: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:48,265
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
Pool0/Pool0_2: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:48,269
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
Pool0/Pool0_2: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:48,278
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
Pool0/Pool0_2: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:48,284
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
Pool0/Pool0_0: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:48,286
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
Pool0/Pool0_3: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:48,302
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
Pool0/Pool0_1: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:48,316
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
Pool0/Pool0_0: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:48,319
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
Pool0/Pool0_3: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:48,340
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
Pool0/Pool0_1: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:48,348
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
Pool0/Pool0_3: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:48,349
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
Pool0/Pool0_0: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:48,461
com.hcd.hcdadmin.HcdDiskStateModelFactory:125] host1_disk2 :
Pool0/Pool0_2: transit from SLAVE to MASTER

[INFO  2016-04-25 20:13:48,464
com.hcd.hcdadmin.HcdDiskStateModelFactory:125] host1_disk3 :
Pool0/Pool0_3: transit from SLAVE to MASTER

[INFO  2016-04-25 20:13:48,465
com.hcd.hcdadmin.HcdDiskStateModelFactory:125] host1_disk1 :
Pool0/Pool0_0: transit from SLAVE to MASTER

[INFO  2016-04-25 20:13:48,466
com.hcd.hcdadmin.HcdDiskStateModelFactory:125] host1_disk1 :
Pool0/Pool0_1: transit from SLAVE to MASTER

[ERROR 2016-04-25 20:13:48,544
org.apache.helix.messaging.handling.HelixStateTransitionHandler:118]
Current state of stateModel does not match the fromState in Message,
Current State:MASTER, message expected:SLAVE, partition: Pool0_2,
from: host1_admin, to: host1_disk2

[ERROR 2016-04-25 20:13:48,549
org.apache.helix.messaging.handling.HelixStateTransitionHandler:118]
Current state of stateModel does not match the fromState in Message,
Current State:MASTER, message expected:SLAVE, partition: Pool0_3,
from: host1_admin, to: host1_disk3

[ERROR 2016-04-25 20:13:48,549
org.apache.helix.messaging.handling.HelixStateTransitionHandler:118]
Current state of stateModel does not match the fromState in Message,
Current State:MASTER, message expected:SLAVE, partition: Pool0_0,
from: host1_admin, to: host1_disk1

[ERROR 2016-04-25 20:13:48,555
org.apache.helix.messaging.handling.HelixStateTransitionHandler:118]
Current state of stateModel does not match the fromState in Message,
Current State:MASTER, message expected:SLAVE, partition: Pool0_1,
from: host1_admin, to: host1_disk1

[ERROR 2016-04-25 20:13:48,556
org.apache.helix.messaging.handling.HelixTask:143] Message execution
failed. msgId: 4e18677e-a4d1-41ee-bb71-fea6f5e2c5c3, errorMsg:
org.apache.helix.messaging.handling.HelixStateTransitionHandler$HelixStateMismatchException:
Current state of stateModel does not match the fromState in Message,
Current State:MASTER, message expected:SLAVE, partition: Pool0_2,
from: host1_admin, to: host1_disk2

[ERROR 2016-04-25 20:13:48,559
org.apache.helix.messaging.handling.HelixTask:143] Message execution
failed. msgId: 1199845a-1e49-4ea0-9a53-b52bf5afd816, errorMsg:
org.apache.helix.messaging.handling.HelixStateTransitionHandler$HelixStateMismatchException:
Current state of stateModel does not match the fromState in Message,
Current State:MASTER, message expected:SLAVE, partition: Pool0_3,
from: host1_admin, to: host1_disk3

[ERROR 2016-04-25 20:13:48,560
org.apache.helix.messaging.handling.HelixTask:143] Message execution
failed. msgId: 35f504ce-b332-4fd4-a893-1a400047621e, errorMsg:
org.apache.helix.messaging.handling.HelixStateTransitionHandler$HelixStateMismatchException:
Current state of stateModel does not match the fromState in Message,
Current State:MASTER, message expected:SLAVE, partition: Pool0_0,
from: host1_admin, to: host1_disk1

[ERROR 2016-04-25 20:13:48,573
org.apache.helix.messaging.handling.HelixStateTransitionHandler:385]
Skip internal error. errCode: ERROR, errMsg: Current state of
stateModel does not match the fromState in Message, Current
State:MASTER, message expected:SLAVE, partition: Pool0_2, from:
host1_admin, to: host1_disk2

[ERROR 2016-04-25 20:13:48,576
org.apache.helix.messaging.handling.HelixStateTransitionHandler:385]
Skip internal error. errCode: ERROR, errMsg: Current state of
stateModel does not match the fromState in Message, Current
State:MASTER, message expected:SLAVE, partition: Pool0_0, from:
host1_admin, to: host1_disk1

[ERROR 2016-04-25 20:13:48,577
org.apache.helix.messaging.handling.HelixStateTransitionHandler:385]
Skip internal error. errCode: ERROR, errMsg: Current state of
stateModel does not match the fromState in Message, Current
State:MASTER, message expected:SLAVE, partition: Pool0_3, from:
host1_admin, to: host1_disk3

[ERROR 2016-04-25 20:13:48,580
org.apache.helix.messaging.handling.HelixTask:143] Message execution
failed. msgId: 7573c64d-e055-4eb1-b845-e22c82542437, errorMsg:
org.apache.helix.messaging.handling.HelixStateTransitionHandler$HelixStateMismatchException:
Current state of stateModel does not match the fromState in Message,
Current State:MASTER, message expected:SLAVE, partition: Pool0_1,
from: host1_admin, to: host1_disk1

[ERROR 2016-04-25 20:13:48,594
org.apache.helix.messaging.handling.HelixStateTransitionHandler:385]
Skip internal error. errCode: ERROR, errMsg: Current state of
stateModel does not match the fromState in Message, Current
State:MASTER, message expected:SLAVE, partition: Pool0_1, from:
host1_admin, to: host1_disk1


[INFO  2016-04-25 20:13:51,517 com.hcd.hcdadmin.HcdExternalView:68]

Cluster TryHelixCluster1, show external view of LUNs


         host1_disk1  host1_disk2  host1_disk3

Pool0_0  M            S            S

Pool0_1  M            S            S

Pool0_2  S            M            S

Pool0_3  S            S            M
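
To make the repeated transitions in the log above easier to spot, here is a small sketch that counts how often each (instance, partition, from, to) transition is logged. It assumes the log format shown in this post: entries separated by blank lines, wrapped across several lines, and containing the "transit from X to Y" text that my state model prints. The names count_transitions and duplicates are hypothetical helpers, and the sample is three entries copied from the log.

```python
import re
from collections import Counter

# Matches e.g. "host1_disk2 : Pool0/Pool0_1: transit from OFFLINE to SLAVE"
TRANSITION = re.compile(
    r"(?P<instance>\S+) : (?P<resource>\S+)/(?P<partition>\S+): "
    r"transit from (?P<src>\w+) to (?P<dst>\w+)"
)


def count_transitions(log_text):
    """Count logged transitions keyed by (instance, partition, from, to)."""
    counts = Counter()
    # Entries are separated by blank lines; rejoin each wrapped entry onto one line.
    for chunk in re.split(r"\n\s*\n", log_text):
        entry = " ".join(chunk.split())
        match = TRANSITION.search(entry)
        if match:
            key = (match["instance"], match["partition"], match["src"], match["dst"])
            counts[key] += 1
    return counts


def duplicates(counts):
    """Keep only the transitions that were logged more than once."""
    return {key: n for key, n in counts.items() if n > 1}


SAMPLE = """\
[INFO  2016-04-25 20:13:47,881
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
Pool0/Pool0_1: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:48,265
com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
Pool0/Pool0_1: transit from OFFLINE to SLAVE

[INFO  2016-04-25 20:13:48,461
com.hcd.hcdadmin.HcdDiskStateModelFactory:125] host1_disk2 :
Pool0/Pool0_2: transit from SLAVE to MASTER
"""

if __name__ == "__main__":
    for key, n in duplicates(count_transitions(SAMPLE)).items():
        print(key, n)
```

Only the callbacks that actually ran are counted this way; a second SLAVE to MASTER message that was rejected with a state mismatch shows up in the ERROR entries instead.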

On Sat, Apr 23, 2016 at 11:35 PM, kishore g wrote:
> How many resources do you have? Partition names must be unique across the
> entire cluster. Can you also paste the ideal state for the resources?
>
> On Sat, Apr 23, 2016 at 10:39 PM, Neutron sharc wrote:
>
>> Hi Helix team,
>>
>> I keep seeing this error from HelixStateTransitionHandler while the
>> state machine is running.  It seems a partition's actual state doesn't
>> match the state given in the controller's message.  What are the usual
>> causes?  I'm using Helix 0.7.1.  Here is the dependency from my Maven
>> pom.xml:
>>
>> <dependency>
>>     <groupId>org.apache.helix</groupId>
>>     <artifactId>helix-core</artifactId>
>>     <version>0.7.1</version>
>> </dependency>
>>
>>
>>
>> [ERROR 2016-04-21 19:51:09,943
>> org.apache.helix.messaging.handling.HelixStateTransitionHandler:118]
>> Current state of stateModel does not match the fromState in Message,
>> Current State:MASTER, message expected:SLAVE, partition:
>> host1_Pool0_0, from: host1_admin, to: host1_disk1
>>
>> [ERROR 2016-04-21 19:51:09,959
>> org.apache.helix.messaging.handling.HelixTask:143] Message execution
>> failed. msgId: 26c891b8-dd81-4e0c-8b99-6c62b856db5f, errorMsg:
>>
>> org.apache.helix.messaging.handling.HelixStateTransitionHandler$HelixStateMismatchException:
>> Current state of stateModel does not match the fromState in Message,
>> Current State:MASTER, message expected:SLAVE, partition:
>> host1_Pool0_0, from: host1_admin, to: host1_disk1
>>
>> [ERROR 2016-04-21 19:51:09,975
>> org.apache.helix.messaging.handling.HelixStateTransitionHandler:385]
>> Skip internal error. errCode: ERROR, errMsg: Current state of
>> stateModel does not match the fromState in Message, Current
>> State:MASTER, message expected:SLAVE, partition: host1_Pool0_0, from:
>> host1_admin, to: host1_disk1
>>
>>
>> Another problem I see: my ideal state defines 3 replicas per
>> partition, but the resource's external view sometimes shows a
>> partition with 4 replicas.
>>
>>
>> Any hints?  Thanks!
>>
