Here is my test state machine definition. Thanks for reviewing!

https://github.com/neutronsharc/tools/blob/master/HcdDiskStateModelFactory.java
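
For anyone who wants to check the "sent twice" observation in the quoted log below, here is a minimal stand-alone sketch that counts repeated transitions per (instance, partition, from, to). It is not part of the linked factory code and uses no Helix API. The class name `DuplicateTransitions` and its `count` method are hypothetical, and it assumes each wrapped log entry has first been joined back into a single line.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DuplicateTransitions {
    // Matches the tail of a transition entry as printed by the state model, e.g.
    // "host1_disk2 : Pool0/Pool0_1: transit from OFFLINE to SLAVE"
    private static final Pattern TRANSITION = Pattern.compile(
        "(\\S+) : (\\S+)/(\\S+): transit from (\\w+) to (\\w+)");

    // Returns how many times each "instance partition FROM->TO" key was logged.
    public static Map<String, Integer> count(List<String> entries) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String entry : entries) {
            Matcher m = TRANSITION.matcher(entry);
            if (m.find()) {
                String key = m.group(1) + " " + m.group(3) + " "
                    + m.group(4) + "->" + m.group(5);
                counts.merge(key, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Three entries copied from the log in this thread, joined onto one line each.
        List<String> entries = List.of(
            "[INFO  2016-04-25 20:13:47,881 com.hcd.hcdadmin.HcdDiskStateModelFactory:111] "
                + "host1_disk2 : Pool0/Pool0_1: transit from OFFLINE to SLAVE",
            "[INFO  2016-04-25 20:13:48,265 com.hcd.hcdadmin.HcdDiskStateModelFactory:111] "
                + "host1_disk2 : Pool0/Pool0_1: transit from OFFLINE to SLAVE",
            "[INFO  2016-04-25 20:13:48,466 com.hcd.hcdadmin.HcdDiskStateModelFactory:125] "
                + "host1_disk1 : Pool0/Pool0_1: transit from SLAVE to MASTER");
        // Print only the transitions that were logged more than once.
        count(entries).forEach((key, n) -> {
            if (n > 1) {
                System.out.println(key + " x" + n);
            }
        });
    }
}
```

Keying on instance, partition, and both states keeps a legitimate OFFLINE to SLAVE followed by SLAVE to MASTER on the same replica from being reported as a duplicate.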

On Mon, Apr 25, 2016 at 9:18 PM, kishore g <[email protected]> wrote:
> Can you paste the state machine definition as well?
> On Apr 25, 2016 8:41 PM, "Neutron sharc" <[email protected]> wrote:
>
>> In my small test I have only 1 resource with 4 partitions. Every
>> partition has its unique name.
>>
>> Looking at the log output, it seems that every state transition
>> message is sent twice to the same target. Please see the complete
>> log at the end of this post. I experimented with both the SEMI_AUTO
>> and USER_DEFINED rebalance modes.
>>
>> Here is the ideal state of the only resource (Pool0).
>> {
>>   "id" : "Pool0",
>>   "listFields" : {
>>     "Pool0_0" : [ "host1_disk1", "host1_disk2", "host1_disk3" ],
>>     "Pool0_1" : [ "host1_disk1", "host1_disk2", "host1_disk3" ],
>>     "Pool0_2" : [ "host1_disk2", "host1_disk1", "host1_disk3" ],
>>     "Pool0_3" : [ "host1_disk3", "host1_disk1", "host1_disk2" ]
>>   },
>>   "mapFields" : {
>>     "Pool0_0" : {
>>       "host1_disk1" : "MASTER",
>>       "host1_disk2" : "SLAVE",
>>       "host1_disk3" : "SLAVE"
>>     },
>>     "Pool0_1" : {
>>       "host1_disk1" : "MASTER",
>>       "host1_disk2" : "SLAVE",
>>       "host1_disk3" : "SLAVE"
>>     },
>>     "Pool0_2" : {
>>       "host1_disk1" : "SLAVE",
>>       "host1_disk2" : "MASTER",
>>       "host1_disk3" : "SLAVE"
>>     },
>>     "Pool0_3" : {
>>       "host1_disk1" : "SLAVE",
>>       "host1_disk2" : "SLAVE",
>>       "host1_disk3" : "MASTER"
>>     }
>>   },
>>   "simpleFields" : {
>>     "IDEAL_STATE_MODE" : "AUTO",
>>     "MAX_PARTITIONS_PER_INSTANCE" : "1",
>>     "NUM_PARTITIONS" : "4",
>>     "REBALANCE_MODE" : "SEMI_AUTO",
>>     "REPLICAS" : "3",
>>     "STATE_MODEL_DEF_REF" : "HcdDiskStateModel",
>>     "STATE_MODEL_FACTORY_NAME" : "DEFAULT"
>>   }
>> }
>>
>> ======
>> log output:
>>
>> [INFO  2016-04-25 20:13:46,062 com.hcd.hcdadmin.HcdAdmin:502] will
>> create pool Pool0
>>
>> [INFO  2016-04-25 20:13:46,120 com.hcd.hcdadmin.HcdAdmin:242] found
>> available live disks:
>>
>> [INFO  2016-04-25 20:13:46,120 com.hcd.hcdadmin.HcdAdmin:244] host1_disk1
>>
>> [INFO  2016-04-25 20:13:46,120 com.hcd.hcdadmin.HcdAdmin:244] host1_disk2
>>
>> [INFO  2016-04-25 20:13:46,120 com.hcd.hcdadmin.HcdAdmin:244] host1_disk3
>>
>> [INFO  2016-04-25 20:13:46,272 com.hcd.hcdadmin.HcdPool:227] assign
>> pool partition Pool0_0 to disks: [host1_disk3, host1_disk2,
>> host1_disk1]
>>
>> [INFO  2016-04-25 20:13:46,456 com.hcd.hcdadmin.HcdPool:227] assign
>> pool partition Pool0_1 to disks: [host1_disk3, host1_disk2,
>> host1_disk1]
>>
>> [INFO  2016-04-25 20:13:46,610 com.hcd.hcdadmin.HcdPool:227] assign
>> pool partition Pool0_2 to disks: [host1_disk3, host1_disk2,
>> host1_disk1]
>>
>> [INFO  2016-04-25 20:13:46,769 com.hcd.hcdadmin.HcdPool:227] assign
>> pool partition Pool0_3 to disks: [host1_disk3, host1_disk2,
>> host1_disk1]
>>
>> [INFO  2016-04-25 20:13:46,905 com.hcd.hcdadmin.HcdPool:265] assigned
>> ideal state to LUN Pool0
>>
>> [INFO  2016-04-25 20:13:47,428 com.hcd.hcdadmin.HcdAdmin:261] has
>> created pool Pool0 in cluster TryHelixCluster1
>>
>> [INFO  2016-04-25 20:13:47,429 com.hcd.hcdadmin.HcdAdmin:505]
>> rebalance for pool Pool0
>>
>> [INFO  2016-04-25 20:13:47,881
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
>> Pool0/Pool0_1: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:47,881
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
>> Pool0/Pool0_2: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:47,887
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
>> Pool0/Pool0_2: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:47,912
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
>> Pool0/Pool0_3: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:47,913
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
>> Pool0/Pool0_2: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:47,920
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
>> Pool0/Pool0_0: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:47,954
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
>> Pool0/Pool0_1: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:47,954
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
>> Pool0/Pool0_0: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:47,972
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
>> Pool0/Pool0_3: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:48,009
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
>> Pool0/Pool0_0: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:48,010
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
>> Pool0/Pool0_1: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:48,015
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
>> Pool0/Pool0_3: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:48,265
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
>> Pool0/Pool0_1: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:48,265
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
>> Pool0/Pool0_2: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:48,269
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
>> Pool0/Pool0_2: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:48,278
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
>> Pool0/Pool0_2: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:48,284
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
>> Pool0/Pool0_0: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:48,286
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
>> Pool0/Pool0_3: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:48,302
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
>> Pool0/Pool0_1: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:48,316
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
>> Pool0/Pool0_0: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:48,319
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
>> Pool0/Pool0_3: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:48,340
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
>> Pool0/Pool0_1: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:48,348
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
>> Pool0/Pool0_3: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:48,349
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
>> Pool0/Pool0_0: transit from OFFLINE to SLAVE
>>
>> [INFO  2016-04-25 20:13:48,461
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:125] host1_disk2 :
>> Pool0/Pool0_2: transit from SLAVE to MASTER
>>
>> [INFO  2016-04-25 20:13:48,464
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:125] host1_disk3 :
>> Pool0/Pool0_3: transit from SLAVE to MASTER
>>
>> [INFO  2016-04-25 20:13:48,465
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:125] host1_disk1 :
>> Pool0/Pool0_0: transit from SLAVE to MASTER
>>
>> [INFO  2016-04-25 20:13:48,466
>> com.hcd.hcdadmin.HcdDiskStateModelFactory:125] host1_disk1 :
>> Pool0/Pool0_1: transit from SLAVE to MASTER
>>
>> [ERROR 2016-04-25 20:13:48,544
>> org.apache.helix.messaging.handling.HelixStateTransitionHandler:118]
>> Current state of stateModel does not match the fromState in Message,
>> Current State:MASTER, message expected:SLAVE, partition: Pool0_2,
>> from: host1_admin, to: host1_disk2
>>
>> [ERROR 2016-04-25 20:13:48,549
>> org.apache.helix.messaging.handling.HelixStateTransitionHandler:118]
>> Current state of stateModel does not match the fromState in Message,
>> Current State:MASTER, message expected:SLAVE, partition: Pool0_3,
>> from: host1_admin, to: host1_disk3
>>
>> [ERROR 2016-04-25 20:13:48,549
>> org.apache.helix.messaging.handling.HelixStateTransitionHandler:118]
>> Current state of stateModel does not match the fromState in Message,
>> Current State:MASTER, message expected:SLAVE, partition: Pool0_0,
>> from: host1_admin, to: host1_disk1
>>
>> [ERROR 2016-04-25 20:13:48,555
>> org.apache.helix.messaging.handling.HelixStateTransitionHandler:118]
>> Current state of stateModel does not match the fromState in Message,
>> Current State:MASTER, message expected:SLAVE, partition: Pool0_1,
>> from: host1_admin, to: host1_disk1
>>
>> [ERROR 2016-04-25 20:13:48,556
>> org.apache.helix.messaging.handling.HelixTask:143] Message execution
>> failed. msgId: 4e18677e-a4d1-41ee-bb71-fea6f5e2c5c3, errorMsg:
>>
>> org.apache.helix.messaging.handling.HelixStateTransitionHandler$HelixStateMismatchException:
>> Current state of stateModel does not match the fromState in Message,
>> Current State:MASTER, message expected:SLAVE, partition: Pool0_2,
>> from: host1_admin, to: host1_disk2
>>
>> [ERROR 2016-04-25 20:13:48,559
>> org.apache.helix.messaging.handling.HelixTask:143] Message execution
>> failed. msgId: 1199845a-1e49-4ea0-9a53-b52bf5afd816, errorMsg:
>>
>> org.apache.helix.messaging.handling.HelixStateTransitionHandler$HelixStateMismatchException:
>> Current state of stateModel does not match the fromState in Message,
>> Current State:MASTER, message expected:SLAVE, partition: Pool0_3,
>> from: host1_admin, to: host1_disk3
>>
>> [ERROR 2016-04-25 20:13:48,560
>> org.apache.helix.messaging.handling.HelixTask:143] Message execution
>> failed. msgId: 35f504ce-b332-4fd4-a893-1a400047621e, errorMsg:
>>
>> org.apache.helix.messaging.handling.HelixStateTransitionHandler$HelixStateMismatchException:
>> Current state of stateModel does not match the fromState in Message,
>> Current State:MASTER, message expected:SLAVE, partition: Pool0_0,
>> from: host1_admin, to: host1_disk1
>>
>> [ERROR 2016-04-25 20:13:48,573
>> org.apache.helix.messaging.handling.HelixStateTransitionHandler:385]
>> Skip internal error. errCode: ERROR, errMsg: Current state of
>> stateModel does not match the fromState in Message, Current
>> State:MASTER, message expected:SLAVE, partition: Pool0_2, from:
>> host1_admin, to: host1_disk2
>>
>> [ERROR 2016-04-25 20:13:48,576
>> org.apache.helix.messaging.handling.HelixStateTransitionHandler:385]
>> Skip internal error. errCode: ERROR, errMsg: Current state of
>> stateModel does not match the fromState in Message, Current
>> State:MASTER, message expected:SLAVE, partition: Pool0_0, from:
>> host1_admin, to: host1_disk1
>>
>> [ERROR 2016-04-25 20:13:48,577
>> org.apache.helix.messaging.handling.HelixStateTransitionHandler:385]
>> Skip internal error. errCode: ERROR, errMsg: Current state of
>> stateModel does not match the fromState in Message, Current
>> State:MASTER, message expected:SLAVE, partition: Pool0_3, from:
>> host1_admin, to: host1_disk3
>>
>> [ERROR 2016-04-25 20:13:48,580
>> org.apache.helix.messaging.handling.HelixTask:143] Message execution
>> failed. msgId: 7573c64d-e055-4eb1-b845-e22c82542437, errorMsg:
>>
>> org.apache.helix.messaging.handling.HelixStateTransitionHandler$HelixStateMismatchException:
>> Current state of stateModel does not match the fromState in Message,
>> Current State:MASTER, message expected:SLAVE, partition: Pool0_1,
>> from: host1_admin, to: host1_disk1
>>
>> [ERROR 2016-04-25 20:13:48,594
>> org.apache.helix.messaging.handling.HelixStateTransitionHandler:385]
>> Skip internal error. errCode: ERROR, errMsg: Current state of
>> stateModel does not match the fromState in Message, Current
>> State:MASTER, message expected:SLAVE, partition: Pool0_1, from:
>> host1_admin, to: host1_disk1
>>
>>
>> [INFO  2016-04-25 20:13:51,517 com.hcd.hcdadmin.HcdExternalView:68]
>>
>> Cluster TryHelixCluster1, show external view of LUNs
>>
>>
>>          host1_disk1  host1_disk2  host1_disk3
>> Pool0_0  M            S            S
>> Pool0_1  M            S            S
>> Pool0_2  S            M            S
>> Pool0_3  S            S            M
>>
>> On Sat, Apr 23, 2016 at 11:35 PM, kishore g <[email protected]> wrote:
>> > How many resources do you have? Partition names must be unique across the
>> > entire cluster. Can you also paste the ideal state for the resources?
>> >
>> > On Sat, Apr 23, 2016 at 10:39 PM, Neutron sharc <[email protected]>
>> > wrote:
>> >
>> >> Hi Helix team,
>> >>
>> >> I keep seeing this error from HelixStateTransitionHandler while the
>> >> state machine is running. It seems a partition's actual state doesn't
>> >> match the fromState given in the controller's message. What are the
>> >> usual causes? I'm using Helix 0.7.1. Here is the dependency from my
>> >> Maven pom.xml:
>> >>
>> >> <dependency>
>> >>     <groupId>org.apache.helix</groupId>
>> >>     <artifactId>helix-core</artifactId>
>> >>     <version>0.7.1</version>
>> >> </dependency>
>> >>
>> >>
>> >>
>> >> [ERROR 2016-04-21 19:51:09,943
>> >> org.apache.helix.messaging.handling.HelixStateTransitionHandler:118]
>> >> Current state of stateModel does not match the fromState in Message,
>> >> Current State:MASTER, message expected:SLAVE, partition:
>> >> host1_Pool0_0, from: host1_admin, to: host1_disk1
>> >>
>> >> [ERROR 2016-04-21 19:51:09,959
>> >> org.apache.helix.messaging.handling.HelixTask:143] Message execution
>> >> failed. msgId: 26c891b8-dd81-4e0c-8b99-6c62b856db5f, errorMsg:
>> >>
>> >> org.apache.helix.messaging.handling.HelixStateTransitionHandler$HelixStateMismatchException:
>> >> Current state of stateModel does not match the fromState in Message,
>> >> Current State:MASTER, message expected:SLAVE, partition:
>> >> host1_Pool0_0, from: host1_admin, to: host1_disk1
>> >>
>> >> [ERROR 2016-04-21 19:51:09,975
>> >> org.apache.helix.messaging.handling.HelixStateTransitionHandler:385]
>> >> Skip internal error. errCode: ERROR, errMsg: Current state of
>> >> stateModel does not match the fromState in Message, Current
>> >> State:MASTER, message expected:SLAVE, partition: host1_Pool0_0, from:
>> >> host1_admin, to: host1_disk1
>> >>
>> >>
>> >> Another problem I see: my ideal state defines 3 replicas per
>> >> partition, but the resource's external view sometimes shows a
>> >> partition with 4 replicas.
>> >>
>> >>
>> >> Any hints?  Thanks!
>> >>
>>
