Can you paste the state machine definition as well?
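
In the meantime, the duplicated transitions described in the quoted message can be checked mechanically from the log. Here is a minimal Python sketch, not part of Helix: the names count_transitions and repeated are made up for illustration, and the regular expression assumes the "instance : resource/partition: transit from X to Y" format printed by the HcdDiskStateModelFactory lines in the log. It strips the email quote markers, rejoins the wrapped lines, and counts how often each transition appears.

```python
import re
from collections import Counter

# Leading email quote markers such as "> " or "> >> ".
QUOTE_PREFIX = re.compile(r"^(?:>\s?)+")
# Assumed log format: "<instance> : <resource>/<partition>: transit from <A> to <B>".
TRANSITION = re.compile(r"(\S+) : (\S+)/(\S+): transit from (\w+) to (\w+)")


def count_transitions(log_text):
    """Count (instance, partition, from, to) transitions in a wrapped, quoted log."""
    unquoted = [QUOTE_PREFIX.sub("", line) for line in log_text.splitlines()]
    # Collapse all whitespace so an entry wrapped over several lines becomes one run.
    flat = " ".join(" ".join(unquoted).split())
    return Counter(
        (instance, partition, from_state, to_state)
        for instance, _resource, partition, from_state, to_state
        in TRANSITION.findall(flat)
    )


def repeated(counts):
    """Keep only the transitions that were logged more than once."""
    return {key: n for key, n in counts.items() if n > 1}


# Three entries copied from the quoted log; the first two are the same transition.
sample = """\
> [INFO 2016-04-25 20:13:47,881
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
> Pool0/Pool0_1: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:48,265
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
> Pool0/Pool0_1: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:48,461
> com.hcd.hcdadmin.HcdDiskStateModelFactory:125] host1_disk2 :
> Pool0/Pool0_2: transit from SLAVE to MASTER
"""

for key, n in sorted(repeated(count_transitions(sample)).items()):
    print(key, n)
```

Feeding the whole log into count_transitions gives one count per instance and partition, which makes it easy to see which transitions the participant callbacks ran more than once.
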
On Apr 25, 2016 8:41 PM, "Neutron sharc" <[email protected]> wrote:
> In my small test I have only 1 resource with 4 partitions. Every
> partition has a unique name.
>
> Looking at the log output, it seems that every state transition
> message is sent twice to the same target. Please see the complete
> log at the end of this post. I experimented with both the SEMI_AUTO
> and USER_DEFINED rebalance modes.
>
> Here is the ideal state of the only resource (Pool0).
> {
>   "id" : "Pool0",
>   "listFields" : {
>     "Pool0_0" : [ "host1_disk1", "host1_disk2", "host1_disk3" ],
>     "Pool0_1" : [ "host1_disk1", "host1_disk2", "host1_disk3" ],
>     "Pool0_2" : [ "host1_disk2", "host1_disk1", "host1_disk3" ],
>     "Pool0_3" : [ "host1_disk3", "host1_disk1", "host1_disk2" ]
>   },
>   "mapFields" : {
>     "Pool0_0" : {
>       "host1_disk1" : "MASTER",
>       "host1_disk2" : "SLAVE",
>       "host1_disk3" : "SLAVE"
>     },
>     "Pool0_1" : {
>       "host1_disk1" : "MASTER",
>       "host1_disk2" : "SLAVE",
>       "host1_disk3" : "SLAVE"
>     },
>     "Pool0_2" : {
>       "host1_disk1" : "SLAVE",
>       "host1_disk2" : "MASTER",
>       "host1_disk3" : "SLAVE"
>     },
>     "Pool0_3" : {
>       "host1_disk1" : "SLAVE",
>       "host1_disk2" : "SLAVE",
>       "host1_disk3" : "MASTER"
>     }
>   },
>   "simpleFields" : {
>     "IDEAL_STATE_MODE" : "AUTO",
>     "MAX_PARTITIONS_PER_INSTANCE" : "1",
>     "NUM_PARTITIONS" : "4",
>     "REBALANCE_MODE" : "SEMI_AUTO",
>     "REPLICAS" : "3",
>     "STATE_MODEL_DEF_REF" : "HcdDiskStateModel",
>     "STATE_MODEL_FACTORY_NAME" : "DEFAULT"
>   }
> }
>
> ======
> log output:
>
> [INFO 2016-04-25 20:13:46,062 com.hcd.hcdadmin.HcdAdmin:502] will
> create pool Pool0
>
> [INFO 2016-04-25 20:13:46,120 com.hcd.hcdadmin.HcdAdmin:242] found
> available live disks:
>
> [INFO 2016-04-25 20:13:46,120 com.hcd.hcdadmin.HcdAdmin:244] host1_disk1
>
> [INFO 2016-04-25 20:13:46,120 com.hcd.hcdadmin.HcdAdmin:244] host1_disk2
>
> [INFO 2016-04-25 20:13:46,120 com.hcd.hcdadmin.HcdAdmin:244] host1_disk3
>
> [INFO 2016-04-25 20:13:46,272 com.hcd.hcdadmin.HcdPool:227] assign
> pool partition Pool0_0 to disks: [host1_disk3, host1_disk2,
> host1_disk1]
>
> [INFO 2016-04-25 20:13:46,456 com.hcd.hcdadmin.HcdPool:227] assign
> pool partition Pool0_1 to disks: [host1_disk3, host1_disk2,
> host1_disk1]
>
> [INFO 2016-04-25 20:13:46,610 com.hcd.hcdadmin.HcdPool:227] assign
> pool partition Pool0_2 to disks: [host1_disk3, host1_disk2,
> host1_disk1]
>
> [INFO 2016-04-25 20:13:46,769 com.hcd.hcdadmin.HcdPool:227] assign
> pool partition Pool0_3 to disks: [host1_disk3, host1_disk2,
> host1_disk1]
>
> [INFO 2016-04-25 20:13:46,905 com.hcd.hcdadmin.HcdPool:265] assigned
> ideal state to LUN Pool0
>
> [INFO 2016-04-25 20:13:47,428 com.hcd.hcdadmin.HcdAdmin:261] has
> created pool Pool0 in cluster TryHelixCluster1
>
> [INFO 2016-04-25 20:13:47,429 com.hcd.hcdadmin.HcdAdmin:505]
> rebalance for pool Pool0
>
> [INFO 2016-04-25 20:13:47,881
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
> Pool0/Pool0_1: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:47,881
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
> Pool0/Pool0_2: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:47,887
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
> Pool0/Pool0_2: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:47,912
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
> Pool0/Pool0_3: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:47,913
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
> Pool0/Pool0_2: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:47,920
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
> Pool0/Pool0_0: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:47,954
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
> Pool0/Pool0_1: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:47,954
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
> Pool0/Pool0_0: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:47,972
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
> Pool0/Pool0_3: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:48,009
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
> Pool0/Pool0_0: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:48,010
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
> Pool0/Pool0_1: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:48,015
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
> Pool0/Pool0_3: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:48,265
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
> Pool0/Pool0_1: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:48,265
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
> Pool0/Pool0_2: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:48,269
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
> Pool0/Pool0_2: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:48,278
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
> Pool0/Pool0_2: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:48,284
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
> Pool0/Pool0_0: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:48,286
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
> Pool0/Pool0_3: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:48,302
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
> Pool0/Pool0_1: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:48,316
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
> Pool0/Pool0_0: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:48,319
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
> Pool0/Pool0_3: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:48,340
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk3 :
> Pool0/Pool0_1: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:48,348
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk2 :
> Pool0/Pool0_3: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:48,349
> com.hcd.hcdadmin.HcdDiskStateModelFactory:111] host1_disk1 :
> Pool0/Pool0_0: transit from OFFLINE to SLAVE
>
> [INFO 2016-04-25 20:13:48,461
> com.hcd.hcdadmin.HcdDiskStateModelFactory:125] host1_disk2 :
> Pool0/Pool0_2: transit from SLAVE to MASTER
>
> [INFO 2016-04-25 20:13:48,464
> com.hcd.hcdadmin.HcdDiskStateModelFactory:125] host1_disk3 :
> Pool0/Pool0_3: transit from SLAVE to MASTER
>
> [INFO 2016-04-25 20:13:48,465
> com.hcd.hcdadmin.HcdDiskStateModelFactory:125] host1_disk1 :
> Pool0/Pool0_0: transit from SLAVE to MASTER
>
> [INFO 2016-04-25 20:13:48,466
> com.hcd.hcdadmin.HcdDiskStateModelFactory:125] host1_disk1 :
> Pool0/Pool0_1: transit from SLAVE to MASTER
>
> [ERROR 2016-04-25 20:13:48,544
> org.apache.helix.messaging.handling.HelixStateTransitionHandler:118]
> Current state of stateModel does not match the fromState in Message,
> Current State:MASTER, message expected:SLAVE, partition: Pool0_2,
> from: host1_admin, to: host1_disk2
>
> [ERROR 2016-04-25 20:13:48,549
> org.apache.helix.messaging.handling.HelixStateTransitionHandler:118]
> Current state of stateModel does not match the fromState in Message,
> Current State:MASTER, message expected:SLAVE, partition: Pool0_3,
> from: host1_admin, to: host1_disk3
>
> [ERROR 2016-04-25 20:13:48,549
> org.apache.helix.messaging.handling.HelixStateTransitionHandler:118]
> Current state of stateModel does not match the fromState in Message,
> Current State:MASTER, message expected:SLAVE, partition: Pool0_0,
> from: host1_admin, to: host1_disk1
>
> [ERROR 2016-04-25 20:13:48,555
> org.apache.helix.messaging.handling.HelixStateTransitionHandler:118]
> Current state of stateModel does not match the fromState in Message,
> Current State:MASTER, message expected:SLAVE, partition: Pool0_1,
> from: host1_admin, to: host1_disk1
>
> [ERROR 2016-04-25 20:13:48,556
> org.apache.helix.messaging.handling.HelixTask:143] Message execution
> failed. msgId: 4e18677e-a4d1-41ee-bb71-fea6f5e2c5c3, errorMsg:
>
> org.apache.helix.messaging.handling.HelixStateTransitionHandler$HelixStateMismatchException:
> Current state of stateModel does not match the fromState in Message,
> Current State:MASTER, message expected:SLAVE, partition: Pool0_2,
> from: host1_admin, to: host1_disk2
>
> [ERROR 2016-04-25 20:13:48,559
> org.apache.helix.messaging.handling.HelixTask:143] Message execution
> failed. msgId: 1199845a-1e49-4ea0-9a53-b52bf5afd816, errorMsg:
>
> org.apache.helix.messaging.handling.HelixStateTransitionHandler$HelixStateMismatchException:
> Current state of stateModel does not match the fromState in Message,
> Current State:MASTER, message expected:SLAVE, partition: Pool0_3,
> from: host1_admin, to: host1_disk3
>
> [ERROR 2016-04-25 20:13:48,560
> org.apache.helix.messaging.handling.HelixTask:143] Message execution
> failed. msgId: 35f504ce-b332-4fd4-a893-1a400047621e, errorMsg:
>
> org.apache.helix.messaging.handling.HelixStateTransitionHandler$HelixStateMismatchException:
> Current state of stateModel does not match the fromState in Message,
> Current State:MASTER, message expected:SLAVE, partition: Pool0_0,
> from: host1_admin, to: host1_disk1
>
> [ERROR 2016-04-25 20:13:48,573
> org.apache.helix.messaging.handling.HelixStateTransitionHandler:385]
> Skip internal error. errCode: ERROR, errMsg: Current state of
> stateModel does not match the fromState in Message, Current
> State:MASTER, message expected:SLAVE, partition: Pool0_2, from:
> host1_admin, to: host1_disk2
>
> [ERROR 2016-04-25 20:13:48,576
> org.apache.helix.messaging.handling.HelixStateTransitionHandler:385]
> Skip internal error. errCode: ERROR, errMsg: Current state of
> stateModel does not match the fromState in Message, Current
> State:MASTER, message expected:SLAVE, partition: Pool0_0, from:
> host1_admin, to: host1_disk1
>
> [ERROR 2016-04-25 20:13:48,577
> org.apache.helix.messaging.handling.HelixStateTransitionHandler:385]
> Skip internal error. errCode: ERROR, errMsg: Current state of
> stateModel does not match the fromState in Message, Current
> State:MASTER, message expected:SLAVE, partition: Pool0_3, from:
> host1_admin, to: host1_disk3
>
> [ERROR 2016-04-25 20:13:48,580
> org.apache.helix.messaging.handling.HelixTask:143] Message execution
> failed. msgId: 7573c64d-e055-4eb1-b845-e22c82542437, errorMsg:
>
> org.apache.helix.messaging.handling.HelixStateTransitionHandler$HelixStateMismatchException:
> Current state of stateModel does not match the fromState in Message,
> Current State:MASTER, message expected:SLAVE, partition: Pool0_1,
> from: host1_admin, to: host1_disk1
>
> [ERROR 2016-04-25 20:13:48,594
> org.apache.helix.messaging.handling.HelixStateTransitionHandler:385]
> Skip internal error. errCode: ERROR, errMsg: Current state of
> stateModel does not match the fromState in Message, Current
> State:MASTER, message expected:SLAVE, partition: Pool0_1, from:
> host1_admin, to: host1_disk1
>
>
> [INFO 2016-04-25 20:13:51,517 com.hcd.hcdadmin.HcdExternalView:68]
>
> Cluster TryHelixCluster1, show external view of LUNs
>
>
>          host1_disk1  host1_disk2  host1_disk3
> Pool0_0  M            S            S
> Pool0_1  M            S            S
> Pool0_2  S            M            S
> Pool0_3  S            S            M
>
> On Sat, Apr 23, 2016 at 11:35 PM, kishore g <[email protected]> wrote:
> > How many resources do you have? Partition names must be unique across the
> > entire cluster. Can you also paste the idealstate for the resources?
> >
> > On Sat, Apr 23, 2016 at 10:39 PM, Neutron sharc <[email protected]>
> > wrote:
> >
> >> Hi Helix team,
> >>
> >> I keep seeing this error from HelixStateTransitionHandler when the
> >> state machine is running. It seems a partition's actual state doesn't
> >> match the fromState in the controller's message. What are the usual
> >> causes? I'm using Helix 0.7.1. Here is the dependency from my Maven
> >> pom.xml:
> >>
> >> <dependency>
> >>   <groupId>org.apache.helix</groupId>
> >>   <artifactId>helix-core</artifactId>
> >>   <version>0.7.1</version>
> >> </dependency>
> >>
> >>
> >>
> >> [ERROR 2016-04-21 19:51:09,943
> >> org.apache.helix.messaging.handling.HelixStateTransitionHandler:118]
> >> Current state of stateModel does not match the fromState in Message,
> >> Current State:MASTER, message expected:SLAVE, partition:
> >> host1_Pool0_0, from: host1_admin, to: host1_disk1
> >>
> >> [ERROR 2016-04-21 19:51:09,959
> >> org.apache.helix.messaging.handling.HelixTask:143] Message execution
> >> failed. msgId: 26c891b8-dd81-4e0c-8b99-6c62b856db5f, errorMsg:
> >>
> >>
> >> org.apache.helix.messaging.handling.HelixStateTransitionHandler$HelixStateMismatchException:
> >> Current state of stateModel does not match the fromState in Message,
> >> Current State:MASTER, message expected:SLAVE, partition:
> >> host1_Pool0_0, from: host1_admin, to: host1_disk1
> >>
> >> [ERROR 2016-04-21 19:51:09,975
> >> org.apache.helix.messaging.handling.HelixStateTransitionHandler:385]
> >> Skip internal error. errCode: ERROR, errMsg: Current state of
> >> stateModel does not match the fromState in Message, Current
> >> State:MASTER, message expected:SLAVE, partition: host1_Pool0_0, from:
> >> host1_admin, to: host1_disk1
> >>
> >>
> >> Another problem I see: my ideal state defines 3 replicas per
> >> partition, but the resource's external view sometimes shows a
> >> partition with 4 replicas.
> >>
> >>
> >> Any hints? Thanks!
> >>
>