Thanks for adding the test case. So it looks like I just have to remove the INSTANCE constraint.
Sent from my iPad

On May 8, 2013, at 7:18 PM, Zhen Zhang <[email protected]> wrote:

> Hi Ming, I've added a test case for this, see TestMessageThrottle2.java. It
> is just a copy of your example with minor changes.
>
> https://github.com/apache/incubator-helix/blob/master/helix-core/src/test/java/org/apache/helix/integration/TestMessageThrottle2.java
>
> At step 3), when you are adding Node-1, there are three state transition
> messages that need to be sent:
> T1) Offline->Slave for Node-1
> T2) Master->Slave for Node-2
> T3) Slave->Master for Node-1
>
> Note that T1 and T2 can be sent together. If you are using an instance-level
> constraint like this:
>
> // limit one transition message at a time for each instance
> builder.addConstraintAttribute("MESSAGE_TYPE", "STATE_TRANSITION")
>        .addConstraintAttribute("INSTANCE", ".*")
>        .addConstraintAttribute("CONSTRAINT_VALUE", "1");
>
> then T1 and T2 will be sent together in the first round, since T1 and T2 are
> sent to two different nodes, and T3 will be sent in the next round.
>
> If you are specifying a cluster-level constraint like this:
>
> // limit one transition message at a time for the entire cluster
> builder.addConstraintAttribute("MESSAGE_TYPE", "STATE_TRANSITION")
>        .addConstraintAttribute("CONSTRAINT_VALUE", "1");
>
> then the Helix controller will send T1 in the first round, then T2, then T3.
> The reason T1 is sent before T2 is that in the state model definition you
> specified that the Offline->Slave transition has a higher priority than
> Master->Slave.
>
> The test runs without problem.
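The round-by-round behavior Jason describes for the two constraint scopes can be sketched as a toy scheduler in plain Java. This is not Helix's actual MessageThrottleStage; the `Transition` record, the greedy round loop, and the `keyFn` parameter are all illustrative assumptions. It only models the throttle-key bookkeeping (it ignores state-model ordering, which in the instance-level case happens to hold anyway because T1 and T3 target the same node):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class ThrottleRounds {
    record Transition(String name, String instance) {}

    // Greedily assign transitions to rounds: at most limitPerKey messages per
    // throttle key in each round. keyFn maps a transition to its throttle key:
    // the target instance for an instance-level constraint, or one shared key
    // for a cluster-level constraint.
    static List<List<String>> rounds(List<Transition> pending, int limitPerKey,
                                     Function<Transition, String> keyFn) {
        List<List<String>> result = new ArrayList<>();
        List<Transition> remaining = new ArrayList<>(pending);
        while (!remaining.isEmpty()) {
            Map<String, Integer> used = new HashMap<>();
            List<String> round = new ArrayList<>();
            Iterator<Transition> it = remaining.iterator();
            while (it.hasNext()) {
                Transition t = it.next();
                String key = keyFn.apply(t);
                if (used.getOrDefault(key, 0) < limitPerKey) {
                    used.merge(key, 1, Integer::sum);
                    round.add(t.name());
                    it.remove();
                }
            }
            result.add(round);
        }
        return result;
    }

    public static void main(String[] args) {
        // T1..T3 as in the email, already sorted by transition priority
        List<Transition> msgs = List.of(
            new Transition("T1", "node1"),   // Offline->Slave on Node-1
            new Transition("T2", "node2"),   // Master->Slave on Node-2
            new Transition("T3", "node1"));  // Slave->Master on Node-1

        // instance-level: one message per instance per round -> [[T1, T2], [T3]]
        System.out.println(rounds(msgs, 1, Transition::instance));
        // cluster-level: one message total per round -> [[T1], [T2], [T3]]
        System.out.println(rounds(msgs, 1, t -> "CLUSTER"));
    }
}
```

With the per-instance key, T1 and T2 land in different buckets and go out together; with the single cluster key, everything serializes, matching the two orderings described above.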
> Here is the output:
>
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Start zookeeper at localhost:2183 in thread main
> START TestMessageThrottle2 at Wed May 08 15:57:21 PDT 2013
> Creating cluster: TestMessageThrottle2
> Starting Controller{Cluster:TestMessageThrottle2, Port:12000, Zookeeper:localhost:2183}
> StatusPrinter.onIdealStateChange:state = MyResource, {IDEAL_STATE_MODE=AUTO, NUM_PARTITIONS=1, REPLICAS=2, STATE_MODEL_DEF_REF=MasterSlave}{}{MyResource=[node1, node2]}
> StatusPrinter.onExternalViewChange:externalView = MyResource, {BUCKET_SIZE=0}{}{}
> StatusPrinter.onControllerChange:org.apache.helix.NotificationContext@6e3404f
> StatusPrinter.onInstanceConfigChange:instanceConfig = node2, {HELIX_ENABLED=true, HELIX_HOST=localhost}{}{}
> StatusPrinter.onLiveInstanceChange:liveInstance = node2, {HELIX_VERSION=${project.version}, LIVE_INSTANCE=11881@zzhang-mn1, SESSION_ID=13e865cfca60006}{}{}
> StatusPrinter.onIdealStateChange:state = MyResource, {IDEAL_STATE_MODE=AUTO, NUM_PARTITIONS=1, REPLICAS=2, STATE_MODEL_DEF_REF=MasterSlave}{}{MyResource=[node1, node2]}
> StatusPrinter.onInstanceConfigChange:instanceConfig = node2, {HELIX_ENABLED=true, HELIX_HOST=localhost}{}{}
> StatusPrinter.onExternalViewChange:externalView = MyResource, {BUCKET_SIZE=0}{}{}
> StatusPrinter.onLiveInstanceChange:liveInstance = node2, {HELIX_VERSION=${project.version}, LIVE_INSTANCE=11881@zzhang-mn1, SESSION_ID=13e865cfca60006}{}{}
> StatusPrinter.onControllerChange:org.apache.helix.NotificationContext@76d3046
> StatusPrinter.onExternalViewChange:externalView = MyResource, {BUCKET_SIZE=0}{MyResource={node2=MASTER}}{}
> StatusPrinter.onInstanceConfigChange:instanceConfig = node1, {HELIX_ENABLED=true, HELIX_HOST=localhost}{}{}
> StatusPrinter.onInstanceConfigChange:instanceConfig = node2, {HELIX_ENABLED=true, HELIX_HOST=localhost}{}{}
> StatusPrinter.onExternalViewChange:externalView = MyResource, {BUCKET_SIZE=0}{MyResource={node2=MASTER}}{}
> StatusPrinter.onInstanceConfigChange:instanceConfig = node1, {HELIX_ENABLED=true, HELIX_HOST=localhost}{}{}
> StatusPrinter.onInstanceConfigChange:instanceConfig = node2, {HELIX_ENABLED=true, HELIX_HOST=localhost}{}{}
> StatusPrinter.onLiveInstanceChange:liveInstance = node1, {HELIX_VERSION=${project.version}, LIVE_INSTANCE=11881@zzhang-mn1, SESSION_ID=13e865cfca60008}{}{}
> StatusPrinter.onLiveInstanceChange:liveInstance = node2, {HELIX_VERSION=${project.version}, LIVE_INSTANCE=11881@zzhang-mn1, SESSION_ID=13e865cfca60006}{}{}
> StatusPrinter.onIdealStateChange:state = MyResource, {IDEAL_STATE_MODE=AUTO, NUM_PARTITIONS=1, REPLICAS=2, STATE_MODEL_DEF_REF=MasterSlave}{}{MyResource=[node1, node2]}
> StatusPrinter.onInstanceConfigChange:instanceConfig = node1, {HELIX_ENABLED=true, HELIX_HOST=localhost}{}{}
> StatusPrinter.onInstanceConfigChange:instanceConfig = node2, {HELIX_ENABLED=true, HELIX_HOST=localhost}{}{}
> StatusPrinter.onExternalViewChange:externalView = MyResource, {BUCKET_SIZE=0}{MyResource={node2=MASTER}}{}
> StatusPrinter.onLiveInstanceChange:liveInstance = node1, {HELIX_VERSION=${project.version}, LIVE_INSTANCE=11881@zzhang-mn1, SESSION_ID=13e865cfca60008}{}{}
> StatusPrinter.onLiveInstanceChange:liveInstance = node2, {HELIX_VERSION=${project.version}, LIVE_INSTANCE=11881@zzhang-mn1, SESSION_ID=13e865cfca60006}{}{}
> StatusPrinter.onControllerChange:org.apache.helix.NotificationContext@b9deddb
> StatusPrinter.onLiveInstanceChange:liveInstance = node1, {HELIX_VERSION=${project.version}, LIVE_INSTANCE=11881@zzhang-mn1, SESSION_ID=13e865cfca60008}{}{}
> StatusPrinter.onLiveInstanceChange:liveInstance = node2, {HELIX_VERSION=${project.version}, LIVE_INSTANCE=11881@zzhang-mn1, SESSION_ID=13e865cfca60006}{}{}
> StatusPrinter.onExternalViewChange:externalView = MyResource, {BUCKET_SIZE=0}{MyResource={node1=SLAVE, node2=MASTER}}{}
> StatusPrinter.onExternalViewChange:externalView = MyResource, {BUCKET_SIZE=0}{MyResource={node1=SLAVE, node2=MASTER}}{}
> StatusPrinter.onExternalViewChange:externalView = MyResource, {BUCKET_SIZE=0}{MyResource={node1=MASTER, node2=SLAVE}}{}
> StatusPrinter.onExternalViewChange:externalView = MyResource, {BUCKET_SIZE=0}{MyResource={node1=MASTER, node2=SLAVE}}{}
> StatusPrinter.onExternalViewChange:externalView = MyResource, {BUCKET_SIZE=0}{MyResource={node1=MASTER, node2=SLAVE}}{}
> true: wait 489ms, ClusterStateVerifier$BestPossAndExtViewZkVerifier(TestMessageThrottle2@localhost:2183)
> END TestMessageThrottle2 at Wed May 08 15:57:30 PDT 2013
>
> Thanks,
> Jason
>
> On Tue, May 7, 2013 at 8:25 PM, Ming Fang <[email protected]> wrote:
>> Here is the code that I'm using to test:
>> https://github.com/mingfang/apache-helix/tree/master/helix-example
>>
>> In ZAC.java, line 134 is where I'm adding the constraint.
>> Line 204 is where I'm setting the state transition priority list.
>>
>> The steps I'm using are:
>> 1. Run ZAC and wait for the StatusPrinter printouts.
>> 2. Run Node2 and wait for it to transition to MASTER.
>> 3. Run Node1.
>> At this point we see the problem, where the external view will say
>> node1=SLAVE and node2=SLAVE.
>>
>> I can get the MessageThrottleStage to work by replacing line 205 with this:
>> String key = item.toString();
>> But even with message throttling working, I still can't get the transition
>> sequence I need.
>>
>> On May 7, 2013, at 11:43 AM, kishore g <[email protected]> wrote:
>>
>>> Can you provide the code snippet you used to add the constraint? It looks
>>> like you are setting the constraint at the INSTANCE level.
>>>
>>> On Mon, May 6, 2013 at 9:52 PM, Ming Fang <[email protected]> wrote:
>>>> I almost have this working.
>>>> However, I'm experiencing a potential bug in MessageThrottleStage line 205.
>>>> The problem is that the throttleMap's key contains the INSTANCE=<id> in it.
>>>> This effectively makes throttling across the entire cluster impossible.
>>>>
>>>> On Apr 24, 2013, at 2:07 PM, Zhen Zhang <[email protected]> wrote:
>>>>
>>>> > Hi Ming, to set the constraint so that only one transition message is
>>>> > sent at a time, you can take a look at the test example of
>>>> > TestMessageThrottle. You need to add a message constraint as follows:
>>>> >
>>>> > // build a message constraint
>>>> > ConstraintItemBuilder builder = new ConstraintItemBuilder();
>>>> > builder.addConstraintAttribute("MESSAGE_TYPE", "STATE_TRANSITION")
>>>> >        .addConstraintAttribute("INSTANCE", ".*")
>>>> >        .addConstraintAttribute("CONSTRAINT_VALUE", "1");
>>>> >
>>>> > // add the constraint to the cluster
>>>> > helixAdmin.setConstraint(clusterName, ConstraintType.MESSAGE_CONSTRAINT,
>>>> >                          "constraint1", builder.build());
>>>> >
>>>> > The message constraint is separate from the ideal state and is not
>>>> > specified in the JSON file of the ideal state.
>>>> >
>>>> > Thanks,
>>>> > Jason
>>>> >
>>>> > On 4/23/13 2:40 PM, "Ming Fang" <[email protected]> wrote:
>>>> >
>>>> >> Kishore,
>>>> >>
>>>> >> It sounds like the solution is to set the constraints so that only one
>>>> >> transition happens at a time.
>>>> >> Can you point me to an example of how to do this?
>>>> >> Also, is this something I can set in the JSON file?
>>>> >>
>>>> >> Sent from my iPad
>>>> >>
>>>> >> On Apr 1, 2013, at 11:32 AM, kishore g <[email protected]> wrote:
>>>> >>
>>>> >>> Hi Ming,
>>>> >>>
>>>> >>> Thanks for the detailed explanation. Actually, 5 & 6 happen in
>>>> >>> parallel; Helix tries to parallelize the transitions as much as
>>>> >>> possible.
>>>> >>>
>>>> >>> There is another feature in Helix that allows you to sort the
>>>> >>> transitions based on some priority. See STATE_TRANSITION_PRIORITY_LIST
>>>> >>> in the state model definition.
>>>> >>> But after sorting, Helix will send as many as possible in parallel
>>>> >>> without violating constraints.
>>>> >>>
>>>> >>> In your case you want the priority to be S-M, O-S, M-S, but that is
>>>> >>> not sufficient, since O-S and M-S will be sent in parallel.
>>>> >>>
>>>> >>> Additionally, what you need to do is set a constraint on transitions
>>>> >>> so that there is only one transition per partition at any time. This
>>>> >>> will basically make the order 6, 5, 7, and they will be executed
>>>> >>> sequentially per partition.
>>>> >>>
>>>> >>> We will try this out and let you know; you don't need to change any
>>>> >>> code in Helix or your app. You should be able to tweak the
>>>> >>> configuration dynamically.
>>>> >>>
>>>> >>> We will try to think of solving this in a more elegant way. I will
>>>> >>> file a jira and add more info.
>>>> >>>
>>>> >>> I also want to ask this question: when a node comes up, if it is
>>>> >>> mandatory for it to talk to the MASTER, what happens when the nodes
>>>> >>> are started for the first time, or when all nodes crash and come back?
>>>> >>>
>>>> >>> thanks,
>>>> >>> Kishore G
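The per-partition throttling kishore suggests can be sketched as a toy round scheduler in plain Java (again, not Helix internals; the partition names, the queue layout, and the second partition are illustrative assumptions). With one in-flight transition allowed per partition, transitions on the same partition serialize in priority order, while transitions on different partitions still proceed in parallel:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;

public class PerPartitionThrottle {

    // Emit at most one transition per partition per round. Each partition's
    // queue is assumed to be pre-sorted by STATE_TRANSITION_PRIORITY_LIST.
    static List<List<String>> rounds(LinkedHashMap<String, List<String>> byPartition) {
        List<List<String>> rounds = new ArrayList<>();
        int round = 0;
        while (true) {
            List<String> current = new ArrayList<>();
            for (List<String> queue : byPartition.values()) {
                if (round < queue.size()) {
                    current.add(queue.get(round)); // one per partition this round
                }
            }
            if (current.isEmpty()) {
                return rounds;
            }
            rounds.add(current);
            round++;
        }
    }

    public static void main(String[] args) {
        LinkedHashMap<String, List<String>> pending = new LinkedHashMap<>();
        // The single MyResource partition (NUM_PARTITIONS=1 in the test):
        // its three transitions serialize, one per round.
        pending.put("MyResource_0",
            List.of("Offline->Slave(node1)", "Master->Slave(node2)", "Slave->Master(node1)"));
        // A hypothetical second partition proceeds in parallel with the first,
        // showing that the limit is per partition, not cluster-wide.
        pending.put("OtherResource_0", List.of("Offline->Slave(node3)"));
        System.out.println(rounds(pending));
    }
}
```

This is why, with a single partition (as in Ming's setup), a per-partition limit of 1 gives the same fully sequential ordering as a cluster-wide limit of 1.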

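Ming's observation about the throttleMap key, which leads to the fix at the top of the thread (removing the INSTANCE constraint), can be illustrated with a self-contained sketch. The key format below is made up for illustration, not the actual MessageThrottleStage key; the point is only the bucketing effect:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ThrottleKeyDemo {

    // Count pending messages per throttle key. If the key embeds the target
    // instance id, messages to different instances fall into different
    // buckets, so a limit intended to be cluster-wide never accumulates
    // across nodes and a CONSTRAINT_VALUE of 1 cannot bind.
    static Map<String, Integer> bucketCounts(List<String> targetInstances,
                                             boolean keyIncludesInstance) {
        Map<String, Integer> counts = new HashMap<>();
        for (String instance : targetInstances) {
            String key = keyIncludesInstance
                ? "MESSAGE_TYPE=STATE_TRANSITION|INSTANCE=" + instance
                : "MESSAGE_TYPE=STATE_TRANSITION";
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> targets = List.of("node1", "node2");
        // instance in the key: two buckets of size 1 -- a limit of 1 never binds
        System.out.println(bucketCounts(targets, true));
        // instance dropped from the key: one bucket of size 2 -- the limit binds
        System.out.println(bucketCounts(targets, false));
    }
}
```

Dropping the INSTANCE attribute from the constraint collapses all state-transition messages into one bucket, which is exactly what a cluster-level limit needs.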