Hi Swaroop, Thanks for your input... In fact your description was my initial impression too... my first PoC used MasterSlave, but afterwards I got some other strong requirements, like all nodes being available to execute writes (redirecting writes to the master isn't very efficient compared to a git pull). Another important thing to consider is that my Git cluster is usually (not always, since you can have external `clones`) consumed `locally` (by a web app that the cluster is deployed alongside)... and the web app's load balancer isn't something that I can control (my typical scenario is mod_cluster with a JBoss application server cluster, and the Git cluster runs inside that JBoss cluster).
Regards, --- Alexandre Porcelli [email protected] On May 21, 2013, at 4:15 AM, Swaroop Jagadish <[email protected]> wrote: > Hello Alexandre, > Based on your description, it looks like the MasterSlave state model is best > suited for your use case. You distribute the different git repositories > evenly across the cluster using "auto rebalance" mode in the state model. > A git repo will be mapped to a Helix resource and, for a given repo, there > is only one node which is the master. Thus, there is only one node which > can write to a given repo. The client uses Helix's external view in order > to determine which node is the master for a given repo (this can be accomplished > using the RoutingTableProvider class). In order to keep the repositories > in sync, whenever a write happens at the master, the master can send a > message to all the slaves to sync their repos. A slave can either reject > any direct writes it receives from the client or forward them to the master > node. > > Let me know if that makes sense. > > Regards, > Swaroop > > On 5/20/13 10:36 AM, "Alexandre Porcelli" <[email protected]> wrote: > >> Hi Kishore, >> >> Let me try to explain my needs with my real-world usage scenario, so >> it will be easier for you to understand. >> >> In a simple form, what I'm doing is a Git cluster (using jgit >> and Apache Helix). External clients can push data to any node of the >> cluster, but in order to keep the cluster synced properly (to >> avoid conflicts) I need to be sure that just one node is writing at a time >> (the single global lock role). Just before the current node `unlocks`, I >> notify all other members of the cluster (using the Messaging API) that they >> must sync (the message indicates which repo was updated). The unlock >> operation releases the lock (so others that may need to update data can do >> so).
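The single-global-lock write protocol described above can be sketched roughly as follows. This is a minimal, in-process simulation, not the actual cluster code: the `ReentrantLock` stands in for holding the Helix LEADER state, and a plain list stands in for the "sync repo X" messages that would really go out over Helix's Messaging API. All names here are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: one writer at a time, and just before releasing
// the lock the writer records which repo the other members must re-sync.
class GlobalLockSketch {
    private final ReentrantLock globalLock = new ReentrantLock();
    // Stand-in for messages sent to peers via the Messaging API.
    private final List<String> peerSyncLog = new ArrayList<>();

    void write(String repo, Runnable gitPush) {
        globalLock.lock();       // become LEADER: the only writer in the cluster
        try {
            gitPush.run();       // apply the push to the local repo
            notifyPeers(repo);   // tell everyone which repo was updated
        } finally {
            globalLock.unlock(); // back to standby: the next writer may proceed
        }
    }

    private void notifyPeers(String repo) {
        peerSyncLog.add("sync:" + repo);
    }

    List<String> sentMessages() {
        return peerSyncLog;
    }
}
```

The key property being simulated is the ordering guarantee: the sync notification is sent while the lock is still held, so no other node can start writing before it has been told what to sync.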
>> My current setup for this uses the "LeaderStandby" model with one >> resource (which I name the git-lock resource, with only one partition, >> git-lock_0). The Leader is the node that holds the lock, and the standby >> queue is formed by nodes that are willing to update data... nodes that >> are not trying to update data aren't in standby (they're offline because the >> partition is disabled). >> Aside from the global lock, when a new node joins the cluster it needs to >> sync all the git repositories - I don't have a fixed list of those repos, >> which is why I need to query the cluster for a list of existing >> repos. This query can be answered by any member of the existing cluster >> (since all of them are synced via the global lock). >> >> Is it clear now? >> >> What I'm wondering is... whether I'm mixing two different things into >> just one (the single global lock and the cluster's git repository list). >> >> Maybe it's worth mentioning that in the near future I plan to get rid of >> the single global lock and have a per-git-repo lock... >> >> >> Again... thanks in advance! >> >> Regards, >> --- >> Alexandre Porcelli >> [email protected] >> >> >> >> >> On May 20, 2013, at 2:05 PM, kishore g <[email protected]> wrote: >> >>> Hi Alex, >>> >>> Let me try to formulate your requirements: >>> >>> 1. Have a global lock; of all nodes, only one needs to be LEADER >>> 2. When new nodes are added, they automatically become STANDBY and sync >>> data with the existing LEADER >>> >>> Both of the above requirements can be satisfied with AUTO_REBALANCE mode. >>> In your original email, you mentioned releasing the lock; can you >>> explain when you want to release the lock? Sorry, I should have asked >>> this earlier. I think this is the requirement that is causing some >>> confusion. Also, in 0.6.1 we have added a feature where you can plug in >>> custom rebalancer logic when the pipeline is run, so you can actually >>> come up with your own custom rebalancing logic.
But it's not documented :( >>> >>> You might be right about using two state models or configuring Helix with >>> a custom state model. But I want to make sure I understand your use case >>> before suggesting that. >>> >>> Thanks, >>> Kishore G >>> >>> >>> >>> On Mon, May 20, 2013 at 9:17 AM, Alexandre Porcelli >>> <[email protected]> wrote: >>> Hi Kishore, >>> >>> Once again, thanks for your support... it has been really valuable. >>> >>> I've been thinking, and I'd like to share my thoughts and ask your (any >>> comments are welcome) opinion. My general need (I think I've >>> already written about it, but here is a small recap) is a single global >>> lock to control data changes and, at the same time, a way to check the current >>> state of a live node in order to be able to sync when a new node joins >>> the cluster. >>> >>> My latest questions about being able to manipulate transitions from the API >>> were about avoiding having a node in offline mode - as moving away from >>> offline is the transition that triggers the sync, and if I disable a >>> resource/node I'm moved to offline automatically (using >>> AUTO_REBALANCE). Kishore showed me how to change my cluster from >>> AUTO_REBALANCE to AUTO so I can have control of those transitions... >>> >>> Now here is what I've been thinking about all of this: it seems that I'm >>> mixing two different things in just one cluster/resource - one is the >>> lock and the other is cluster availability - maybe I just need to have >>> two different resources for that, one for the lock and the other for the >>> real data availability - what do you think? Another thing that comes to my mind >>> is that maybe my need doesn't fit the existing state models, and I'd need >>> to create a new one with my own config. >>> >>> I'd like to hear what you think about it... recommendations? Thoughts, >>> opinions, considerations... anything is welcome.
>>> >>> Regards, >>> --- >>> Alexandre Porcelli >>> [email protected] >>> >>> >>> On May 17, 2013, at 4:40 AM, kishore g <[email protected]> wrote: >>> >>>> Hi Alexandre, >>>> >>>> You can get more control in AUTO mode; you are currently using >>> AUTO_REBALANCE, where Helix decides who should be the leader and where it >>> should be. If you look at the IdealState, it basically looks like this: >>>> p1:[] >>>> >>>> In AUTO mode you set the preference list for each partition, >>>> so you can set something like p1:[n1,n2,n3] >>>> >>>> In this case, if n1 is alive, Helix will make n1 the leader and n2, n3 will >>> be standby. If you want to make someone else the leader, say n2, simply >>> change this to >>>> p1:[n2,n3,n1]. >>>> >>>> Change this line in your code >>>> admin.addResource( clusterName, lockGroupName, 1, "LeaderStandby", >>> IdealStateModeProperty.AUTO_REBALANCE.toString() ); >>>> >>>> admin.rebalance( clusterName, lockGroupName, numInstances ); >>>> >>>> to >>>> >>>> admin.addResource( clusterName, lockGroupName, 1, "LeaderStandby", >>> IdealStateModeProperty.AUTO.toString() ); >>>> >>>> admin.rebalance( clusterName, lockGroupName, numInstances ); >>>> >>>> >>>> // if you want to change the current leader, you can do the >>> following.
>>>> >>>> idealState = admin.getResourceIdealState(clusterName, resourceName); >>>> >>>> List preferenceList; // set the new leader you want as the first entry >>>> >>>> idealState.getRecord().setListField(partitionName, preferenceList); >>>> >>>> admin.setResourceIdealState(clusterName, resourceName, idealState); >>>> >>>> >>>> >>>> Read more about the different execution modes: >>>> >>>> http://helix.incubator.apache.org/Concepts.html and >>>> >>>> http://helix.incubator.apache.org/Features.html >>>> >>>> >>>> Thanks, >>>> >>>> Kishore G >>>> >>>> >>>> >>>> On Thu, May 16, 2013 at 11:09 PM, Alexandre Porcelli >>> <[email protected]> wrote: >>>> Hello all, >>>> >>>> Sorry to revive this thread, but I think I'll have to ask again... >>> is it possible to force, via an API call, a transition from Leader to >>> "Wait" without disabling an instance or partition? The transition from >>> Leader to Offline triggered by the disabled partition is causing me >>> some trouble... >>>> The main problem is that my transition from "Offline" to "Standby" >>> syncs data with the rest of the cluster (an expensive task that should >>> be executed only if that node was really offline; in other words: there >>> was a partition, the node crashed, or whatever). >>>> >>>> I predict that I may need to build my own transition model... not sure >>> (not even sure how to do it and be able to control/expose that >>> transition from Leader to "Wait")... >>>> >>>> Well... any help/suggestion is really welcome! >>>> >>>> Cheers, >>>> --- >>>> Alexandre Porcelli >>>> [email protected] >>>> >>>> On May 2, 2013, at 2:26 PM, Alexandre Porcelli <[email protected]> >>> wrote: >>>> >>>>> Hi Vinayak, >>>>> >>>>> You were right, all my mistake! Disabling the partition works like >>> a charm! Thank you very much.
>>>>> >>>>> Regards, >>>>> --- >>>>> Alexandre Porcelli >>>>> [email protected] >>>>> >>>>> On May 2, 2013, at 1:22 PM, Vinayak Borkar <[email protected]> >>> wrote: >>>>> >>>>>> Looking at the signature of HelixAdmin.enablePartition, I see this: >>>>>> >>>>>> void enablePartition(boolean enabled, >>>>>> String clusterName, >>>>>> String instanceName, >>>>>> String resourceName, >>>>>> List<String> partitionNames); >>>>>> >>>>>> >>>>>> >>>>>> So when you disable the partition, you are doing so only on a >>> particular instance. So my understanding is that the same partition at >>> other instances will participate in an election to come out of standby. >>>>>> >>>>>> Vinayak >>>>>> >>>>>> >>>>>> On 5/2/13 9:14 AM, Alexandre Porcelli wrote: >>>>>>> Hi Vinayak, >>>>>>> >>>>>>> Thanks for your quick answer, but I don't think this would be >>> the case... since the partition `represents` the locked resource, if I >>> disable it no other instance in the cluster will be able to be promoted >>> to Leader (at this point other nodes should be in standby, just waiting >>> to be able to acquire the lock - in other words, become Leader). >>>>>>> Anyway, thanks for your support. >>>>>>> >>>>>>> Cheers, >>>>>>> --- >>>>>>> Alexandre Porcelli >>>>>>> [email protected] >>>>>>> >>>>>>> >>>>>>> On May 2, 2013, at 1:06 PM, Vinayak Borkar <[email protected]> >>> wrote: >>>>>>> >>>>>>>>> >>>>>>>>> 1. I'm using LeaderStandby in order to build a single global >>> lock on my cluster; it works as expected... but in order to release the >>> lock I have to put the current leader in standby... I could achieve this >>> by disabling the current instance. It works, but by doing this I lose (or >>> at least it seems so) the ability to send/receive user-defined messages. >>> I'd like to know if it's possible, via an API call, to force a >>> transition from Leader to Standby without disabling an instance. >>>>>>>> >>>>>>>> I am a newbie to Helix too and I had a similar question a few >>> days ago.
Have you looked into disabling the resource by using the >>> disablePartition() call in HelixAdmin with a partition number of 0? >>> This should disable just the resource without impacting the instance. >>>>>>>> >>>>>>>> Vinayak >>>>>>>> >>>>>>>>> >>>>>>>>> 2. I've been taking a quick look at the Helix codebase, more >>> specifically at the ZooKeeper usage. It seems that you're using ZooKeeper as >>> the default implementation, but the Helix architecture is not tied to it, right? >>> I'm asking this because I'm interested in implementing (in the near future) >>> a different backend (Infinispan). >>>>>>>>> >>>>>>>>> That's it for now... thanks in advance. >>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> --- >>>>>>>>> Alexandre Porcelli >>>>>>>>> [email protected] >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>>> >>> >>> >
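Kishore's AUTO-mode rule earlier in the thread (for a partition with preference list p1:[n1,n2,n3], the first live node in the list becomes LEADER and the rest of the live nodes become STANDBY; rotating the list, e.g. to [n2,n3,n1], is how you move the leader) can be sketched like this. It is an illustrative re-implementation of the stated rule, not Helix's actual rebalancer code, and the class and method names are made up for the example.

```java
import java.util.List;
import java.util.Set;

// Sketch of AUTO-mode leader selection: walk the partition's preference
// list in order and pick the first node that is currently live.
class PreferenceListSketch {
    static String leaderFor(List<String> preferenceList, Set<String> liveNodes) {
        for (String node : preferenceList) {
            if (liveNodes.contains(node)) {
                return node; // first live entry wins the LEADER state
            }
        }
        return null; // no live node in the list: the partition has no leader
    }
}
```

With all of n1, n2, n3 live, the list [n1,n2,n3] yields n1 as leader and the rotated list [n2,n3,n1] yields n2, which is exactly the leader-change trick described above; if n1 dies, [n1,n2,n3] falls through to n2.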
