Thank you for your advice, Santiago. That is certainly part of the design as well.
Best,
Lance

On Fri, Jun 14, 2013 at 5:32 PM, Santiago Perez <[email protected]> wrote:

> Helix user here (not developer), so take my words with a grain of salt.
>
> Regarding 7: you might want to consider the behavior of the node.js
> instance if that instance loses its connection to zk. You'll probably want
> to kill it too; otherwise you could miss the case where only the node.js
> process, and not the JVM, lost the connection.
>
> Regards,
> Santiago
>
>
> On Fri, Jun 14, 2013 at 6:30 PM, Lance Co Ting Keh <[email protected]> wrote:
>
>> We have a working prototype of basically something like #2 you proposed
>> above. We're using the standard Helix participant, and in the @Transition
>> handlers of the state model we send commands to node.js via HTTP.
>>
>> I want to run you through our general architecture to make sure we are
>> not violating anything on the Helix side. As a reminder, what we need to
>> guarantee is that at any given time one and only one node.js process is
>> in charge of a task.
>>
>> 1. A machine with N cores will have N (pending testing) node.js
>> processes running.
>> 2. Associated with each of the N node processes are also N Helix
>> participants (separate JVM instances -- the reason for this comes later).
>> 3. A separate Helix controller will be running on the machine and will
>> just leader-elect between machines.
>> 4. The spectator router will likely be HAProxy (non-Java), so the Linux
>> box will run a JVM to serve as the Helix spectator.
>> 5. The state machine for each will simply be the OnlineOffline model.
>> (However, I do get error messages saying that I haven't defined an
>> OFFLINE to DROPPED transition. I was going to ask you about this, but it
>> is a minor detail compared to the rest of the architecture.)
>> 6. A simple Bash script will serve as a watchdog on each node.js and
>> Helix participant pair. If either of the two is "dead", the other process
>> must immediately be SIGKILLed -- hence the need for one JVM serving as
>> Helix participant for every node.js process.
>> 7.
Each node.js instance sets a watch on /LIVEINSTANCES directly in
>> ZooKeeper as an extra safety blanket. If it finds that it is NOT among
>> the live instances, it likely means that its JVM participant lost its
>> connection to ZooKeeper but the process is still running, so the bash
>> script has not terminated the node server. In this case the node server
>> must end its own process.
>>
>> Thank you for all your help.
>>
>> Sincerely,
>> Lance
>>
>>
>> On Wed, Jun 12, 2013 at 9:07 PM, kishore g <[email protected]> wrote:
>>
>>> Hi Lance,
>>>
>>> Thanks for your interest in Helix. There are two possible approaches.
>>>
>>> 1. Similar to what you suggested: write a Helix participant in a
>>> non-JVM language, which in your case is node.js. There seem to be quite
>>> a few node.js implementations that can interact with ZooKeeper. A Helix
>>> participant does the following (you got it right, but I am providing the
>>> right sequence):
>>>
>>>    1. Creates an ephemeral node under LIVEINSTANCES.
>>>    2. Watches the /INSTANCES/<PARTICIPANT_NAME>/MESSAGES node for
>>>    transitions.
>>>    3. After a transition is completed, updates
>>>    /INSTANCES/<PARTICIPANT_NAME>/CURRENTSTATE.
>>>
>>> The controller does most of the heavy lifting of ensuring that these
>>> transitions lead to the desired configuration. It's quite easy to
>>> re-implement this in any other language; the most difficult part would
>>> be the ZooKeeper binding. We have used the Java bindings and they are
>>> solid. This is at a very high level; there are some more details I have
>>> left out, like handling connection loss/session expiry, that will
>>> require some thinking.
>>>
>>>
>>> 2. The other option is to use the Helix agent as a proxy: we added the
>>> Helix agent as part of 0.6.1; we haven't documented it yet. Here is the
>>> gist of what it does. Think of it as a generic state transition handler.
>>> You can configure Helix to run a specific system command as part of each
>>> transition.
The Helix agent is a separate process that runs alongside your
>>> actual process. Instead of the actual process receiving the transition,
>>> the Helix agent receives it. As part of this transition the Helix agent
>>> can invoke APIs on the actual process via RPC, HTTP, etc. The Helix
>>> agent simply acts as a proxy to the actual process.
>>>
>>> I have another approach and will try to write it up tonight, but before
>>> that I have a few questions:
>>>
>>>    1. How many node.js servers run on each node: one, or more than one?
>>>    2. Is the spectator/router Java or non-Java based?
>>>    3. Can you provide more details about your state machine?
>>>
>>> thanks,
>>> Kishore G
>>>
>>>
>>> On Wed, Jun 12, 2013 at 11:07 AM, Lance Co Ting Keh <[email protected]> wrote:
>>>
>>>> Hi, my name is Lance Co Ting Keh and I work at Box. You guys did a
>>>> tremendous job with Helix. We are looking to use it to manage a cluster
>>>> primarily running node.js. Our model for using Helix would be to have
>>>> node.js or some other non-JVM library be *Participants*, a router be a
>>>> *Spectator*, and another set of machines serve as the *Controllers*
>>>> (pending testing, we may just run master-slave controllers on the same
>>>> instances as the Participants). The participants will interact with
>>>> ZooKeeper in two ways: one is to receive Helix state transition
>>>> messages through the instance of the HelixManager <Participant>, and
>>>> the other is to interact with ZooKeeper directly just to maintain
>>>> ephemeral nodes within /INSTANCES. Maintaining ephemeral nodes directly
>>>> in ZooKeeper would be done instead of using InstanceConfig and calling
>>>> addInstance on HelixAdmin because of the basic health checking baked
>>>> into maintaining ephemeral nodes. Otherwise we would have to write a
>>>> health checker between node.js and the JVM running the Participant. Are
>>>> there better alternatives for non-JVM Helix participants?
I corresponded with Kishore briefly, and he mentioned Helix
>>>> agents, specifically the ProcessMonitorThread that came out in the
>>>> last release.
>>>>
>>>> Thank you very much!
>>>>
>>>> Lance Co Ting Keh
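
For reference, the watchdog pairing Lance describes above (one bash script
per node.js/Helix-participant pair that SIGKILLs the survivor when either
process dies) could be sketched roughly as follows. This is only a minimal
sketch of the idea from the thread: how the two PIDs are discovered, and
the one-second poll interval, are illustrative assumptions rather than
details from the original design.

```shell
#!/usr/bin/env bash
# Watchdog sketch for one node.js / Helix-participant pair.
# The PIDs are assumed to be supplied by whatever launches the pair;
# PID discovery is deployment-specific and not covered here.

is_alive() {
  # "kill -0" sends no signal; it only checks that the PID still exists.
  kill -0 "$1" 2>/dev/null
}

watch_pair() {
  local node_pid=$1 helix_pid=$2
  # Poll until either member of the pair disappears.
  while is_alive "$node_pid" && is_alive "$helix_pid"; do
    sleep 1
  done
  # One of the pair died: SIGKILL whatever is left so the Helix
  # controller can reassign the task to a healthy pair.
  kill -9 "$node_pid" "$helix_pid" 2>/dev/null || true
}

# Example usage (PIDs are placeholders):
#   watch_pair "$NODE_PID" "$HELIX_PID" &
```

A side effect of this design is that a lost ZooKeeper session on the JVM
side does not by itself trip the watchdog, which is exactly why the thread
adds the separate /LIVEINSTANCES self-check in the node.js process.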
