Re: helix rebalancing for multiple resources

kishore g Fri, 03 Jan 2014 08:33:40 -0800

Hi Vu,

Currently, Helix cannot accept an externally created ZooKeeper client.
It's possible to add that feature, but I need to think more about how to
handle ZooKeeper state changes such as disconnect/connect and session expiry.


It looks like getting the ZK hosts/ports from your platform and passing them
to Helix is a possible option for now. Meanwhile, we will look into what it
takes to accept a ZooKeeper client as input.
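
A minimal sketch of that interim approach, assuming your platform's
discovery library can return a list of host/port pairs. The HostPort type
and the ZkConnectString helper are hypothetical names, not part of Helix or
of any platform library. It only builds the comma-separated host:port string
that ZooKeeper clients accept as a connect string, which could then be
handed to Helix as the ZooKeeper address:

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: not part of Helix or of any platform library.
final class ZkConnectString {
    private ZkConnectString() {}

    /** One discovered ZooKeeper server; this shape is an assumption. */
    static final class HostPort {
        final String host;
        final int port;

        HostPort(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    /** Joins servers into the "host1:port1,host2:port2" connect-string form. */
    static String from(List<HostPort> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("no ZooKeeper servers discovered");
        }
        return servers.stream()
                .map(s -> s.host + ":" + s.port)
                .collect(Collectors.joining(","));
    }
}
```

The drawback Vu points out still applies: the string is a snapshot, so if
the platform changes the server list later, the application has to notice
and reconnect on its own.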

Regarding the rebalancing for multiple resources, of the options Kanak
provided, start with #2 and then implement #1 using a USER_DEFINED
rebalancer. This functionality is generic enough that we can provide a
default implementation in Helix, or if you implement one we can add it to
helix-core.

Let us know if you need help on implementing a rebalancer that works across
resources.
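
As a starting point, here is a rough sketch of the assignment logic such a
cross-resource rebalancer might use. It is plain Java with no Helix
dependency, and every name in it (OneTaskPerNodeAssigner, assign, the task
and node strings) is hypothetical. It assumes all tasks from all resources
are handled in one pass, so that no node ever receives more than one task,
and it keeps existing placements where the node is still alive:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from Helix.
final class OneTaskPerNodeAssigner {
    private OneTaskPerNodeAssigner() {}

    /**
     * Assigns each task to a distinct live node.
     *
     * @param tasks     every task across all resources, e.g. "A_0", "A_1", "B"
     * @param liveNodes nodes currently in the cluster
     * @param previous  the last assignment (task -> node); may be empty
     * @return task -> node; tasks that could not be placed are left out
     */
    static Map<String, String> assign(List<String> tasks,
                                      List<String> liveNodes,
                                      Map<String, String> previous) {
        Map<String, String> result = new LinkedHashMap<>();
        Set<String> free = new LinkedHashSet<>(liveNodes);

        // Pass 1: keep a previous placement if its node is live and unclaimed.
        for (String task : tasks) {
            String node = previous.get(task);
            if (node != null && free.remove(node)) {
                result.put(task, node);
            }
        }

        // Pass 2: hand each remaining task the next free node, in order.
        Iterator<String> nextFree = free.iterator();
        for (String task : tasks) {
            if (!result.containsKey(task) && nextFree.hasNext()) {
                result.put(task, nextFree.next());
            }
        }
        return result;
    }
}
```

Because the whole mapping is computed in a single call, there is no window
in which task A and task B are placed independently on the same node. Tasks
left out of the result are the ones for which no free node exists.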

Another question: what is the expected behavior when a node fails? Will you
have standby nodes to pick up the task, or will it be assigned to a node
that is already running another task?
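
To make the two choices concrete, here is a small sketch in plain Java. The
names FailoverPolicySketch, Policy and pickReplacement are hypothetical and
nothing here is Helix API. It assumes you can count how many tasks each
surviving node is running:

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the two failover choices; not Helix API.
final class FailoverPolicySketch {
    private FailoverPolicySketch() {}

    enum Policy {
        STANDBY_ONLY,      // only an idle node may take over the task
        ALLOW_DOUBLING_UP  // a node already running a task may take a second
    }

    /**
     * Picks the node that should take over a failed node's task.
     *
     * @param taskCountByNode surviving nodes -> number of tasks each one runs
     * @return the least-loaded eligible node, or empty if none qualifies
     */
    static Optional<String> pickReplacement(Map<String, Integer> taskCountByNode,
                                            Policy policy) {
        String best = null;
        int bestCount = Integer.MAX_VALUE;
        for (Map.Entry<String, Integer> entry : taskCountByNode.entrySet()) {
            int count = entry.getValue();
            if (policy == Policy.STANDBY_ONLY && count > 0) {
                continue; // busy nodes are not eligible under this policy
            }
            if (count < bestCount) {
                best = entry.getKey();
                bestCount = count;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

Under STANDBY_ONLY the task stays unassigned when no idle node is left;
under ALLOW_DOUBLING_UP an idle node is still preferred because it has the
lowest count, but a busy node is used as a fallback.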

thanks,
Kishore G


On Fri, Jan 3, 2014 at 12:14 AM, Vu Nguyen <[email protected]> wrote:

> The main issue is that we already have an infrastructure here for
> ZooKeeper that has a separate mechanism for clients to discover the ZK
> server hosts.  That's provided by our platform team.  So client
> applications don't actually provide the ZooKeeper hosts at this point.  I
> likely could get access to that information somehow, though.  However, I
> would prefer to re-use what our platform team provides in case they make
> any modifications to how hosts are discovered.
>
> By using our platform libraries, we get a ZooKeeper client that's ready to
> use directly.  I was thinking that we could get Helix to use this for any
> ZooKeeper operations.  If we get disconnected from ZooKeeper, the discovery
> mechanism would be re-used automatically for reconnecting without requiring
> us to explicitly provide the hosts/ports.
>
> Thanks
>
> Vu
>
>
>
>
>
>
> On Wed, Jan 1, 2014 at 9:26 PM, Kanak Biscuitwala <[email protected]> wrote:
>
>> Not sure I follow. Is your problem that Helix creates the cluster as a
>> child of the root node (e.g. /clusterName) while you would like it to be
>> something else (e.g. /path/to/custom/root/clusterName)?
>>
>> I'm also unclear about what you mean about discovering ZK servers. How
>> would you be able to leverage a path in ZK to discover ZK?
>>
>> Right now Helix requires long-running ZK servers and assumes that you as
>> the application know how to connect to them (i.e. you know the
>> hosts/ports). If that assumption holds, I believe it should work
>> independent of deployment (cloud provider, private datacenter, or anything
>> else).
>>
>> I'm not really sure what you're trying to adapt with the adapter. Could
>> you clarify?
>>
>> I'm on #apachehelix on freenode if that's more convenient.
>>
>> Thanks,
>> Kanak
>> ------------------------------
>> Date: Wed, 1 Jan 2014 21:07:36 -0800
>> Subject: Re: helix rebalancing for multiple resources
>> From: [email protected]
>> To: [email protected]
>> CC: [email protected]
>>
>>
>> Yes, that is helpful.
>>
>> Another big requirement that I forgot to mention is running this on a
>> cloud service provider, like AWS.  We already have shared zookeeper setup
>> there with our own client.  Ideally, I could inject a custom client for
>> helix to use for operations, where the main differences we would require are
>> a custom top level path (/appname) that is required by our client, and that
>> the client would handle discovering and connecting to the zookeeper servers.
>>
>> Is support for AWS and other cloud providers on the roadmap?
>>
>> Also, for the short-term, do you see any complications in us creating an
>> adapter client that helix would use to bridge that gap?  Or would it be
>> much more complicated than I am hoping for?
>>
>> Thanks
>>
>> Vu
>>
>>
>>
>>
>>
>>
>> On Wed, Jan 1, 2014 at 8:36 PM, Kanak Biscuitwala <[email protected]> wrote:
>>
>> Resending since I realized you might not be registered on the user list
>> yet. By the way, for your specific use case, I would personally lean
>> towards the CustomCodeRunner along with the CUSTOMIZED IdealState rebalance
>> mode. Then when nodes enter and exit, you can change the IdealState
>> yourself and Helix will fire the transitions. This will most easily give
>> you the policy-driven global view you're looking for.
>>
>> ---
>>
>> Hi Vu,
>>
>> Your understanding is basically correct. The controller will rebalance
>> each resource in sequence, at most one controller pipeline execution is
>> going on at any one time, and there is no parallelism within the controller
>> pipeline (other than batch reading and writing the cluster at the beginning
>> and end).
>>
>> Here are some things that may be of use to know:
>>
>> 1. You can plug in your own code to help decide how to rebalance your
>> cluster in one of two ways:
>>    - Using the CustomCodeRunner on the participant side so that you can
>> update the IdealState whenever the cluster changes:
>> https://github.com/apache/incubator-helix/blob/helix-0.6.2-release/helix-core/src/main/java/org/apache/helix/participant/HelixCustomCodeRunner.java?source=c
>>    - Implementing a Rebalancer with USER_DEFINED rebalance mode:
>> https://github.com/apache/incubator-helix/blob/helix-0.6.2-release/helix-core/src/main/java/org/apache/helix/controller/rebalancer/Rebalancer.java?source=c
>>
>> In either case, Helix will still fire transitions according to
>> constraints and react to node entry/exit.
>>
>> 2. Helix supports adding tags to nodes (via InstanceConfig), and
>> specifying tags in each resource IdealState. Then, a tagged resource will
>> only be assigned to nodes with the corresponding tag present.
>>
>> 3. You can specify max partitions per resource per node in the IdealState
>> of the resource (this should be 1 in your case)
>>
>> 4. You can combine any of the above 3 if that makes sense (e.g. change
>> node tags whenever a cluster change happens, thus constraining how Helix
>> will assign everything)
>>
>> Is that helpful?
>>
>> Kanak
>> ------------------------------
>> Date: Wed, 1 Jan 2014 20:31:56 -0800
>> Subject: helix rebalancing for multiple resources
>> From: [email protected]
>> To: [email protected]
>>
>>
>> Hi,
>> We're looking into creating something like a distributed task processing
>> cluster.  We already have existing code for the processing task on a single
>> host.  So that results in stronger restrictions on what we're doing:
>> - partitioned task A: single partition needs to be assigned to a single
>> node and a node may have only a single partitioned task
>> - another set of non-partitioned tasks (e.g. B, C, D) also needs to be
>> assigned nodes, but it would be most efficient if those tasks are assigned
>> to separate nodes so any single node has at most 1 task (either partitioned
>> A, B, C, D, etc.)
>>
>> This seems to require a global view of the tasks.  However, from the
>> examples and the Rebalancer code, it appears that the resource
>> mappings/assignments are independent of one another.  Is that correct?  If
>> so, is Apache Helix the right framework for us, given the requirements
>> above?
>>
>> I saw that it might be possible to find the current resource assignment
>> for other resources during the rebalancing calculation methods, but I was
>> then concerned about concurrency issues--if the rebalance for task A and
>> the rebalance for B were computed at the same time.
>>
>> Thanks for any and all feedback.
>>
>> Vu Nguyen
>>
>>
>>
>
