Hi Kishore,
I think these discussions give me a great starting point to begin
integration with Helix. Thanks for taking the time to answer all my
questions.
Vinayak
On 2/28/13 11:40 PM, kishore g wrote:
Hi Vinayak,
This is how you can make the operation idempotent: irrespective of 1), 2),
or 3), you always use the ClusterMessagingService first to check whether any
node already has the data, and if so you can pull it from them. If you don't
get a response from the other nodes in the system, then you fall back to 2)
and use the data you already have; otherwise you assume you are creating the
resource for the first time. The search system at LinkedIn used to do this
earlier, but they no longer have this requirement since they assume the
indexes are available when the node starts. I might be wrong here, but
anyhow you get the idea.
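A minimal sketch of that check order (the names here are illustrative; in a
real participant the peer check would be a ClusterMessagingService
request/reply round, not a boolean):

```java
public class BootstrapDecision {
    enum Source { PEER, LOCAL_DATA, FRESH_CREATE }

    // Decide where to bootstrap a partition from, in the order described
    // above: ask peers first, fall back to local data, else create fresh.
    // In practice peerHasData would come from a messaging round-trip and
    // localArtifactsExist from a check of the local disk.
    static Source decide(boolean peerHasData, boolean localArtifactsExist) {
        if (peerHasData) {
            return Source.PEER;          // pull the data from a node that has it
        }
        if (localArtifactsExist) {
            return Source.LOCAL_DATA;    // case 2: restart, reuse what we have
        }
        return Source.FRESH_CREATE;      // case 1: created for the first time
    }

    public static void main(String[] args) {
        System.out.println(decide(true, true));    // PEER
        System.out.println(decide(false, true));   // LOCAL_DATA
        System.out.println(decide(false, false));  // FRESH_CREATE
    }
}
```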
The messaging infrastructure is generic enough, but it is not intended to be
an RPC mechanism between nodes. The nodes communicate via ZooKeeper, so it
can't really be used to achieve high throughput or low latency. The reason
for this design is that it allows messages to be persisted across node
restarts, to ensure every node has processed the message. In cases where you
don't need such persistence, it is possible to extend the messaging service
to do RPC between nodes.
Shirshanka is developing a Helix container module based on Netty that might
allow one to extend the messaging service to do RPC within the cluster. It
should also be useful for transferring files between nodes efficiently,
without pulling the data into application memory, by using the sendfile API.
It's actually a good utility and I can see it being used in multiple
systems.
Thanks,
Kishore G
On Thu, Feb 28, 2013 at 9:44 AM, Vinayak Borkar <[email protected]> wrote:
Will this address your problem? We don't have distinct actions based on
ERROR codes that the controller would understand and act on differently.
Were you looking for something like that?
I will need to think more about this. I think the retry mechanism might be
good enough for now.
Good point on not differentiating whether the partition once existed versus
being newly created. We actually plan to modify the drop notification
behavior; Jason and Terence are discussing this in another thread. Please
add your suggestion to that thread. We should probably have create and drop
methods (not transitions) on the participants.
Currently, how do other systems that use Helix handle the bootstrapping
process? When a resource is created for the first time, the actions of a
participant are different as compared to other times when a resource
partition is expanded to use another instance. Specifically, there are
three cases that need to be handled with respect to bootstrapping:
1. A cluster is up and running, and a new resource is created and
rebalanced.
2. A cluster that had resources is being started after being shut down.
3. A cluster is running and a resource is already laid out on the cluster.
Then some partitions are moved to instances that previously did not have
any partitions of that resource.
I looked through the examples and found the ClusterMessagingService
interface, which can be used to send messages to instances in the cluster. I
can see that case 3 can be handled by using the messaging infrastructure.
However, in both cases 1 and 2 the resource partitions will start in the
OFFLINE state. The messaging API cannot help there, because for a given
resource all instances in the cluster are in the same boat in cases 1 and 2.
So what is the preferred way to know whether you are in case 1 or in case 2?
One way I can see is that if you have local artifacts matching the
partitions that are transitioning from OFFLINE -> SLAVE, you could infer it
is case 2. Is that how other systems solve this issue?
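To make the inference concrete, here is a minimal sketch of that
local-artifact check (the directory layout <dataDir>/<resource>/<partition>
is a made-up example, not anything Helix prescribes):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class PartitionBootstrapCheck {
    // Infer whether an OFFLINE -> SLAVE transition is a restart (case 2) or
    // a first-time creation (case 1) by looking for local artifacts of the
    // partition. The <dataDir>/<resource>/<partition> layout is hypothetical.
    static boolean isRestart(Path dataDir, String resource, String partition) {
        return Files.isDirectory(dataDir.resolve(resource).resolve(partition));
    }

    public static void main(String[] args) throws IOException {
        Path dataDir = Files.createTempDirectory("data");
        System.out.println(isRestart(dataDir, "MyDB", "MyDB_0"));  // false: case 1
        Files.createDirectories(dataDir.resolve("MyDB").resolve("MyDB_0"));
        System.out.println(isRestart(dataDir, "MyDB", "MyDB_0"));  // true: case 2
    }
}
```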
On a separate note, is the messaging infrastructure general purpose? That
is, can it be used by applications to perform RPC within the cluster,
obviating the need for a separate RPC mechanism like Avro? I can see that
the handler would need more code than one would write to get RPC working
with Avro, but my question is about the design point of the messaging
infrastructure.
Thanks,
Vinayak