Will this address your problem? We don't have distinct actions based on
ERROR codes that the controller would understand and act on differently.
Were you looking for something like that?
I will need to think more about this. I think the retry mechanism might
be good enough for now.
Good point on not differentiating whether a partition existed before
versus being newly created. We actually plan to modify the
drop-notification behavior; Jason/Terence are discussing this in another
thread. Please add your suggestion to that thread. We should probably
have create and drop methods (not transitions) on the participants.
Currently, how do other systems that use Helix handle the bootstrapping
process? When a resource is created for the first time, a participant's
actions differ from those taken later, when a resource partition is
expanded onto another instance. Specifically, there are three cases that
need to be handled with respect to bootstrapping:
1. A cluster is up and running, and a new resource is created and
rebalanced.
2. A cluster that had resources is being started after being shut down.
3. A cluster is running and a resource is already laid out on the
cluster. Then some partitions are moved to instances that previously did
not have any partitions of that resource.
I looked through the examples and found the ClusterMessagingService
interface that can be used to send messages to instances in the cluster.
I can see that case 3 can be handled by using the messaging infrastructure.
However, in both cases 1 and 2 the resource partitions start in the
OFFLINE state. The messaging API cannot help here because, for a given
resource, all instances in the cluster are in the same boat in cases 1
and 2. So what is the preferred way to know whether you are in case 1 or
case 2? One way I see: if you have local artifacts matching the
partitions that are transitioning from OFFLINE to SLAVE, you could
infer it is case 2. Is that how other systems solve this issue?
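To make the idea concrete, here is a minimal sketch of the check I have in mind. The data-directory layout (one subdirectory per partition) and the class/method names are my own assumptions for illustration, not part of the Helix API; the real check would run inside the OFFLINE-to-SLAVE transition callback.

```java
import java.io.File;

public class BootstrapDetector {
    // Hypothetical local storage root: one subdirectory per partition.
    private final File dataDir;

    public BootstrapDetector(File dataDir) {
        this.dataDir = dataDir;
    }

    // Intended to be consulted during the OFFLINE -> SLAVE transition:
    // if local artifacts for this partition already exist, assume a
    // restart (case 2); otherwise assume a fresh bootstrap (case 1).
    public boolean isRestart(String partitionName) {
        File partitionDir = new File(dataDir, partitionName);
        String[] contents = partitionDir.list();
        return partitionDir.isDirectory() && contents != null && contents.length > 0;
    }
}
```

The obvious caveat is that local artifacts can be stale or partially written, so a production version would probably also validate the artifacts against some metadata rather than trusting directory existence alone.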
On a separate note, is the messaging infrastructure general purpose?
That is, can applications use it to perform RPC within the cluster,
obviating the need for a separate RPC mechanism like Avro? I can see
that the handler would need more code than one would write to get RPC
working with Avro, but my question is about the design intent of the
messaging infrastructure.
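To clarify what I mean by "more code than Avro": with a generic messaging layer the application routes messages to handlers and encodes/decodes payloads itself, which is roughly the boilerplate an RPC stack generates for you. A toy version of that hand-written dispatch (the class and method names are my own illustration, not the Helix API):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy request dispatcher: the application registers a handler per message
// type and routes incoming payloads to it, returning a reply payload.
public class MessageDispatcher {
    private final Map<String, Function<String, String>> handlers = new HashMap<>();

    public void register(String messageType, Function<String, String> handler) {
        handlers.put(messageType, handler);
    }

    public String dispatch(String messageType, String payload) {
        Function<String, String> handler = handlers.get(messageType);
        if (handler == null) {
            throw new IllegalArgumentException("no handler for " + messageType);
        }
        return handler.apply(payload);
    }
}
```

An RPC layer would additionally generate typed stubs and handle serialization, which is exactly the part left to the application here.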
Thanks,
Vinayak