Hi Erick,

Sorry for the late response. I also like the second option best, implemented along the lines of your third API idea, although it might take more time before it addresses your immediate needs. Passing the ZooKeeper connection string (zkstr) is a good starting point, as I believe that the work being done could be ported into Twill later.
Terence

Sent from my iPhone

> On Apr 24, 2014, at 8:44 AM, Erick Tryzelaar <[email protected]> wrote:
>
> Good morning all,
>
> I wanted to run an idea by you all. I'm currently working on using Twill to
> schedule GraphLab (http://graphlab.org), which is a distributed graph
> analytics package written in C++. They currently use MPI, but it's only to
> coordinate the launch of a cluster, so it should be comparatively easy to
> migrate them over to YARN and Twill. In order to do this, I would like to
> add to Twill some mechanism to allow me to request:
>
> 1. Request X containers
> 2. Wait for the first container to be assigned to me.
> 3. Wait Y seconds for the rest of the containers to be assigned to me.
> 4. If the number of containers allocated equals X continue, otherwise
>    release my containers and go to step 1.
>
> I can think of two main ways to implement this, one inside Twill itself,
> and one inside the application.
>
> 1. Modify `YarnAMClient.doRun` to block launching the processes until all
>    the containers have been allocated.
> 2. Add some sort of distributed barrier that the application could use to
>    block until all the containers have been allocated.
>
> I'm leaning towards the second option, as Zookeeper and Curator already
> implement distributed barriers.
>
> so all that's left is figuring out what's the right API to expose this. I
> have a couple ideas for this:
>
> 1. Pass the Zookeeper connection string to the `TwillRunnable`. This would
>    be simplest as I wouldn't have to modify Twill, but then I would have a
>    redundant connection to Zookeeper.
> 2. Expose the Zookeeper client to `TwillContext`. This would be simpler,
>    but then we'd be tightly coupling the Twill API to only work with Zookeeper.
> 3. Draw inspiration from service discovery and add a
>    `SynchronizationService`, a `Barrier` interface, and a
>    `TwillContext.createBarrier(String)` method. It would use Zookeeper or
>    Curator under the covers. This would be a bit more work, but could be
>    useful for a lot of other applications. It also would be a nice place to
>    put other synchronization primitives.
>
> My plan right now is to start off with passing the Zookeeper connection
> string to the `TwillRunnable`. Once I get that working I'd like to try to
> implement the `SynchronizationService`. Does this sound like a good plan,
> or would any of you suggest a better approach for implementing this?
>
> Thanks,
> Erick
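
For illustration only, the four-step request-and-retry loop from the quoted message could be sketched in plain Java roughly as follows. Nothing here is a Twill or YARN API: `ContainerSource`, `allocateAll`, and `ScriptedSource` are hypothetical names made up for this sketch, the one-second poll interval in step 2 is an arbitrary choice, and a real version would sit on top of whatever actually hands out containers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class GangAllocation {

    /** Hypothetical stand-in for whatever hands out containers; not a Twill or YARN API. */
    public interface ContainerSource {
        void request(int count);
        /** Returns the next container id, or null if none arrived within the timeout. */
        String poll(long timeoutMillis) throws InterruptedException;
        void release(List<String> containers);
    }

    /** All-or-nothing allocation: returns exactly {@code wanted} containers or throws. */
    public static List<String> allocateAll(ContainerSource source, int wanted,
                                           long windowMillis, int maxAttempts) {
        try {
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                source.request(wanted);                        // step 1: request X containers
                List<String> granted = new ArrayList<>();

                String first;
                do {                                           // step 2: block until the first arrives
                    first = source.poll(1000);
                } while (first == null);
                granted.add(first);

                // Step 3: collect the rest until the window of Y milliseconds closes.
                long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(windowMillis);
                while (granted.size() < wanted) {
                    long remaining = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                    if (remaining <= 0) break;
                    String next = source.poll(remaining);
                    if (next == null) break;                   // timed out waiting
                    granted.add(next);
                }

                if (granted.size() == wanted) return granted;  // step 4: got everything
                source.release(granted);                       // otherwise release and retry
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for containers", e);
        }
        throw new IllegalStateException(
            "could not allocate " + wanted + " containers in " + maxAttempts + " attempts");
    }

    /** In-memory source for demos: grants a scripted number of containers per attempt (each >= 1). */
    public static class ScriptedSource implements ContainerSource {
        private final int[] grantsPerAttempt;
        private int attempt = -1;
        private int available = 0;
        private int nextId = 0;
        public int released = 0;

        public ScriptedSource(int... grantsPerAttempt) {
            this.grantsPerAttempt = grantsPerAttempt;
        }

        @Override public void request(int count) {
            attempt++;
            // Once the script runs out, grant everything that was asked for.
            available = attempt < grantsPerAttempt.length ? grantsPerAttempt[attempt] : count;
        }

        @Override public String poll(long timeoutMillis) {
            if (available == 0) return null;                   // simulates a timeout without sleeping
            available--;
            return "container-" + nextId++;
        }

        @Override public void release(List<String> containers) {
            released += containers.size();
        }
    }

    public static void main(String[] args) {
        // First attempt yields only 2 of 3 containers, second attempt yields all 3.
        ScriptedSource source = new ScriptedSource(2, 3);
        List<String> containers = allocateAll(source, 3, 5000, 5);
        System.out.println(containers + " after releasing " + source.released);
    }
}
```

The loop only covers acquiring the containers. Holding the application processes back until every container is up would still need the distributed barrier discussed above, for example one built on ZooKeeper or Curator.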
