Hi Keith,

We tried to use Twill to launch PrestoDB on YARN. Presto has a lot of dependencies of its own, and some of them are not version compatible with the ones Twill uses. We created a bundle jar launcher to solve the problem (see https://github.com/apache/incubator-twill/blob/master/twill-examples/yarn/src/main/java/org/apache/twill/example/yarn/BundledJarExample.java).

Basically, we use a TwillRunnable that creates a ClassLoader purely from the application jar (in bundle jar format) and loads the application class from it to launch the application. Maybe it is something you can expand on to fit your needs. Thoughts?
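
To make the classloader idea concrete, here is a rough sketch in plain Java. It is not the actual Twill bundle jar code. The class name IsolatedJarLauncher and its methods are made up for illustration, and it assumes a plain jar with a static main method rather than the bundle jar format. The point is only that the loader's parent is the JDK loader, so nothing from the launcher's own classpath (Twill, YARN, logback) is visible to the application.

```java
import java.io.File;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Twill.
class IsolatedJarLauncher {

    // Builds a loader that sees only the given URLs plus the JDK's own classes.
    // The parent is the system loader's parent (extension/platform loader),
    // so the launcher's application classpath is not visible through it.
    static URLClassLoader isolatedLoader(URL[] urls) {
        ClassLoader jdkOnly = ClassLoader.getSystemClassLoader().getParent();
        return new URLClassLoader(urls, jdkOnly);
    }

    // Loads mainClassName from appJar and invokes its static main(String[]).
    static void launch(File appJar, String mainClassName, String[] args) throws Exception {
        URL[] urls = { appJar.toURI().toURL() };
        // The loader is deliberately left open: the application may keep
        // loading classes lazily, possibly from threads it starts itself.
        URLClassLoader loader = isolatedLoader(urls);

        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        // Many libraries look classes up through the context class loader.
        current.setContextClassLoader(loader);
        try {
            Class<?> mainClass = loader.loadClass(mainClassName);
            Method main = mainClass.getMethod("main", String[].class);
            // Cast to Object so the array is passed as one argument.
            main.invoke(null, (Object) args);
        } finally {
            current.setContextClassLoader(previous);
        }
    }

    public static void main(String[] argv) throws Exception {
        if (argv.length < 2) {
            System.err.println("usage: IsolatedJarLauncher <app.jar> <main-class> [args...]");
            return;
        }
        launch(new File(argv[0]), argv[1], Arrays.copyOfRange(argv, 2, argv.length));
    }
}
```

In a TwillRunnable, something like launch(...) would be called from run(), with the jar path and main class name passed in as arguments. Only the thin launcher would then share a classpath with Twill.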

Terence

On Sun, Mar 15, 2015 at 8:31 AM, Keith Turner <[email protected]> wrote:

> On Sun, Mar 15, 2015 at 9:31 AM, Steve Loughran <[email protected]> wrote:
>
>> > On 3 Mar 2015, at 21:14, Keith Turner <[email protected]> wrote:
>> >
>> > But that is not what prompted this discussion. Fluo depends on Accumulo
>> > and Hadoop. Currently Fluo uses maven to build its complete runtime
>> > classpath (w/ maven its easy to exclude things like log4j). This is
>> > problematic in the case where the user builds Fluo with version X of hadoop
>> > and has version Y running on their cluster. I am looking into making the
>> > fluo scripts build the runtime classpath using the installed software, with
>> > something like the following.
>> >
>> > FLUO_CLASSPATH=$FLUO_HOME/lib/*:$ACCUMULO_HOME/lib/*:`hadoop classpath`
>> >
>> > Using this method `hadoop classpath` brings in log4j and slf4j-log4j which
>> > makes slf4j unhappy, because twill brings in logback slf4j bindings.
>>
>> The trend in YARN apps is to distribute their entire set of dependencies,
>> pulling in only the hadoop conf dirs to their classpath.
>
> Thats what I would like to do, I have not had time to look into the
> particulars. Is it possible launch an application using twill thats
> completely independent of twill/yarn dependencies? Of course the code
> doing the launching will depend on Twill/yarn and thats fine, Fluo has a
> separate maven module (the cluster module) w/ its own deps for launching.
> The Fluo core module has no deps twill/yarn.
>
>> There's mixed benefits here
>>
>> good:
>> -isolation of dependencies
>> -100% confidence your hadoop APs are in sync
>> -works in clusters in which the nodes do rolling upgrades & different
>> parts of the cluster can be running different versions of hadoop at the
>> same time.
>>
>> bad:
>> -more stuff to upload to the distributed cache
>> -your binaries aren't the same as the clusters, especially if they are
>> not ASF clusters but things built by other people. You can fix that through
>> the use of different mvn repos at build time, but then you have to build
>> things
>> -even with different classpaths, you all share the same native binaries.
>> Try to run a 2.6 app on a 2.4 cluster and things that call native code may
>> have link problems.
