Hi Gary,

Thank you for this information. Using the command you mentioned, I am not able to add a new node to my setup. In my scenario, I have 3 nodes, say A, B, and C.
A is the master node where the ResourceManager is running, and B and C are slave nodes. By default, when I execute "start-yarn.sh", NodeManager processes get started on nodes A, B, and C.

For the *"yarn rmadmin -refreshNodes"* command, I introduced the property *"yarn.resourcemanager.nodes.include-path"* in yarn-site.xml, which points to a file that lists only node B. After executing the "yarn rmadmin -refreshNodes" command, as expected, the NodeManager process is running only on node B.

Now I *want to add node "C"*, so I updated yarn.resourcemanager.nodes.include-path to include C and re-ran "yarn rmadmin -refreshNodes", but I do not see a NodeManager process getting started, i.e. it's not picking up the new node C.

Could you please let me know if I am doing anything wrong in adding a new node with the above process?

Thanks for your help again,
Srini

On Tue, May 20, 2014 at 6:32 PM, Gary Helmling <[email protected]> wrote:

> Pardon me, for a YARN cluster, that should actually be "yarn rmadmin
> -refreshNodes".
>
> See this page for usage details:
>
> http://hadoop.apache.org/docs/r2.3.0/hadoop-yarn/hadoop-yarn-site/YarnCommands.html
>
>
> On Tue, May 20, 2014 at 6:28 PM, Gary Helmling <[email protected]>
> wrote:
>
> > Hi Srinivas,
> >
> > If you're trying to remove a node from your YARN cluster, you'll need to
> > run "hadoop mradmin -refreshNodes". The command that you ran ("hadoop
> > dfsadmin") is for adding/removing nodes from the HDFS service.
> >
> >
> > On Tue, May 20, 2014 at 4:56 PM, Srinivas Reddy Kancharla
> > <[email protected]> wrote:
> >
> >> Thanks Terence for your clarification.
> >> I tried to remove the node from the cluster by removing an entry from the
> >> "slaves" file and then ran "hadoop dfsadmin -refreshNodes", but it looks
> >> like this is not the right command.
> >> Is there any specific command I need to use to remove or add a node,
> >> *without restarting the services*?
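For the add/remove question above, here is a minimal shell sketch of the include-file approach. It assumes the file named by yarn.resourcemanager.nodes.include-path lists one hostname per line. The function names, the file path, and the host name are hypothetical placeholders, not values from this thread.

```shell
#!/bin/sh
# Sketch only: the include file path and host names passed in are assumptions.

# Append a host to the include file unless an identical line is already there.
add_node_to_include() {
    include_file="$1"
    host="$2"
    if ! grep -qxF "$host" "$include_file" 2>/dev/null; then
        printf '%s\n' "$host" >> "$include_file"
    fi
}

# Ask the ResourceManager to re-read its include/exclude lists.
# This only refreshes the lists; it does not start any NodeManager process.
refresh_nodes() {
    if command -v yarn >/dev/null 2>&1; then
        yarn rmadmin -refreshNodes
    else
        echo "yarn not found on PATH; run 'yarn rmadmin -refreshNodes' on the ResourceManager host" >&2
    fi
}
```

A possible invocation on the ResourceManager host would be `add_node_to_include /path/to/yarn.include nodeC && refresh_nodes`. To my understanding, the NodeManager daemon on the new node still has to be running before it can register; on Hadoop 2.x that is usually done on that host with `yarn-daemon.sh start nodemanager`.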
> >>
> >> I coded in such a way that new Runnables should get launched if they see
> >> new nodes, but I am stuck on the basic yarn command.
> >>
> >> Thanks and regards,
> >> Srini
> >>
> >>
> >> On Tue, May 20, 2014 at 12:02 AM, Terence Yim <[email protected]> wrote:
> >>
> >> > Hi Srinivas,
> >> >
> >> > Sorry for the late reply. BTW, I just noticed that this discussion is
> >> > not on the dev@ mailing list, hence I CC'd my reply to the mailing list.
> >> > You can subscribe to the list by sending an email to
> >> > [email protected]
> >> >
> >> > To your question about rebalancing: currently Twill won't stop an
> >> > executing Runnable and move it to a newly available resource, as
> >> > it doesn't know what the Runnable is doing or whether it is close to
> >> > finishing. After you add a new node to the cluster, only newly
> >> > launched runnables (either a new application run or an increase in the
> >> > number of instances of an existing runnable) may run on the new node
> >> > (it is up to YARN to allocate).
> >> >
> >> > Terence
> >> >
> >> >
> >> > On Fri, May 16, 2014 at 1:31 PM, Srinivas Reddy Kancharla
> >> > <[email protected]> wrote:
> >> > > Hi Terence,
> >> > >
> >> > > Thanks for the information you provided; I can now execute my
> >> > > programs. I am trying to experiment with the re-balance behavior, and
> >> > > your input will really help me test further:
> >> > >
> >> > > - I created my own TwillApplication which launches 3
> >> > > AbstractTwillRunnables (say this program is a time-consuming job).
> >> > > - I have a setup of 3 nodes (one master and 2 slaves). When I launch my
> >> > > program, I can see that:
> >> > > > The first slave node has launched the ApplicationMaster and one Runnable.
> >> > > > The second slave node has taken care of launching the other 2 runnables.
> >> > >
> >> > > - During execution of the above application, if I add a 3rd slave node to
> >> > > the cluster and configure it for re-balance, will this re-balance process
> >> > > take care of re-distributing the runnables again? I.e., in this scenario the
> >> > > second slave node will have only one runnable and the new third slave node
> >> > > should take care of one of the runnables. This way the load is distributed.
> >> > >
> >> > > Thanks and regards,
> >> > > Srini
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > On Fri, May 9, 2014 at 12:15 AM, Terence Yim <[email protected]> wrote:
> >> > >>
> >> > >> Hi Srinivas,
> >> > >>
> >> > >> First of all, though I never tried, I wouldn't expect a YARN app to
> >> > >> work correctly on a local cluster after the computer sleeps and wakes.
> >> > >>
> >> > >> The exception is about the RM trying to restart the AM after wake-up
> >> > >> (maybe it thought the AM was dead, as it hadn't been heartbeating while
> >> > >> the computer slept, and the RM uses the wall clock to check); however,
> >> > >> the restart failed due to token expiration (when someone asks the RM for
> >> > >> a container, it comes with a timed token). The expiration time is
> >> > >> governed by the setting
> >> > >> "yarn.resourcemanager.rm.container-allocation.expiry-interval-ms"
> >> > >> and the default is 600 seconds.
> >> > >>
> >> > >> Terence
> >> > >>
> >> > >> On Thu, May 8, 2014 at 11:45 AM, Srinivas Reddy Kancharla
> >> > >> <[email protected]> wrote:
> >> > >> > Hi Terence,
> >> > >> >
> >> > >> > Yesterday the same program was working.
Today when I opened my > >> MacBook > >> > >> > and so > >> > >> > my 3 VM nodes are running back, I am seeing below exception as > >> shown: > >> > >> > > >> > >> > I am getting below exception, is there any configuration which > can > >> > >> > ignore > >> > >> > such exception??: > >> > >> > > >> > >> > Got exception: org.apache.hadoop.yarn.exceptions.YarnException: > >> > >> > Unauthorized > >> > >> > request to start container. > >> > >> > This token is expired. current time is 1399573775978 found > >> > 1399573627677 > >> > >> > > >> > >> > 2014-05-08 11:17:07,682 INFO > >> > >> > > >> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: > >> > >> > Command > >> > >> > to launch container container_1399572736534_0002_02_000001 : > >> > >> > $JAVA_HOME/bin/java -Djava.io.tmpdir=tmp > >> -Dyarn.appId=$YARN_APP_ID_STR > >> > >> > -Dtwill.app=$TWILL_APP_NAME -cp launcher.jar:$HADOOP_CONF_DIR > >> -Xmx362m > >> > >> > org.apache.twill.launcher.TwillLauncher appMaster.jar > >> > >> > org.apache.twill.internal.appmaster.ApplicationMasterMain false > >> > >> > 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr > >> > >> > 2014-05-08 11:17:07,694 INFO > >> > >> > > >> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: > >> > >> > Error > >> > >> > launching appattempt_1399572736534_0002_000002. Got exception: > >> > >> > org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized > >> request > >> > to > >> > >> > start container. > >> > >> > This token is expired. 
current time is 1399573775978 found > >> > 1399573627677 > >> > >> > at > >> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > >> > >> > Method) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > >> > >> > at > >> java.lang.reflect.Constructor.newInstance(Constructor.java: > >> > >> > 534) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:152) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:122) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:249) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > >> > >> > at java.lang.Thread.run(Thread.java:701) > >> > >> > > >> > >> > 2014-05-08 11:17:07,695 INFO > >> > >> > > >> > > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: > >> > >> > Unregistering app attempt : appattempt_1399572736534_0002_000002 > >> > >> > 2014-05-08 11:17:07,695 INFO > >> > >> > > >> > >> > > >> > > >> > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: > >> > >> > appattempt_1399572736534_0002_000002 State change from ALLOCATED > to > >> > >> > FAILED 
> >> > >> > 2014-05-08 11:17:07,695 INFO > >> > >> > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: > >> > >> > Application > >> > >> > application_1399572736534_0002 failed 2 times due to Error > >> launching > >> > >> > appattempt_1399572736534_0002_000002. Got exception: > >> > >> > org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized > >> request > >> > to > >> > >> > start container. > >> > >> > This token is expired. current time is 1399573775978 found > >> > 1399573627677 > >> > >> > at > >> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > >> > >> > Method) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > >> > >> > at > >> > java.lang.reflect.Constructor.newInstance(Constructor.java:534) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:152) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:122) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:249) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > >> > >> > at java.lang.Thread.run(Thread.java:701) > >> > >> > . Failing the application. 
> >> > >> > 2014-05-08 11:17:07,695 INFO > >> > >> > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: > >> > >> > application_1399572736534_0002 State change from ACCEPTED to > FAILED > >> > >> > 2014-05-08 11:17:07,695 WARN > >> > >> > org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: > >> > >> > USER=srini OPERATION=Application Finished - > >> > >> > Failed TARGET=RMAppManager RESULT=FAILURE > >> DESCRIPTION=App > >> > >> > failed with state: > >> > >> > FAILED PERMISSIONS=Application > >> application_1399572736534_0002 > >> > >> > failed 2 times > >> > >> > due to Error launching appattempt_1399572736534_0002_000002. Got > >> > >> > exception: > >> > >> > org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized > >> request > >> > to > >> > >> > start container. > >> > >> > This token is expired. current time is 1399573775978 found > >> > 1399573627677 > >> > >> > at > >> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > >> > >> > Method) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > >> > >> > at > >> > java.lang.reflect.Constructor.newInstance(Constructor.java:534) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:152) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:122) > >> > >> > at > >> > >> > > >> > >> > > >> > > >> > 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:249)
> >> > >> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
> >> > >> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >> > >> > at java.lang.Thread.run(Thread.java:701)
> >> > >> >
> >> > >> >
> >> > >> >
> >> > >> > On Wed, May 7, 2014 at 1:35 PM, Srinivas Reddy Kancharla
> >> > >> > <[email protected]> wrote:
> >> > >> >>
> >> > >> >> I got the answer to one of my own questions:
> >> > >> >> Can I expect "Hello world" on the master node where I launched the
> >> > >> >> program?
> >> > >> >>
> >> > >> >> After I copied the proper version of jopt-simple.jar, it worked and I
> >> > >> >> can see the "Hello world" output on the master node. Sorry for the spam.
> >> > >> >>
> >> > >> >> Srini
> >> > >> >>
> >> > >> >>
> >> > >> >> On Wed, May 7, 2014 at 1:12 PM, Srinivas Reddy Kancharla
> >> > >> >> <[email protected]> wrote:
> >> > >> >>>
> >> > >> >>> Exciting.. it worked after I got all the required jars. The advantage
> >> > >> >>> of not using a maven project is that I faced all these issues and was
> >> > >> >>> exposed to all the required jars and exceptions.
> >> > >> >>>
> >> > >> >>> Now when I launched my program, it got executed on one of my slave
> >> > >> >>> nodes. Both the application master and the task ran on the same node,
> >> > >> >>> and I could see "Hello world" in the "stdout" log.
> >> > >> >>>
> >> > >> >>> Can I expect "Hello world" on the master node where I launched the
> >> > >> >>> program?
> >> > >> >>>
> >> > >> >>> Thanks again for all your help. From here I will try different
> >> > >> >>> programs with different options and will see how it goes.
> >> > >> >>>
> >> > >> >>> Is there any particular forum where I can ask questions, or is it
> >> > >> >>> fine to send you questions? It was a great help from you.
> >> > >> >>>
> >> > >> >>> I am doing all this during my free time (i.e. after office hours). I
> >> > >> >>> would like to try more, so if possible please let me know if I can be
> >> > >> >>> helpful in any way.
> >> > >> >>>
> >> > >> >>> Regards,
> >> > >> >>> Srini
> >> > >> >>>
> >> > >> >>>
> >> > >> >>>
> >> > >> >>> On Wed, May 7, 2014 at 1:06 AM, Terence Yim <[email protected]> wrote:
> >> > >> >>>>
> >> > >> >>>> Hi Srinivas,
> >> > >> >>>>
> >> > >> >>>> It's the ASM library version issue. Try to include the
> >> > >> >>>> asm-4.0-all.jar in your classpath before the hadoop classpath.
> >> > >> >>>>
> >> > >> >>>> http://mvnrepository.com/artifact/org.ow2.asm/asm-all/4.0
> >> > >> >>>>
> >> > >> >>>> Terence
> >> > >> >>>>
> >> > >> >>>> On May 6, 2014, at 4:22 PM, Srinivas Reddy Kancharla
> >> > >> >>>> <[email protected]> wrote:
> >> > >> >>>>
> >> > >> >>>> Hi Terence,
> >> > >> >>>>
> >> > >> >>>> After downloading the required jar files step by step (because I am
> >> > >> >>>> not using maven for now), I am able to get past the zookeeper issue
> >> > >> >>>> (I have a setup of 3 nodes, i.e. one leader and 2 followers), and now
> >> > >> >>>> I am seeing the exception below. (Any pointer for this would be
> >> > >> >>>> helpful for me.)
> >> > >> >>>>
> >> > >> >>>> I suspect the hadoop libraries I am using, because the pom files
> >> > >> >>>> which you created for the hello world examples refer to hadoop 2.3,
> >> > >> >>>> whereas I am using Hadoop 2.2. Do you think the exception below is
> >> > >> >>>> due to that reason?
> >> > >> >>>> > >> > >> >>>> > >> > >> >>>> 14/05/06 15:53:39 INFO zookeeper.ZooKeeper: Client > >> > >> >>>> environment:java.io.tmpdir=/tmp > >> > >> >>>> 14/05/06 15:53:39 INFO zookeeper.ZooKeeper: Client > >> > >> >>>> environment:java.compiler=<NA> > >> > >> >>>> 14/05/06 15:53:39 INFO zookeeper.ZooKeeper: Client > >> > >> >>>> environment:os.name=Linux > >> > >> >>>> 14/05/06 15:53:39 INFO zookeeper.ZooKeeper: Client > >> > >> >>>> environment:os.arch=amd64 > >> > >> >>>> 14/05/06 15:53:39 INFO zookeeper.ZooKeeper: Client > >> > >> >>>> environment:os.version=3.11.0-12-generic > >> > >> >>>> 14/05/06 15:53:39 INFO zookeeper.ZooKeeper: Client > >> > >> >>>> environment:user.name=srini > >> > >> >>>> 14/05/06 15:53:39 INFO zookeeper.ZooKeeper: Client > >> > >> >>>> environment:user.home=/home/srini > >> > >> >>>> 14/05/06 15:53:39 INFO zookeeper.ZooKeeper: Client > >> > >> >>>> environment:user.dir=/home/srini/twill/twilljars > >> > >> >>>> 14/05/06 15:53:39 INFO zookeeper.ZooKeeper: Initiating client > >> > >> >>>> connection, connectString=localhost:2181 sessionTimeout=10000 > >> > >> >>>> watcher=ServiceDelegate [STARTING] > >> > >> >>>> 14/05/06 15:53:39 INFO zookeeper.ClientCnxn: Opening socket > >> > >> >>>> connection > >> > >> >>>> to server localhost/127.0.0.1:2181. 
Will not attempt to > >> > authenticate > >> > >> >>>> using > >> > >> >>>> SASL (unknown error) > >> > >> >>>> 14/05/06 15:53:39 INFO zookeeper.ClientCnxn: Socket connection > >> > >> >>>> established to localhost/127.0.0.1:2181, initiating session > >> > >> >>>> 14/05/06 15:53:39 INFO zookeeper.ClientCnxn: Session > >> establishment > >> > >> >>>> complete on server localhost/127.0.0.1:2181, sessionid = > >> > >> >>>> 0x145d3a544bd0006, > >> > >> >>>> negotiated timeout = 10000 > >> > >> >>>> 14/05/06 15:53:39 INFO zookeeper.DefaultZKClientService: > >> Connected > >> > to > >> > >> >>>> ZooKeeper: localhost:2181 > >> > >> >>>> Exception in thread " STARTING" > >> > >> >>>> java.lang.IncompatibleClassChangeError: > >> > >> >>>> class > >> > >> >>>> > >> org.apache.twill.internal.utils.Dependencies$DependencyClassVisitor > >> > >> >>>> has interface org.objectweb.asm.ClassVisitor as super class > >> > >> >>>> at java.lang.ClassLoader.defineClass1(Native Method) > >> > >> >>>> at java.lang.ClassLoader.defineClass(ClassLoader.java:643) > >> > >> >>>> at > >> > >> >>>> > >> > >> >>>> > >> > > java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) > >> > >> >>>> at > java.net.URLClassLoader.defineClass(URLClassLoader.java:277) > >> > >> >>>> at java.net.URLClassLoader.access$000(URLClassLoader.java:73) > >> > >> >>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:212) > >> > >> >>>> at java.security.AccessController.doPrivileged(Native Method) > >> > >> >>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:205) > >> > >> >>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:323) > >> > >> >>>> at > sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294) > >> > >> >>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:268) > >> > >> >>>> at > >> > >> >>>> > >> > >> >>>> > >> > > >> > org.apache.twill.internal.utils.Dependencies.findClassDependencies(Dependencies.java:102) > >> > >> >>>> at > >> > >> >>>> > >> > >> >>>> > >> > > 
>> > org.apache.twill.internal.ApplicationBundler.findDependencies(ApplicationBundler.java:179) > >> > >> >>>> at > >> > >> >>>> > >> > >> >>>> > >> > > >> > org.apache.twill.internal.ApplicationBundler.createBundle(ApplicationBundler.java:136) > >> > >> >>>> at > >> > >> >>>> > >> > >> >>>> > >> > > >> > org.apache.twill.internal.ApplicationBundler.createBundle(ApplicationBundler.java:106) > >> > >> >>>> at > >> > >> >>>> > >> > >> >>>> > >> > > >> > org.apache.twill.yarn.YarnTwillPreparer.createAppMasterJar(YarnTwillPreparer.java:366) > >> > >> >>>> at > >> > >> >>>> > >> > >> >>>> > >> > > >> > org.apache.twill.yarn.YarnTwillPreparer.access$2(YarnTwillPreparer.java:350) > >> > >> >>>> at > >> > >> >>>> > >> > >> >>>> > >> > > >> > org.apache.twill.yarn.YarnTwillPreparer$1.call(YarnTwillPreparer.java:263) > >> > >> >>>> at > >> > >> >>>> > >> > >> >>>> > >> > > org.apache.twill.yarn.YarnTwillPreparer$1.call(YarnTwillPreparer.java:1) > >> > >> >>>> at > >> > >> >>>> > >> > >> >>>> > >> > > >> > org.apache.twill.yarn.YarnTwillController.doStartUp(YarnTwillController.java:98) > >> > >> >>>> at > >> > >> >>>> > >> > >> >>>> > >> > > >> > org.apache.twill.internal.AbstractZKServiceController.startUp(AbstractZKServiceController.java:82) > >> > >> >>>> at > >> > >> >>>> > >> > >> >>>> > >> > > >> > org.apache.twill.internal.AbstractExecutionServiceController$ServiceDelegate.startUp(AbstractExecutionServiceController.java:109) > >> > >> >>>> at > >> > >> >>>> > >> > >> >>>> > >> > > >> > com.google.common.util.concurrent.AbstractIdleService$1$1.run(AbstractIdleService.java:43) > >> > >> >>>> at java.lang.Thread.run(Thread.java:701) > >> > >> >>>> > >> > >> >>>> > >> > >> >>>> Thanks and regards, > >> > >> >>>> Srini > >> > >> >>>> > >> > >> >>>> > >> > >> >>>> On Tue, May 6, 2014 at 2:40 PM, Srinivas Reddy Kancharla > >> > >> >>>> <[email protected]> wrote: > >> > >> >>>>> > >> > >> >>>>> Got it. I will do that and will update you. 
Earlier my assumption was that my hadoop cluster would be starting
> >> > >> >>>>> zookeeper as part of the Namenode, Datanode, ResourceManager, and
> >> > >> >>>>> NodeManager initialization. Seems like I was wrong and I
> >> > >> >>>>> have to start zookeeper as a separate process.
> >> > >> >>>>>
> >> > >> >>>>> Thanks again for this information.
> >> > >> >>>>>
> >> > >> >>>>> Regards,
> >> > >> >>>>> Srini
> >> > >> >>>>>
> >> > >> >>>>>
> >> > >> >>>>>
> >> > >> >>>>> On Tue, May 6, 2014 at 2:32 PM, Terence Yim <[email protected]>
> >> > >> >>>>> wrote:
> >> > >> >>>>>>
> >> > >> >>>>>> Hi Srinivas,
> >> > >> >>>>>>
> >> > >> >>>>>> Yes, you'll need to start zookeeper manually before executing the
> >> > >> >>>>>> twill program. The assumption is that zookeeper is a long-running
> >> > >> >>>>>> service in the cluster.
> >> > >> >>>>>>
> >> > >> >>>>>> Terence
> >> > >> >>>>>>
> >> > >> >>>>>> Sent from my iPhone
> >> > >> >>>>>>
> >> > >> >>>>>> On May 6, 2014, at 2:14 PM, Srinivas Reddy Kancharla
> >> > >> >>>>>> <[email protected]> wrote:
> >> > >> >>>>>>
> >> > >> >>>>>> Hi Terence,
> >> > >> >>>>>>
> >> > >> >>>>>> Thank you very much for the pointer. I used the "hadoop classpath"
> >> > >> >>>>>> command and copied that list to my "java" command, and at least now
> >> > >> >>>>>> I am out of classpath issues. So this shows that I am fine with my
> >> > >> >>>>>> current version of the Hadoop 2.2 jars.
> >> > >> >>>>>>
> >> > >> >>>>>> Now, as I asked in my previous mail, do I need to start "zookeeper"
> >> > >> >>>>>> separately, or is it a part of my existing running hadoop cluster??
> >> > >> >>>>>> b'cos I am > >> > >> >>>>>> getting below exception for my "Hello world" example (I have > >> > taken > >> > >> >>>>>> your > >> > >> >>>>>> example of "localhost:2181" for ZKServer string: > >> > >> >>>>>> > >> > >> >>>>>> > >> > >> >>>>>> > >> > >> >>>>>> 14/05/06 14:08:11 INFO zookeeper.ZooKeeper: Client > >> > >> >>>>>> > >> > >> >>>>>> > >> > > >> > environment:java.library.path=/usr/lib/jvm/java-6-openjdk-amd64/jre/lib/amd64/server:/usr/lib/jvm/java-6-openjdk-amd64/jre/lib/amd64:/usr/lib/jvm/java-6-openjdk-amd64/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib/jni:/lib:/usr/lib > >> > >> >>>>>> 14/05/06 14:08:11 INFO zookeeper.ZooKeeper: Client > >> > >> >>>>>> environment:java.io.tmpdir=/tmp > >> > >> >>>>>> 14/05/06 14:08:11 INFO zookeeper.ZooKeeper: Client > >> > >> >>>>>> environment:java.compiler=<NA> > >> > >> >>>>>> 14/05/06 14:08:11 INFO zookeeper.ZooKeeper: Client > >> > >> >>>>>> environment:os.name=Linux > >> > >> >>>>>> 14/05/06 14:08:11 INFO zookeeper.ZooKeeper: Client > >> > >> >>>>>> environment:os.arch=amd64 > >> > >> >>>>>> 14/05/06 14:08:11 INFO zookeeper.ZooKeeper: Client > >> > >> >>>>>> environment:os.version=3.11.0-12-generic > >> > >> >>>>>> 14/05/06 14:08:11 INFO zookeeper.ZooKeeper: Client > >> > >> >>>>>> environment:user.name=srini > >> > >> >>>>>> 14/05/06 14:08:11 INFO zookeeper.ZooKeeper: Client > >> > >> >>>>>> environment:user.home=/home/srini > >> > >> >>>>>> 14/05/06 14:08:11 INFO zookeeper.ZooKeeper: Client > >> > >> >>>>>> environment:user.dir=/home/srini/twill/twilljars > >> > >> >>>>>> 14/05/06 14:08:11 INFO zookeeper.ZooKeeper: Initiating > client > >> > >> >>>>>> connection, connectString=localhost:2181 > sessionTimeout=10000 > >> > >> >>>>>> watcher=ServiceDelegate [STARTING] > >> > >> >>>>>> 14/05/06 14:08:11 INFO zookeeper.ClientCnxn: Opening socket > >> > >> >>>>>> connection > >> > >> >>>>>> to server localhost/127.0.0.1:2181. 
Will not attempt to > >> > >> >>>>>> authenticate using > >> > >> >>>>>> SASL (unknown error) > >> > >> >>>>>> 14/05/06 14:08:11 WARN zookeeper.ClientCnxn: Session 0x0 for > >> > server > >> > >> >>>>>> null, unexpected error, closing socket connection and > >> attempting > >> > >> >>>>>> reconnect > >> > >> >>>>>> java.net.ConnectException: Connection refused > >> > >> >>>>>> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > >> > >> >>>>>> at > >> > >> >>>>>> > >> > >> >>>>>> > >> > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:601) > >> > >> >>>>>> at > >> > >> >>>>>> > >> > >> >>>>>> > >> > > >> > org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350) > >> > >> >>>>>> at > >> > >> >>>>>> > >> > >> >>>>>> > >> > org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068) > >> > >> >>>>>> 14/05/06 14:08:12 INFO zookeeper.ClientCnxn: Opening socket > >> > >> >>>>>> connection > >> > >> >>>>>> to server localhost/127.0.0.1:2181. 
Will not attempt to > >> > >> >>>>>> authenticate using > >> > >> >>>>>> SASL (unknown error) > >> > >> >>>>>> 14/05/06 14:08:12 WARN zookeeper.ClientCnxn: Session 0x0 for > >> > server > >> > >> >>>>>> null, unexpected error, closing socket connection and > >> attempting > >> > >> >>>>>> reconnect > >> > >> >>>>>> java.net.ConnectException: Connection refused > >> > >> >>>>>> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > >> > >> >>>>>> at > >> > >> >>>>>> > >> > >> >>>>>> > >> > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:601) > >> > >> >>>>>> at > >> > >> >>>>>> > >> > >> >>>>>> > >> > > >> > org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350) > >> > >> >>>>>> at > >> > >> >>>>>> > >> > >> >>>>>> > >> > org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068) > >> > >> >>>>>> 14/05/06 14:08:13 INFO zookeeper.ClientCnxn: Opening socket > >> > >> >>>>>> connection > >> > >> >>>>>> to server localhost/127.0.0.1:2181. 
Will not attempt to > >> > >> >>>>>> authenticate using > >> > >> >>>>>> SASL (unknown error) > >> > >> >>>>>> 14/05/06 14:08:13 WARN zookeeper.ClientCnxn: Session 0x0 for > >> > server > >> > >> >>>>>> null, unexpected error, closing socket connection and > >> attempting > >> > >> >>>>>> reconnect > >> > >> >>>>>> java.net.ConnectException: Connection refused > >> > >> >>>>>> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > >> > >> >>>>>> at > >> > >> >>>>>> > >> > >> >>>>>> > >> > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:601) > >> > >> >>>>>> at > >> > >> >>>>>> > >> > >> >>>>>> > >> > > >> > org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350) > >> > >> >>>>>> at > >> > >> >>>>>> > >> > >> >>>>>> > >> > org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068) > >> > >> >>>>>> 14/05/06 14:08:14 INFO zookeeper.ClientCnxn: Opening socket > >> > >> >>>>>> connection > >> > >> >>>>>> to server localhost/127.0.0.1:2181. 
Will not attempt to > >> > >> >>>>>> authenticate using > >> > >> >>>>>> SASL (unknown error) > >> > >> >>>>>> 14/05/06 14:08:14 WARN zookeeper.ClientCnxn: Session 0x0 for > >> > server > >> > >> >>>>>> null, unexpected error, closing socket connection and > >> attempting > >> > >> >>>>>> reconnect > >> > >> >>>>>> java.net.ConnectException: Connection refused > >> > >> >>>>>> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > >> > >> >>>>>> at > >> > >> >>>>>> > >> > >> >>>>>> > >> > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:601) > >> > >> >>>>>> at > >> > >> >>>>>> > >> > >> >>>>>> > >> > > >> > org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350) > >> > >> >>>>>> at > >> > >> >>>>>> > >> > >> >>>>>> > >> > org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068) > >> > >> >>>>>> 14/05/06 14:08:15 INFO zookeeper.ClientCnxn: Opening socket > >> > >> >>>>>> connection > >> > >> >>>>>> to server localhost/127.0.0.1:2181. 
Will not attempt to authenticate using SASL (unknown error)
> >> > >> >>>>>> 14/05/06 14:08:15 WARN zookeeper.ClientCnxn: Session 0x0 for server
> >> > >> >>>>>> null, unexpected error, closing socket connection and attempting
> >> > >> >>>>>> reconnect
> >> > >> >>>>>> java.net.ConnectException: Connection refused
> >> > >> >>>>>> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> >> > >> >>>>>> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:601)
> >> > >> >>>>>> at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
> >> > >> >>>>>> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
> >> > >> >>>>>>
> >> > >> >>>>>>
> >> > >> >>>>>> Thank you again for your help. Hopefully, once I am past these
> >> > >> >>>>>> initial setup issues, I will not bother you much unless it's very
> >> > >> >>>>>> technical.
> >> > >> >>>>>>
> >> > >> >>>>>> Thanks and regards,
> >> > >> >>>>>> Srini
> >> > >> >>>>>>
> >> > >> >>>>>>
> >> > >> >>>>>> On Mon, May 5, 2014 at 10:34 PM, Terence Yim <[email protected]>
> >> > >> >>>>>> wrote:
> >> > >> >>>>>>>
> >> > >> >>>>>>> Hi Srinivas,
> >> > >> >>>>>>>
> >> > >> >>>>>>> Looks like you missed some hadoop classes in your classpath.
> >> > >> >>>>>>> You'll need the hadoop classpath in your classpath. Have you tried
> >> > >> >>>>>>> running like this?
> >> > >> >>>>>>>
> >> > >> >>>>>>> HDCP=`hadoop classpath`; java -cp ./SriniTwillYarnClasses.jar:twill-api-0.3.0-incubating-SNAPSHOT.jar:……:$HDCP com.srini.hadoopTwill.HelloTwill
> >> > >> >>>>>>>
> >> > >> >>>>>>> Terence
> >> > >> >>>>>>>
> >> > >> >>>>>>> On May 5, 2014, at 9:07 PM, Srinivas Reddy Kancharla
> >> > >> >>>>>>> <[email protected]> wrote:
> >> > >> >>>>>>>
> >> > >> >>>>>>> Hello Terence,
> >> > >> >>>>>>>
> >> > >> >>>>>>> I am Srini and new to twill. I am very sorry for sending you email
> >> > >> >>>>>>> like this; I could not find any other discussion forum to post this
> >> > >> >>>>>>> message. My bad, please let me know if a forum exists where I can
> >> > >> >>>>>>> get help in the future instead of mailing you directly. Below is
> >> > >> >>>>>>> the issue I am facing while executing my first Twill program:
> >> > >> >>>>>>>
> >> > >> >>>>>>> - I have a setup of hadoop-2.2.0 with 3 nodes in total: one master
> >> > >> >>>>>>> and 2 slaves.
> >> > >> >>>>>>> - I could execute the DistributedShell program successfully.
> >> > >> >>>>>>> - Now I downloaded the twill project and generated the required jar
> >> > >> >>>>>>> files using mvn commands.
> >> > >> >>>>>>> - I replicated Helloworld sample program and during > >> execution, I > >> > >> >>>>>>> am > >> > >> >>>>>>> getting below exception: > >> > >> >>>>>>> > >> > >> >>>>>>> srini@ubuntu:~/twill/twilljars$ java -classpath > >> > >> >>>>>>> > >> > >> >>>>>>> > >> > > >> > ./SriniTwillYarnClasses.jar:twill-api-0.3.0-incubating-SNAPSHOT.jar:guava-13.0.1.jar:slf4j-api-1.7.7.jar:twill-yarn-0.3.0-incubating-SNAPSHOT.jar:hadoop-common-2.2.0.jar:hadoop-yarn-api-2.2.0.jar:twill-ext-0.3.0-incubating-SNAPSHOT.jar:twill-core-0.3.0-incubating-SNAPSHOT.jar:commons-logging-1.1.1.jar:commons-configuration-1.6.jar:commons-lang-2.5.jar:twill-common-0.3.0-incubating-SNAPSHOT.jar:twill-zookeeper-0.3.0-incubating-SNAPSHOT.jar:hadoop-auth-2.2.0.jar > >> > >> >>>>>>> com.srini.hadoopTwill.HelloTwill > >> > >> >>>>>>> > >> > >> >>>>>>> SLF4J: Failed to load class > >> "org.slf4j.impl.StaticLoggerBinder". > >> > >> >>>>>>> SLF4J: Defaulting to no-operation (NOP) logger > implementation > >> > >> >>>>>>> SLF4J: See > >> http://www.slf4j.org/codes.html#StaticLoggerBinderfor > >> > >> >>>>>>> further details. > >> > >> >>>>>>> May 5, 2014 8:49:53 PM > >> org.apache.hadoop.util.NativeCodeLoader > >> > >> >>>>>>> <clinit> > >> > >> >>>>>>> WARNING: Unable to load native-hadoop library for your > >> > platform... 
> >> > >> >>>>>>> using builtin-java classes where applicable > >> > >> >>>>>>> Exception in thread "main" java.lang.RuntimeException: > >> > >> >>>>>>> java.lang.reflect.InvocationTargetException > >> > >> >>>>>>> at > >> > >> >>>>>>> > >> com.google.common.base.Throwables.propagate(Throwables.java:160) > >> > >> >>>>>>> at > >> > >> >>>>>>> > >> > >> >>>>>>> > >> > > >> > org.apache.twill.internal.yarn.VersionDetectYarnAppClientFactory.create(VersionDetectYarnAppClientFactory.java:47) > >> > >> >>>>>>> at > >> > >> >>>>>>> > >> > >> >>>>>>> > >> > > >> > org.apache.twill.yarn.YarnTwillRunnerService.<init>(YarnTwillRunnerService.java:143) > >> > >> >>>>>>> at > >> > >> >>>>>>> > >> > >> >>>>>>> > >> > > >> > org.apache.twill.yarn.YarnTwillRunnerService.<init>(YarnTwillRunnerService.java:138) > >> > >> >>>>>>> at > com.srini.hadoopTwill.HelloTwill.main(HelloTwill.java:37) > >> > >> >>>>>>> Caused by: java.lang.reflect.InvocationTargetException > >> > >> >>>>>>> at > >> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > >> > >> >>>>>>> Method) > >> > >> >>>>>>> at > >> > >> >>>>>>> > >> > >> >>>>>>> > >> > > >> > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > >> > >> >>>>>>> at > >> > >> >>>>>>> > >> > >> >>>>>>> > >> > > >> > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > >> > >> >>>>>>> at > >> > java.lang.reflect.Constructor.newInstance(Constructor.java:534) > >> > >> >>>>>>> at > >> > >> >>>>>>> > >> > >> >>>>>>> > >> > > >> > org.apache.twill.internal.yarn.VersionDetectYarnAppClientFactory.create(VersionDetectYarnAppClientFactory.java:44) > >> > >> >>>>>>> ... 
3 more > >> > >> >>>>>>> Caused by: java.lang.Error: Unresolved compilation > problems: > >> > >> >>>>>>> The import > org.apache.hadoop.yarn.api.records.DelegationToken > >> > >> >>>>>>> cannot > >> > >> >>>>>>> be resolved > >> > >> >>>>>>> The import org.apache.hadoop.yarn.client.YarnClient cannot > be > >> > >> >>>>>>> resolved > >> > >> >>>>>>> The import org.apache.hadoop.yarn.client.YarnClientImpl > >> cannot > >> > be > >> > >> >>>>>>> resolved > >> > >> >>>>>>> The import > >> org.apache.hadoop.yarn.exceptions.YarnRemoteException > >> > >> >>>>>>> cannot be resolved > >> > >> >>>>>>> YarnClient cannot be resolved to a type > >> > >> >>>>>>> YarnClient cannot be resolved to a type > >> > >> >>>>>>> YarnClientImpl cannot be resolved to a type > >> > >> >>>>>>> YarnClient cannot be resolved to a type > >> > >> >>>>>>> YarnClient cannot be resolved to a type > >> > >> >>>>>>> The method setUser(String) is undefined for the type > >> > >> >>>>>>> ApplicationSubmissionContext > >> > >> >>>>>>> The method getUser() is undefined for the type > >> > >> >>>>>>> ApplicationSubmissionContext > >> > >> >>>>>>> The method setResource(Resource) is undefined for the type > >> > >> >>>>>>> ContainerLaunchContext > >> > >> >>>>>>> YarnClient cannot be resolved to a type > >> > >> >>>>>>> YarnClient cannot be resolved to a type > >> > >> >>>>>>> YarnRemoteException cannot be resolved to a type > >> > >> >>>>>>> The method getMinimumResourceCapability() is undefined for > >> the > >> > >> >>>>>>> type > >> > >> >>>>>>> GetNewApplicationResponse > >> > >> >>>>>>> The method getContainerTokens() is undefined for the type > >> > >> >>>>>>> ContainerLaunchContext > >> > >> >>>>>>> YarnClient cannot be resolved to a type > >> > >> >>>>>>> YarnClient cannot be resolved to a type > >> > >> >>>>>>> The method setContainerTokens(ByteBuffer) is undefined for > >> the > >> > >> >>>>>>> type > >> > >> >>>>>>> ContainerLaunchContext > >> > >> >>>>>>> DelegationToken cannot be resolved to 
a type > >> > >> >>>>>>> YarnClient cannot be resolved to a type > >> > >> >>>>>>> YarnClient cannot be resolved to a type > >> > >> >>>>>>> YarnClient cannot be resolved to a type > >> > >> >>>>>>> YarnClient cannot be resolved to a type > >> > >> >>>>>>> YarnClient cannot be resolved to a type > >> > >> >>>>>>> YarnClient cannot be resolved to a type > >> > >> >>>>>>> YarnClient cannot be resolved to a type > >> > >> >>>>>>> YarnRemoteException cannot be resolved to a type > >> > >> >>>>>>> YarnClient cannot be resolved to a type > >> > >> >>>>>>> YarnRemoteException cannot be resolved to a type > >> > >> >>>>>>> > >> > >> >>>>>>> at > >> > >> >>>>>>> > >> > >> >>>>>>> > >> > > >> > org.apache.twill.internal.yarn.Hadoop20YarnAppClient.<init>(Hadoop20YarnAppClient.java:33) > >> > >> >>>>>>> ... 8 more > >> > >> >>>>>>> > >> > >> >>>>>>> > >> > >> >>>>>>> > >> > >> >>>>>>> Could you please let me know if I am missing anything here > to > >> > >> >>>>>>> execute > >> > >> >>>>>>> this program. In my program, "localhost:2181" is hard > coded > >> for > >> > >> >>>>>>> zookeeper > >> > >> >>>>>>> string. > >> > >> >>>>>>> > >> > >> >>>>>>> My suspect: > >> > >> >>>>>>> - My setup is having hadoop-2.2.0 , to execute this > program, > >> do > >> > I > >> > >> >>>>>>> need to provide hadoop-2.0 libraries instead of 2.2 . > >> > >> >>>>>>> - Do I need to start zookeeper server separately ? > >> > >> >>>>>>> > >> > >> >>>>>>> > >> > >> >>>>>>> > >> > >> >>>>>>> Thanks for your any help, > >> > >> >>>>>>> > >> > >> >>>>>>> Srini > >> > >> >>>>>>> > >> > >> >>>>>>> > >> > >> >>>>>>> > >> > >> >>>>>>> > >> > >> >>>>>>> > >> > >> >>>>>> > >> > >> >>>>> > >> > >> >>>> > >> > >> >>>> > >> > >> >>> > >> > >> >> > >> > >> > > >> > > > >> > > > >> > > >> > > > > >
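
A note on the classpath question above: the trace shows Hadoop20YarnAppClient failing on imports such as org.apache.hadoop.yarn.client.YarnClient. One way to see which YarnClient the JVM can actually load is a small probe like the sketch below. The class name YarnClientProbe and the helper isLoadable are made up for illustration. The second package name (org.apache.hadoop.yarn.client.api) is my understanding of where YarnClient lives in Hadoop 2.1 and later, so treat it as an assumption and check it against your own jars.

```java
public class YarnClientProbe {
    // Package that Hadoop20YarnAppClient imports from, per the trace above.
    static final String HADOOP_20_CLIENT = "org.apache.hadoop.yarn.client.YarnClient";
    // Assumed location in Hadoop 2.1+ (hadoop-yarn-client jar); verify locally.
    static final String HADOOP_21_CLIENT = "org.apache.hadoop.yarn.client.api.YarnClient";

    // Returns true if the named class can be found on the current classpath.
    // The class is looked up without being initialized.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, YarnClientProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(HADOOP_20_CLIENT + " loadable: " + isLoadable(HADOOP_20_CLIENT));
        System.out.println(HADOOP_21_CLIENT + " loadable: " + isLoadable(HADOOP_21_CLIENT));
    }
}
```

Running it with the same -cp value as the failing command, and again with the output of `hadoop classpath` appended as Terence suggests, should show whether either class becomes visible.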

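
On the zookeeper question: a "Connection refused" from ClientCnxn usually means nothing is listening at the address in the connection string. A quick check, independent of Twill, is to try opening a plain TCP connection to that host and port. The sketch below assumes the "localhost:2181" value mentioned in the original message. The class name ZkPortCheck and the helper canConnect are hypothetical, and the check only shows that some process accepts connections on the port, not that it is a healthy ZooKeeper server.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ZkPortCheck {
    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers connection refused, timeouts and unresolved hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults mirror the hard-coded "localhost:2181" from the message above.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2181;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 2000));
    }
}
```

If this prints false on the machine where HelloTwill runs, the zookeeper server is most likely not started or is listening on a different address.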