Re: Slider AM fails to run when RM in HA setup fails over

2016-07-25 Thread Gour Saha
Ok, so this node is not a gateway. It is part of the cluster, which means you don't need slider-client.xml at all. Just have HADOOP_CONF_DIR pointing to /etc/hadoop/conf in slider-env.sh and that should be it. So the above simplifies your config setup. It will not solve either of the 2 problems yo

Re: Slider AM fails to run when RM in HA setup fails over

2016-07-25 Thread Manoj Samel
1. I am not clear about your question on the "gateway" node. The node running Slider is part of the Hadoop cluster, and other services such as Oozie that use HDFS and YARN also run on this node. So if your question is whether the node is otherwise working for HDFS and Yarn configur

Re: Slider AM fails to run when RM in HA setup fails over

2016-07-25 Thread Gour Saha
The node you are running slider from, is that a gateway node? Sorry for not being explicit. I meant copy everything under /etc/hadoop/conf from your cluster into some temp directory (say /tmp/hadoop_conf) in your gateway node or local or whichever node you are running slider from. Then set HADOOP_C

Re: Slider AM fails to run when RM in HA setup fails over

2016-07-25 Thread Manoj Samel
Hi Gour, Thanks for your prompt reply. FYI, the issue happens when I create the Slider app while rm1 is active and rm1 then fails over to rm2. As soon as rm2 becomes active, the Slider AM goes from RUNNING to ACCEPTED state with the above error. Following your suggestion, I did the following: 1) Copied core-site, hdfs

Re: Slider AM fails to run when RM in HA setup fails over

2016-07-25 Thread Gour Saha
If possible, can you copy the entire content of the directory /etc/hadoop/conf and then set HADOOP_CONF_DIR in slider-env.sh to point to it? Keep slider-client.xml empty. Now when you do the same rm1->rm2 failover and then the reverse, do you see the same behavior? -Gour On 7/25/16, 2:28 PM, "Manoj Sam

Slider-develop - Build # 838 - Aborted

2016-07-25 Thread Apache Jenkins Server
The Apache Jenkins build system has built Slider-develop (build #838) Status: Aborted Check console output at https://builds.apache.org/job/Slider-develop/838/ to view the results.

What RM properties are must in slider-client.xml, if present in files in HADOOP_CONF_DIR ?

2016-07-25 Thread Manoj Samel
Hello, Slider version is 0.80, Hadoop is 2.6 with Kerberos. Slider-client.xml allows specification of the full path of the Hadoop conf using HADOOP_CONF_DIR. In our case, the full Hadoop configuration, including all HA configurations, is available in the HADOOP_CONF_DIR for hdfs-site, core-site and yarn-site.

Re: Slider AM fails to run when RM in HA setup fails over

2016-07-25 Thread Manoj Samel
Another observation (for whatever it is worth): if the Slider app is created and started when rm2 is active, then it seems to survive switches between rm2 and rm1 (and back). I.e. * rm2 is active * create and start slider application * fail over to rm1. Now the Slider AM keeps running * fail over to rm2 a

Slider-develop - Build # 837 - Aborted

2016-07-25 Thread Apache Jenkins Server
The Apache Jenkins build system has built Slider-develop (build #837) Status: Aborted Check console output at https://builds.apache.org/job/Slider-develop/837/ to view the results.

Slider AM fails to run when RM in HA setup fails over

2016-07-25 Thread Manoj Samel
Setup - Hadoop 2.6 with RM HA, Kerberos enabled - Slider 0.80 - In my slider-client.xml, I have added all RM HA properties, including the ones mentioned in http://markmail.org/message/wnhpp2zn6ixo65e3. Following is the issue * rm1 is active, rm2 is standby * deploy and start slider application,

[jira] [Comment Edited] (SLIDER-1153) Code issues - 14 null pointer deferences found

2016-07-25 Thread Gour Saha (JIRA)
[ https://issues.apache.org/jira/browse/SLIDER-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392340#comment-15392340 ] Gour Saha edited comment on SLIDER-1153 at 7/25/16 5:47 PM: [

[jira] [Commented] (SLIDER-1153) Code issues - 14 null pointer deferences found

2016-07-25 Thread Gour Saha (JIRA)
[ https://issues.apache.org/jira/browse/SLIDER-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392340#comment-15392340 ] Gour Saha commented on SLIDER-1153: --- [~jianhe] thank you for the .3 patch. It looks good

[jira] [Commented] (SLIDER-1153) Code issues - 14 null pointer deferences found

2016-07-25 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/SLIDER-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392343#comment-15392343 ] ASF subversion and git services commented on SLIDER-1153: - Commit

[jira] [Updated] (SLIDER-1153) Code issues - 14 null pointer deferences found

2016-07-25 Thread Jian He (JIRA)
[ https://issues.apache.org/jira/browse/SLIDER-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated SLIDER-1153: Attachment: SLIDER-1153.3.patch [~gsaha], I uploaded the v3 patch, which addresses the last set of issues. > Code i