[ https://issues.apache.org/jira/browse/SLIDER-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sandeep Nemuri updated SLIDER-1169:
-----------------------------------
    Description:
*PROBLEM*:
The customer created a Slider app, passing the ZooKeeper quorum with the command below:
{code}
slider create test --template appConfig.json --resources resources.json --zkhosts sandy234new1.hwxblr.com:2181,sandy234new3.hwxblr.com:2181,sandy234new2.hwxblr.com:2181
{code}
The application log below shows that only the first ZooKeeper host is picked up.
{code}
2016-09-15 15:44:29,052 [main] INFO appmaster.SliderAppMaster - Loading slider-server.xml at file:/hadoop/yarn/local/usercache/root/appcache/application_1473930641993_0005/container_e04_1473930641993_0005_01_000001/confdir/slider-server.xml
2016-09-15 15:44:29,077 [main] INFO appmaster.SliderAppMaster - AM configuration:
hadoop.registry.zk.quorum=sandy234new1.hwxblr.com:2181
hadoop.registry.zk.root=/registry
yarn.resourcemanager.scheduler.address=0.0.0.0:8030
{code}
*BUSINESS IMPACT*: Slider throws exceptions when the first ZooKeeper host goes down (since it only picks the first host), and this is impacting the AM.

*STEPS TO REPRODUCE*:
Launch an HBase app using steps 1 & 2.
1) slider create test --template appConfig.json --resources resources.json --zkhosts sandy234new1.hwxblr.com:2181,sandy234new3.hwxblr.com:2181,sandy234new2.hwxblr.com:2181
This will launch an application in the RM.
From the RM UI --> application -> logs, the first lines will be as below:
{code}
2016-09-15 15:44:29,052 [main] INFO appmaster.SliderAppMaster - Loading slider-server.xml at file:/hadoop/yarn/local/usercache/root/appcache/application_1473930641993_0005/container_e04_1473930641993_0005_01_000001/confdir/slider-server.xml
2016-09-15 15:44:29,077 [main] INFO appmaster.SliderAppMaster - AM configuration:
hadoop.registry.zk.quorum=sandy234new1.hwxblr.com:2181
hadoop.registry.zk.root=/registry
yarn.resourcemanager.scheduler.address=0.0.0.0:8030
{code}

  was:
*PROBLEM*:
The customer created a Slider app, passing the ZooKeeper quorum with the command below:
{code}
slider create test --template appConfig.json --resources resources.json --zkhosts sandy234new1.hwxblr.com:2181,sandy234new3.hwxblr.com:2181,sandy234new2.hwxblr.com:2181
{code}
The application log below shows that only the first ZooKeeper host is picked up.
{code}
2016-09-15 15:44:29,052 [main] INFO appmaster.SliderAppMaster - Loading slider-server.xml at file:/hadoop/yarn/local/usercache/root/appcache/application_1473930641993_0005/container_e04_1473930641993_0005_01_000001/confdir/slider-server.xml
2016-09-15 15:44:29,077 [main] INFO appmaster.SliderAppMaster - AM configuration:
hadoop.registry.zk.quorum=sandy234new1.hwxblr.com:2181
hadoop.registry.zk.root=/registry
yarn.resourcemanager.scheduler.address=0.0.0.0:8030
{code}
*BUSINESS IMPACT*: Slider throws exceptions when the first ZooKeeper host goes down (since it only picks the first host), and this is impacting the customer's production.

*STEPS TO REPRODUCE*:
Launch an HBase app using steps 1 & 2.
1) slider create test --template appConfig.json --resources resources.json --zkhosts sandy234new1.hwxblr.com:2181,sandy234new3.hwxblr.com:2181,sandy234new2.hwxblr.com:2181
This will launch an application in the RM.
From the RM UI --> application -> logs, the first lines will be as below:
{code}
2016-09-15 15:44:29,052 [main] INFO appmaster.SliderAppMaster - Loading slider-server.xml at file:/hadoop/yarn/local/usercache/root/appcache/application_1473930641993_0005/container_e04_1473930641993_0005_01_000001/confdir/slider-server.xml
2016-09-15 15:44:29,077 [main] INFO appmaster.SliderAppMaster - AM configuration:
hadoop.registry.zk.quorum=sandy234new1.hwxblr.com:2181
hadoop.registry.zk.root=/registry
yarn.resourcemanager.scheduler.address=0.0.0.0:8030
{code}

> Slider not honoring zookeeper quorum values passed
> --------------------------------------------------
>
>                 Key: SLIDER-1169
>                 URL: https://issues.apache.org/jira/browse/SLIDER-1169
>             Project: Slider
>          Issue Type: Bug
>          Components: appmaster
>    Affects Versions: Slider 0.91
>         Environment: RHEL-6 (64 Bit)
>            Reporter: Sandeep Nemuri
>            Priority: Critical
>
> *PROBLEM*:
> The customer created a Slider app, passing the ZooKeeper quorum with the command below:
> {code}
> slider create test --template appConfig.json --resources resources.json --zkhosts sandy234new1.hwxblr.com:2181,sandy234new3.hwxblr.com:2181,sandy234new2.hwxblr.com:2181
> {code}
> The application log below shows that only the first ZooKeeper host is picked up.
> {code}
> 2016-09-15 15:44:29,052 [main] INFO appmaster.SliderAppMaster - Loading slider-server.xml at file:/hadoop/yarn/local/usercache/root/appcache/application_1473930641993_0005/container_e04_1473930641993_0005_01_000001/confdir/slider-server.xml
> 2016-09-15 15:44:29,077 [main] INFO appmaster.SliderAppMaster - AM configuration:
> hadoop.registry.zk.quorum=sandy234new1.hwxblr.com:2181
> hadoop.registry.zk.root=/registry
> yarn.resourcemanager.scheduler.address=0.0.0.0:8030
> {code}
> *BUSINESS IMPACT*: Slider throws exceptions when the first ZooKeeper host goes down (since it only picks the first host), and this is impacting the AM.
>
> *STEPS TO REPRODUCE*:
> Launch an HBase app using steps 1 & 2.
> 1) slider create test --template appConfig.json --resources resources.json --zkhosts sandy234new1.hwxblr.com:2181,sandy234new3.hwxblr.com:2181,sandy234new2.hwxblr.com:2181
> This will launch an application in the RM.
> From the RM UI --> application -> logs, the first lines will be as below:
> {code}
> 2016-09-15 15:44:29,052 [main] INFO appmaster.SliderAppMaster - Loading slider-server.xml at file:/hadoop/yarn/local/usercache/root/appcache/application_1473930641993_0005/container_e04_1473930641993_0005_01_000001/confdir/slider-server.xml
> 2016-09-15 15:44:29,077 [main] INFO appmaster.SliderAppMaster - AM configuration:
> hadoop.registry.zk.quorum=sandy234new1.hwxblr.com:2181
> hadoop.registry.zk.root=/registry
> yarn.resourcemanager.scheduler.address=0.0.0.0:8030
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
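
The check described in the steps to reproduce (comparing the hosts passed via --zkhosts with the hadoop.registry.zk.quorum value printed in the AM log) could be scripted roughly as below. This is only a sketch, not part of Slider: the function names quorum_from_log and missing_hosts are hypothetical, and it assumes the quorum is logged as a single key=value token, as in the log excerpt in this report.

```python
import re

# Key under which the AM log in this report prints the registry quorum.
QUORUM_KEY = "hadoop.registry.zk.quorum"


def quorum_from_log(log_text):
    """Return the host:port entries logged for hadoop.registry.zk.quorum.

    Hypothetical helper: assumes the value is one whitespace-free token.
    """
    match = re.search(re.escape(QUORUM_KEY) + r"=(\S+)", log_text)
    if match is None:
        return []
    return [entry for entry in match.group(1).split(",") if entry]


def missing_hosts(zkhosts_arg, log_text):
    """List entries passed via --zkhosts that the AM log does not show."""
    logged = set(quorum_from_log(log_text))
    return [h for h in zkhosts_arg.split(",") if h and h not in logged]


if __name__ == "__main__":
    # Values copied from the command and log excerpt in this report.
    zkhosts = ("sandy234new1.hwxblr.com:2181,"
               "sandy234new3.hwxblr.com:2181,"
               "sandy234new2.hwxblr.com:2181")
    am_log = (
        "AM configuration:\n"
        "hadoop.registry.zk.quorum=sandy234new1.hwxblr.com:2181\n"
        "hadoop.registry.zk.root=/registry\n"
    )
    # Prints the two hosts that were passed but not picked up.
    print(missing_hosts(zkhosts, am_log))
```

With the values from this report, the script lists sandy234new3 and sandy234new2 as missing; an empty list would indicate that the full quorum reached the AM configuration.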