> On March 15, 2017, 5:21 p.m., Nate Cole wrote:
> > ambari-server/src/main/resources/stacks/HDP/2.5/upgrades/upgrade-2.6.xml
> > Lines 219-222 (original), 219-222 (patched)
> > <https://reviews.apache.org/r/57604/diff/1/?file=1663928#file1663928line219>
> >
> > I thought the class was in 2.5 such that this wouldn't be the same
> > issue. Did your tests show otherwise?

Missing class was added at 2.6


- Dmitro


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57604/#review169013
-----------------------------------------------------------


On March 14, 2017, 7:09 p.m., Dmitro Lisnichenko wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57604/
> -----------------------------------------------------------
>
> (Updated March 14, 2017, 7:09 p.m.)
>
>
> Review request for Ambari, Jonathan Hurley, Nate Cole, and Vinod Kumar
> Vavilapalli.
>
>
> Bugs: AMBARI-20447
>     https://issues.apache.org/jira/browse/AMBARI-20447
>
>
> Repository: ambari
>
>
> Description
> -------
>
> The YARN service check fails during a rolling upgrade from HDP-2.4 to
> HDP-2.6 (with YARN HA turned on) as follows:
> # After the "core master restart" step, the YARN client uses the new
> (HDP-2.6) config and fails with "Class
> org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not
> found". Forcing the YARN client to use the old (HDP-2.4) config until the
> client binary is updated helps here.
> # After the "core slave restart" step, using the old YARN client config with
> the old YARN client binary does not help. The NM/RM classpath points to
> HDP-2.6. The app job gets scheduled, but then fails with this log:
>
> {code}
> 17/03/06 16:39:27 INFO service.AbstractService: Service org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl failed in state STARTED; cause: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
> java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2240)
>     at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:160)
>     at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:93)
>     at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72)
>     at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.serviceStart(AMRMClientImpl.java:186)
>     at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>     at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl.serviceStart(AMRMClientAsyncImpl.java:96)
>     at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>     at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:559)
>     at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:299)
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2208)
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2232)
>     ... 9 more
> Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
>     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2114)
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2206)
>     ... 10 more
> 17/03/06 16:39:27 INFO service.AbstractService: Service org.apache.hadoop.yarn.client.api.async.AMRMClientAsync failed in state STARTED; cause: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
> java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2240)
>     at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:160)
>     at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:93)
>     at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72)
>     at
> {code}
> # After the YARN client is updated to the new binary, the service check works
> fine.
> ----
>
> Bottom line, this is a known problem with DistributedShell - it was never
> fixed to stop relying on the cluster's configuration. This means that client
> configuration changes like this one can break DistributedShell apps across
> upgrades. Unfortunately, nothing we do now can fix this broken upgrade for
> DistributedShell, because a proper fix would require going back in time and
> changing the older release.
>
> We have to do two things:
> # Disable the DistributedShell-based service check when we go from 2.4 to
> 2.6. The RequestHedgingRMFailoverProxyProvider is added in 2.5, so 2.5 to
> 2.6 is fine.
> # Also fix yarn-site.xml starting with 2.6 with the following change to avoid
> this in the future. The change replaces $HADOOP_CONF_DIR, which is inherited
> from the NodeManager, with /etc/hadoop/conf/, which is always tied to the
> client version.
> {code}
> <property>
>   <name>yarn.application.classpath</name>
>   <value>/etc/hadoop/conf/,/usr/hdp/current/hadoop-client/*,/usr/hdp/current/hadoop-client/lib/*,/usr/hdp/current/hadoop-hdfs-client/*,/usr/hdp/current/hadoop-hdfs-client/lib/*,/usr/hdp/current/hadoop-yarn-client/*,/usr/hdp/current/hadoop-yarn-client/lib/*</value>
> </property>
> {code}
>
>
> Diffs
> -----
>
>   ambari-server/src/main/resources/stacks/HDP/2.3/upgrades/upgrade-2.6.xml c27b634efd
>   ambari-server/src/main/resources/stacks/HDP/2.4/upgrades/upgrade-2.6.xml dc92c2b46f
>   ambari-server/src/main/resources/stacks/HDP/2.5/upgrades/upgrade-2.6.xml ab6b2398b6
>   ambari-server/src/main/resources/stacks/HDP/2.6/services/YARN/configuration/yarn-site.xml 4b97148278
>
>
> Diff: https://reviews.apache.org/r/57604/diff/1/
>
>
> Testing
> -------
>
> Checked that the 2.4->2.6 upgrade passes.
>
> My first thought was that there is no need to skip the YARN service check
> after the slave restart (since the YARN 2.6 configuration is expected to be
> correct). But that is not the case, so I excluded the YARN service check on
> this step.
>
> mvn clean test
>
>
> Thanks,
>
> Dmitro Lisnichenko
>
>
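
For anyone who wants to sanity-check the yarn-site.xml change quoted above, here is a minimal sketch that is not part of the patch. It parses the proposed property and reports any classpath entry that still depends on an environment variable such as $HADOOP_CONF_DIR. The helper names `classpath_entries` and `env_dependent_entries` are hypothetical, and the `<configuration>` wrapper is only assumed here so that the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

# The property from the review request, wrapped in a <configuration> root
# (an assumption made here so the fragment is a well-formed document).
YARN_SITE = """<configuration>
  <property>
    <name>yarn.application.classpath</name>
    <value>/etc/hadoop/conf/,/usr/hdp/current/hadoop-client/*,/usr/hdp/current/hadoop-client/lib/*,/usr/hdp/current/hadoop-hdfs-client/*,/usr/hdp/current/hadoop-hdfs-client/lib/*,/usr/hdp/current/hadoop-yarn-client/*,/usr/hdp/current/hadoop-yarn-client/lib/*</value>
  </property>
</configuration>"""


def classpath_entries(xml_text, prop="yarn.application.classpath"):
    """Return the comma-separated entries of `prop`, or [] if it is absent."""
    root = ET.fromstring(xml_text)
    for node in root.iter("property"):
        if node.findtext("name") == prop:
            value = node.findtext("value") or ""
            return [e.strip() for e in value.split(",") if e.strip()]
    return []


def env_dependent_entries(entries):
    """Return entries that reference an environment variable (contain '$')."""
    return [e for e in entries if "$" in e]


entries = classpath_entries(YARN_SITE)
print("entries:", len(entries))
print("env-dependent:", env_dependent_entries(entries))
```

An empty "env-dependent" list is what the change is aiming for: every entry is a fixed path, so nothing is inherited from the NodeManager's environment.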