I figured out the solution. I manually restarted the active and standby NameNodes with this command:

/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh start namenode -rollingUpgrade started
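For anyone else hitting this, the rough sequence on each NameNode host was something like the sketch below. This is just what worked for me: run it as the hdfs service user, the stop step is only needed if a stale NameNode process is still hanging around, and the paths may differ for your HDP version.

    # stop any half-started NameNode, then start it with the rolling-upgrade flag
    /usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh stop namenode
    /usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh start namenode -rollingUpgrade started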
Once both NameNodes showed as started in the Ambari UI, I ran the finalize and it completed successfully.

On Thu, May 19, 2016 at 9:15 PM, Anandha L Ranganathan <[email protected]> wrote:

> Hi Jonathan,
>
> I tried to run it using the Ambari UI, but it failed with the exception
> below. Is there a way I can restart it manually from the command line?
>
> Our cluster is enabled with NameNode HA.
>
> 2016-05-20 04:01:36,027 - Execute['/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://dfs-nameservices -rollingUpgrade query'] {'logoutput': True, 'user': 'hdfs'}
> QUERY rolling upgrade ...
> 16/05/20 04:01:38 INFO retry.RetryInvocationHandler: Exception while invoking rollingUpgrade of class ClientNamenodeProtocolTranslatorPB over usw2dxdpma02.local/172.17.212.157:8020 after 1 fail over attempts. Trying to fail over after sleeping for 568ms.
> java.net.ConnectException: Call From usw2dxdpma02.glassdoor.local/172.17.212.157 to usw2dxdpma02.local:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>     at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1431)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1358)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>     at com.sun.proxy.$Proxy9.rollingUpgrade(Unknown Source)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.rollingUpgrade(ClientNamenodeProtocolTranslatorPB.java:728)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:497)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>     at com.sun.proxy.$Proxy10.rollingUpgrade(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient.rollingUpgrade(DFSClient.java:2956)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.rollingUpgrade(DistributedFileSystem.java:1287)
>     at org.apache.hadoop.hdfs.tools.DFSAdmin$RollingUpgradeCommand.run(DFSAdmin.java:373)
>     at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:1815)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>     at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:1973)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:612)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:710)
>     at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:373)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1493)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1397)
>     ... 18 more
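Note for the archives: the Connection refused above just means nothing was listening on the NameNode RPC port (8020 in the trace); the query command itself was fine, the NameNode was simply down. A quick manual check before re-running it, assuming the hdfs service user and the same nameservice as in the log:

    # confirm something is listening on the NameNode RPC port
    ss -ltn | grep 8020
    # then re-run the same query Ambari was attempting
    sudo -u hdfs hdfs dfsadmin -fs hdfs://dfs-nameservices -rollingUpgrade query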
> On Thu, May 19, 2016 at 5:08 AM, Jonathan Hurley <[email protected]> wrote:
>
>> You're hitting an instance of https://issues.apache.org/jira/browse/AMBARI-15482
>>
>> I don't know of a way around this aside from:
>> - Finalizing the upgrade
>> - Starting NameNode manually from the command prompt
>>
>> It's probably best to just finalize the upgrade and start NameNode from
>> the web client after finalization.
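For reference, the finalize that Jonathan's first option refers to is a single dfsadmin call from the command line. A minimal sketch, assuming the hdfs service user and the same dfs-nameservices nameservice as above:

    # finalize the rolling upgrade once both NameNodes are up
    sudo -u hdfs hdfs dfsadmin -fs hdfs://dfs-nameservices -rollingUpgrade finalize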
>> On May 18, 2016, at 10:02 PM, Anandha L Ranganathan <[email protected]> wrote:
>>
>> I am running a rolling upgrade in the dev cluster. It completed 100% but
>> is not yet finalized. I was testing in the dev cluster and validating
>> that everything was working fine. I was able to run Hive queries using
>> the HS2 server.
>>
>> I don't remember the reason, but I restarted all the NameNode services
>> through the Ambari UI and started getting the error below. It says to
>> restart with the "-upgrade" option. I thought the rolling upgrade would
>> take care of it. Please help me understand how to handle this. What
>> steps should I take?
>>
>> 2016-05-19 01:42:38,561 INFO util.GSet (LightWeightGSet.java:computeCapacity(356)) - 0.029999999329447746% max memory 1011.3 MB = 310.7 KB
>> 2016-05-19 01:42:38,561 INFO util.GSet (LightWeightGSet.java:computeCapacity(361)) - capacity = 2^15 = 32768 entries
>> 2016-05-19 01:42:38,579 INFO common.Storage (Storage.java:tryLock(715)) - Lock on /mnt/data/hadoop/hdfs/namenode/in_use.lock acquired by nodename [email protected]
>> 2016-05-19 01:42:38,651 WARN namenode.FSNamesystem (FSNamesystem.java:loadFromDisk(690)) - Encountered exception loading fsimage
>> java.io.IOException: File system image contains an old layout version -60. An upgrade to version -63 is required.
>> Please restart NameNode with the "-rollingUpgrade started" option if a rolling upgrade is already started; or restart NameNode with the "-upgrade" option to start a new upgrade.
>>     at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:245)
>>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
>>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
>>     at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
>>     at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:722)
>>     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
>>     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
>>     at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
>>     at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)
>> 2016-05-19 01:42:38,661 INFO mortbay.log (Slf4jLog.java:info(67)) - Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@usw2dxdpma01.glassdoor.local:50070
>> 2016-05-19 01:42:38,663 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(211)) - Stopping NameNode metrics system...
>> 2016-05-19 01:42:38,664 INFO impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thread interrupted.
>> 2016-05-19 01:42:38,664 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(217)) - NameNode metrics system stopped.
>> 2016-05-19 01:42:38,664 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(607)) - NameNode metrics system shutdown complete.
>> 2016-05-19 01:42:38,665 ERROR namenode.NameNode (NameNode.java:main(1712)) - Failed to start namenode.
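One last note: the layout versions in that IOException come from the NameNode's on-disk metadata. If you want to confirm what version your fsimage is actually at, the storage directory from the lock message above should show it, assuming the standard current/VERSION layout:

    # a value of -60 here matches the "old layout version" in the error
    grep layoutVersion /mnt/data/hadoop/hdfs/namenode/current/VERSION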
