I thought that I had, but perhaps not. I assume that I would configure this on the master, using the dfs.namenode.backup.address parameter. Is that correct? Thanks, Gabe
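[For reference: in the 0.20/1.x line, the setting that controls the address the 2NN binds to and reports back to the NN is dfs.secondary.http.address (default 0.0.0.0:50090), set in hdfs-site.xml on the 2NN host; dfs.namenode.backup.address configures the separate BackupNode, not the SecondaryNameNode. A minimal sketch, assuming the 2NN is the 10.178.224.109 host that the "Roll Edit Log from" lines below show:

```xml
<!-- hdfs-site.xml on the 2NN host (sketch; substitute your 2NN's real,
     NN-routable address). Leaving this at the 0.0.0.0 default is what
     produces machine=0.0.0.0 in the checkpoint callback URL. -->
<property>
  <name>dfs.secondary.http.address</name>
  <value>10.178.224.109:50090</value>
</property>
```

After changing it, restart the 2NN and watch for the next doCheckpoint cycle to complete.]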
On Feb 1, 2012, at 4:27 PM, Jakob Homan wrote:

>> Posted URL
>> master:50070putimage=1&port=50090&machine=0.0.0.0&token=-31:1318804155:0:1328129935000:1328129628242
>
> Have you defined your secondary namenode address? The 2NN is telling
> the NN to pull the merged image from http://0.0.0.0:50090.
>
> On Wed, Feb 1, 2012 at 1:23 PM, Gabriel Rosendorf
> <grosend...@e3smartenergy.com> wrote:
>> No firewall.
>>
>> Here's hdfs-site.xml from 2NN: http://pastie.org/3298304
>> And from NN: http://pastie.org/3298309
>>
>> On Feb 1, 2012, at 4:18 PM, Harsh J wrote:
>>
>>> Have you ensured there is no firewall between the two hosts? Can you
>>> also pastebin your hdfs-site.xml?
>>>
>>> On Thu, Feb 2, 2012 at 2:45 AM, Gabriel Rosendorf
>>> <grosend...@e3smartenergy.com> wrote:
>>>> So I'm at a loss. Checkpointing is failing, and both hosts (NN and 2NN)
>>>> are reachable via HTTP.
>>>> Any ideas would be greatly appreciated!
>>>>
>>>> From NameNode:
>>>>
>>>> 2012-02-01 21:02:53,460 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK* NameSystem.processReport: from 10.178.231.219:50010, blocks: 0, processing time: 0 msecs
>>>> 2012-02-01 21:03:56,581 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.178.224.109
>>>> 2012-02-01 21:03:56,685 WARN org.mortbay.log: /getimage: java.io.IOException: GetImage failed. java.net.ConnectException: Connection refused
>>>> at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>> at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
>>>> at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
>>>> at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
>>>> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>>>> at java.net.Socket.connect(Socket.java:529)
>>>> at java.net.Socket.connect(Socket.java:478)
>>>> at sun.net.NetworkClient.doConnect(NetworkClient.java:163)
>>>> at sun.net.www.http.HttpClient.openServer(HttpClient.java:394)
>>>> at sun.net.www.http.HttpClient.openServer(HttpClient.java:529)
>>>> at sun.net.www.http.HttpClient.<init>(HttpClient.java:233)
>>>> at sun.net.www.http.HttpClient.New(HttpClient.java:306)
>>>> at sun.net.www.http.HttpClient.New(HttpClient.java:323)
>>>> at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:970)
>>>> at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:911)
>>>> at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:836)
>>>> at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1172)
>>>> at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.getFileClient(TransferFsImage.java:160)
>>>> at org.apache.hadoop.hdfs.server.namenode.GetImageServlet$1$1.run(GetImageServlet.java:88)
>>>> at org.apache.hadoop.hdfs.server.namenode.GetImageServlet$1$1.run(GetImageServlet.java:85)
>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>> at javax.security.auth.Subject.doAs(Subject.java:396)
>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>>>> at org.apache.hadoop.hdfs.server.namenode.GetImageServlet$1.run(GetImageServlet.java:85)
>>>> at org.apache.hadoop.hdfs.server.namenode.GetImageServlet$1.run(GetImageServlet.java:70)
>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>> at javax.security.auth.Subject.doAs(Subject.java:396)
>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>>>> at org.apache.hadoop.hdfs.server.namenode.GetImageServlet.doGet(GetImageServlet.java:70)
>>>> at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
>>>> at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>>>> at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
>>>> at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
>>>> at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:816)
>>>> at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>>>> at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>>>> at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>>>> at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>>>> at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>>>> at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
>>>> at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>>>> at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>>>> at org.mortbay.jetty.Server.handle(Server.java:326)
>>>> at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
>>>> at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
>>>> at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
>>>> at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
>>>> at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
>>>> at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
>>>> at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
>>>>
>>>> 2012-02-01 21:08:56,691 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.178.224.109
>>>> 2012-02-01 21:08:56,804 WARN org.mortbay.log: /getimage: java.io.IOException: GetImage failed. java.net.ConnectException: Connection refused
>>>> at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>> at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
>>>> at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
>>>> at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
>>>> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>>>> at java.net.Socket.connect(Socket.java:529)
>>>> at java.net.Socket.connect(Socket.java:478)
>>>> at sun.net.NetworkClient.doConnect(NetworkClient.java:163)
>>>> at sun.net.www.http.HttpClient.openServer(HttpClient.java:394)
>>>> at sun.net.www.http.HttpClient.openServer(HttpClient.java:529)
>>>> at sun.net.www.http.HttpClient.<init>(HttpClient.java:233)
>>>> at sun.net.www.http.HttpClient.New(HttpClient.java:306)
>>>> at sun.net.www.http.HttpClient.New(HttpClient.java:323)
>>>> at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:970)
>>>> at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:911)
>>>> at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:836)
>>>> at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1172)
>>>> at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.getFileClient(TransferFsImage.java:160)
>>>> at org.apache.hadoop.hdfs.server.namenode.GetImageServlet$1$1.run(GetImageServlet.java:88)
>>>> at org.apache.hadoop.hdfs.server.namenode.GetImageServlet$1$1.run(GetImageServlet.java:85)
>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>> at javax.security.auth.Subject.doAs(Subject.java:396)
>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>>>> at org.apache.hadoop.hdfs.server.namenode.GetImageServlet$1.run(GetImageServlet.java:85)
>>>> at org.apache.hadoop.hdfs.server.namenode.GetImageServlet$1.run(GetImageServlet.java:70)
>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>> at javax.security.auth.Subject.doAs(Subject.java:396)
>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>>>> at org.apache.hadoop.hdfs.server.namenode.GetImageServlet.doGet(GetImageServlet.java:70)
>>>> at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
>>>> at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>>>> at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
>>>> at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
>>>> at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:816)
>>>> at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>>>> at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>>>> at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>>>> at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>>>> at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>>>> at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
>>>> at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>>>> at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>>>> at org.mortbay.jetty.Server.handle(Server.java:326)
>>>> at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
>>>> at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
>>>> at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
>>>> at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
>>>> at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
>>>> at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
>>>> at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
>>>>
>>>> From 2NN:
>>>>
>>>> 2012-02-01 21:03:58,634 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0Number of transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0
>>>> 2012-02-01 21:03:58,652 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Downloaded file fsimage size 112 bytes.
>>>> 2012-02-01 21:03:58,654 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Downloaded file edits size 4 bytes.
>>>> 2012-02-01 21:03:58,655 INFO org.apache.hadoop.hdfs.util.GSet: VM type = 64-bit
>>>> 2012-02-01 21:03:58,655 INFO org.apache.hadoop.hdfs.util.GSet: 2% max memory = 17.77875 MB
>>>> 2012-02-01 21:03:58,655 INFO org.apache.hadoop.hdfs.util.GSet: capacity = 2^21 = 2097152 entries
>>>> 2012-02-01 21:03:58,655 INFO org.apache.hadoop.hdfs.util.GSet: recommended=2097152, actual=2097152
>>>> 2012-02-01 21:03:58,658 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hduser
>>>> 2012-02-01 21:03:58,658 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
>>>> 2012-02-01 21:03:58,658 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
>>>> 2012-02-01 21:03:58,658 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.block.invalidate.limit=100
>>>> 2012-02-01 21:03:58,658 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
>>>> 2012-02-01 21:03:58,658 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
>>>> 2012-02-01 21:03:58,700 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 1
>>>> 2012-02-01 21:03:58,700 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 0
>>>> 2012-02-01 21:03:58,718 INFO org.apache.hadoop.hdfs.server.common.Storage: Edits file /app/hadoop/tmp/dfs/namesecondary/current/edits of size 4 edits # 0 loaded in 0 seconds.
>>>> 2012-02-01 21:03:58,719 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0Number of transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0
>>>> 2012-02-01 21:03:58,723 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 112 saved in 0 seconds.
>>>> 2012-02-01 21:03:58,733 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 112 saved in 0 seconds.
>>>> 2012-02-01 21:03:58,739 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Posted URL master:50070putimage=1&port=50090&machine=0.0.0.0&token=-31:1318804155:0:1328129935000:1328129628242
>>>> 2012-02-01 21:03:58,749 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint:
>>>> 2012-02-01 21:03:58,749 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.io.FileNotFoundException: http://master:50070/getimage?putimage=1&port=50090&machine=0.0.0.0&token=-31:1318804155:0:1328129935000:1328129628242
>>>> at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1434)
>>>> at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.getFileClient(TransferFsImage.java:160)
>>>> at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.putFSImage(SecondaryNameNode.java:377)
>>>> at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:418)
>>>> at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:312)
>>>> at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:275)
>>>> at java.lang.Thread.run(Thread.java:662)
>>>>
>>>> 2012-02-01 21:08:58,754 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0Number of transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0
>>>> 2012-02-01 21:08:58,765 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Downloaded file fsimage size 112 bytes.
>>>> 2012-02-01 21:08:58,772 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Downloaded file edits size 4 bytes.
>>>> 2012-02-01 21:08:58,772 INFO org.apache.hadoop.hdfs.util.GSet: VM type = 64-bit
>>>> 2012-02-01 21:08:58,772 INFO org.apache.hadoop.hdfs.util.GSet: 2% max memory = 17.77875 MB
>>>> 2012-02-01 21:08:58,772 INFO org.apache.hadoop.hdfs.util.GSet: capacity = 2^21 = 2097152 entries
>>>> 2012-02-01 21:08:58,772 INFO org.apache.hadoop.hdfs.util.GSet: recommended=2097152, actual=2097152
>>>> 2012-02-01 21:08:58,775 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hduser
>>>> 2012-02-01 21:08:58,775 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
>>>> 2012-02-01 21:08:58,775 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
>>>> 2012-02-01 21:08:58,776 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.block.invalidate.limit=100
>>>> 2012-02-01 21:08:58,776 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
>>>> 2012-02-01 21:08:58,776 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
>>>> 2012-02-01 21:08:58,776 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 1
>>>> 2012-02-01 21:08:58,777 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 0
>>>> 2012-02-01 21:08:58,777 INFO org.apache.hadoop.hdfs.server.common.Storage: Edits file /app/hadoop/tmp/dfs/namesecondary/current/edits of size 4 edits # 0 loaded in 0 seconds.
>>>> 2012-02-01 21:08:58,777 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0Number of transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0
>>>> 2012-02-01 21:08:58,781 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 112 saved in 0 seconds.
>>>> 2012-02-01 21:08:58,789 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 112 saved in 0 seconds.
>>>> 2012-02-01 21:08:58,838 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Posted URL master:50070putimage=1&port=50090&machine=0.0.0.0&token=-31:1318804155:0:1328129935000:1328129628242
>>>> 2012-02-01 21:08:58,872 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint:
>>>> 2012-02-01 21:08:58,872 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.io.FileNotFoundException: http://master:50070/getimage?putimage=1&port=50090&machine=0.0.0.0&token=-31:1318804155:0:1328129935000:1328129628242
>>>> at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1434)
>>>> at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.getFileClient(TransferFsImage.java:160)
>>>> at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.putFSImage(SecondaryNameNode.java:377)
>>>> at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:418)
>>>> at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:312)
>>>> at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:275)
>>>> at java.lang.Thread.run(Thread.java:662)
>>>
>>> --
>>> Harsh J
>>> Customer Ops. Engineer
>>> Cloudera | http://tiny.cloudera.com/about
>>
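[To make the failure mode in the logs above concrete: the "Posted URL ... machine=0.0.0.0" lines show the 2NN handing the NN its *configured* HTTP host as the machine parameter, so the NN tries to connect back to the wildcard address and gets "Connection refused". A small Python sketch of the mechanics (illustrative only, not the actual Hadoop source; build_putimage_url is a made-up name):

```python
# Sketch of how the 2NN's checkpoint callback URL is assembled, and why
# leaving dfs.secondary.http.address at its 0.0.0.0 default breaks the
# image pull-back. Function and variable names here are illustrative.

def build_putimage_url(nn_http_address, secondary_http_address, token):
    """Build the URL the 2NN posts to the NN after merging fsimage+edits.

    The 'machine' parameter comes from the 2NN's configured HTTP address,
    not from the socket it actually used to reach the NN -- so a wildcard
    bind address gets reported verbatim.
    """
    host, port = secondary_http_address.split(":")
    return (f"http://{nn_http_address}/getimage"
            f"?putimage=1&port={port}&machine={host}&token={token}")

# Default config: the NN is told to fetch the image from 0.0.0.0 and fails.
url = build_putimage_url("master:50070", "0.0.0.0:50090", "-31:1318804155")

# With a routable 2NN address configured, the NN can connect back.
fixed = build_putimage_url("master:50070", "10.178.224.109:50090", "-31:1318804155")
```

Setting the 2NN's HTTP address to a host the NN can route to changes the machine parameter, which is the whole fix.]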