[jira] [Commented] (HDFS-2994) If lease is recovered successfully inline with create, create can fail
    [ https://issues.apache.org/jira/browse/HDFS-2994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257533#comment-13257533 ]

Brahma Reddy Battula commented on HDFS-2994:

In both scenarios, after the append, make the write fail (by renaming the file or restarting the DN) and then close the client, as in the following. Then append again.

{code}
// initHDFS, writeFile and appendFile are test helpers; hdfsFile and out are declared elsewhere.
DistributedFileSystem dfs = initHDFS();
try {
  writeFile(dfs, hdfsFile, out, 1, true);
  out = appendFile(dfs, hdfsFile);
  // This write is the one made to fail (file renamed or DN restarted meanwhile).
  writeFile(dfs, hdfsFile, out, 1, true);
} catch (Exception e) {
  // TODO: handle exception
  e.printStackTrace();
} finally {
  if (dfs != null) {
    dfs.close();
  }
}
{code}

If lease is recovered successfully inline with create, create can fail
----------------------------------------------------------------------

                Key: HDFS-2994
                URL: https://issues.apache.org/jira/browse/HDFS-2994
            Project: Hadoop HDFS
         Issue Type: Bug
   Affects Versions: 0.24.0
           Reporter: Todd Lipcon

I saw the following logs on my test cluster:

{code}
2012-02-22 14:35:22,887 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: startFile: recover lease [Lease. Holder: DFSClient_attempt_1329943893604_0007_m_000376_0_453973131_1, pendingcreates: 1], src=/benchmarks/TestDFSIO/io_data/test_io_6 from client DFSClient_attempt_1329943893604_0007_m_000376_0_453973131_1
2012-02-22 14:35:22,887 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering lease=[Lease. Holder: DFSClient_attempt_1329943893604_0007_m_000376_0_453973131_1, pendingcreates: 1], src=/benchmarks/TestDFSIO/io_data/test_io_6
2012-02-22 14:35:22,888 WARN org.apache.hadoop.hdfs.StateChange: BLOCK* internalReleaseLease: All existing blocks are COMPLETE, lease removed, file closed.
2012-02-22 14:35:22,888 WARN org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.replaceNode: failed to remove /benchmarks/TestDFSIO/io_data/test_io_6
2012-02-22 14:35:22,888 WARN org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile: FSDirectory.replaceNode: failed to remove /benchmarks/TestDFSIO/io_data/test_io_6
{code}

It seems that if {{recoverLeaseInternal}} succeeds in {{startFileInternal}}, the INode is replaced with a new one, so the later {{replaceNode}} call can fail.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
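The failure mode described in HDFS-2994 (a successful inline lease recovery swaps the INode, so a later replaceNode on the old reference fails) can be illustrated with a toy model. This is a minimal sketch, not HDFS code: `StaleINodeDemo`, `ToyINode`, `ToyDirectory` and their methods are hypothetical stand-ins, and it assumes that the replace step fails whenever the node it is asked to remove is no longer the one stored under the path.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: none of these types are real HDFS classes. */
final class StaleINodeDemo {

    /** Hypothetical stand-in for an INode; only object identity matters here. */
    static final class ToyINode {
        final String state;
        ToyINode(String state) { this.state = state; }
    }

    /** Hypothetical stand-in for the namespace: one node per path. */
    static final class ToyDirectory {
        private final Map<String, ToyINode> nodes = new HashMap<>();

        void put(String path, ToyINode node) { nodes.put(path, node); }

        ToyINode get(String path) { return nodes.get(path); }

        /** Swaps oldNode for newNode; returns false if oldNode is no longer stored at path. */
        boolean replaceNode(String path, ToyINode oldNode, ToyINode newNode) {
            if (nodes.get(path) != oldNode) {
                return false; // stale reference, the "failed to remove" case
            }
            nodes.put(path, newNode);
            return true;
        }

        /** Stand-in for inline lease recovery: closes the file by swapping in a new node. */
        void recoverLease(String path) {
            replaceNode(path, nodes.get(path), new ToyINode("complete"));
        }
    }

    /** Returns whether the final replaceNode call succeeds. */
    static boolean createAfterRecovery(boolean refetch) {
        ToyDirectory dir = new ToyDirectory();
        String path = "/benchmarks/TestDFSIO/io_data/test_io_6";
        dir.put(path, new ToyINode("under-construction"));

        ToyINode seenBefore = dir.get(path); // reference taken before recovery
        dir.recoverLease(path);              // replaces the node stored under the path
        ToyINode old = refetch ? dir.get(path) : seenBefore;
        return dir.replaceNode(path, old, new ToyINode("under-construction"));
    }

    public static void main(String[] args) {
        System.out.println("stale reference: " + createAfterRecovery(false));
        System.out.println("re-fetched node: " + createAfterRecovery(true));
    }
}
```

In this model, looking the node up again after recovery makes the replace succeed. Whether that matches the eventual fix for this issue is not stated in the thread.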
[jira] [Commented] (HDFS-2994) If lease is recovered successfully inline with create, create can fail
    [ https://issues.apache.org/jira/browse/HDFS-2994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256488#comment-13256488 ]

Brahma Reddy Battula commented on HDFS-2994:

Hi, I am able to reproduce the same issue.

*Scenario 1: Using a debug point*

1. Write a file /home/a.txt.
2. Call append on /home/a.txt.
3. Put a debug point in DFSClient at leaserenewer.put(src, result, this);
4. When control reaches that point, rename the file to /home/rename.txt.
5. Now try again to append to the renamed file (/home/rename.txt).

I then get the same exception:

{noformat}
java.io.IOException: FSDirectory.replaceNode: failed to remove /home/rename.txt
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.replaceNode(FSDirectory.java:1119)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.prepareFileForWrite(FSNamesystem.java:1674)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1612)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1823)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:417)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:217)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42592)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:423)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:891)
{noformat}

*Scenario 2:*

step 1: Write a file /home/a.txt (size 2 MB).
step 2: Call append on /home/a.txt (size 1.5 MB).
Restart the DN multiple times while step 2 is in progress.

I then get the same exception:

{noformat}
java.io.IOException: FSDirectory.replaceNode: failed to remove /home/a.txt
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.replaceNode(FSDirectory.java:1119)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.prepareFileForWrite(FSNamesystem.java:1670)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1608)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1819)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:416)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:217)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42592)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:417)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:891)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1661)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1657)
    at java.security.AccessController.doPrivileged(Native Method)
{noformat}
[jira] [Commented] (HDFS-3123) BNN is getting Nullpointer execption and shuttingdown When NameNode got formatted
    [ https://issues.apache.org/jira/browse/HDFS-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13234246#comment-13234246 ]

Brahma Reddy Battula commented on HDFS-3123:

Hi Uma, I applied the patch from HDFS-2768 (as you referred), and I no longer get java.lang.IllegalArgumentException: not a proxy instance. However, I still get the NullPointerException. You can look at it as part of this issue.

BNN is getting Nullpointer execption and shuttingdown When NameNode got formatted
---------------------------------------------------------------------------------

                Key: HDFS-3123
                URL: https://issues.apache.org/jira/browse/HDFS-3123
            Project: Hadoop HDFS
         Issue Type: Bug
   Affects Versions: 0.24.0, 0.23.4
           Reporter: Brahma Reddy Battula
           Assignee: Uma Maheswara Rao G

Scenario 1
==========
Start NN and BNN.
Stop NN and BNN.
Format NN and start only BNN.
Then BNN gets a NullPointerException and shuts down.

{noformat}
12/03/20 21:26:05 ERROR ipc.RPC: Tried to call RPC.stopProxy on an object that is not a proxy.
java.lang.IllegalArgumentException: not a proxy instance
    at java.lang.reflect.Proxy.getInvocationHandler(Proxy.java:637)
    at org.apache.hadoop.ipc.RPC.stopProxy(RPC.java:591)
    at org.apache.hadoop.hdfs.server.namenode.BackupNode.stop(BackupNode.java:194)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:547)
    at org.apache.hadoop.hdfs.server.namenode.BackupNode.init(BackupNode.java:86)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:847)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:908)
12/03/20 21:26:05 ERROR ipc.RPC: Could not get invocation handler null for proxy class class org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB, or invocation handler is not closeable.
12/03/20 21:26:05 ERROR namenode.NameNode: Exception in namenode join
java.lang.NullPointerException
    at org.apache.hadoop.hdfs.server.namenode.NameNode.getFSImage(NameNode.java:609)
    at org.apache.hadoop.hdfs.server.namenode.BackupNode.stop(BackupNode.java:205)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:547)
    at org.apache.hadoop.hdfs.server.namenode.BackupNode.init(BackupNode.java:86)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:847)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:908)
12/03/20 21:26:05 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at HOST-10-18-40-233/10.18.40.233
************************************************************/
{noformat}
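In the HDFS-3123 trace, stop() is reached from init(), so it runs on a node that may be only partly initialized. The sketch below illustrates that general pattern and is not the BackupNode code: `NullSafeStopDemo`, `ToyNode`, `ToyNamesystem` and `ToyImage` are hypothetical names, and the idea that the namesystem field is still null when stop() runs is an assumption drawn from the trace, not something the thread confirms.

```java
/** Toy model only: none of these types are real HDFS classes. */
final class NullSafeStopDemo {

    static final class ToyImage {
        boolean closed;
        void close() { closed = true; }
    }

    static final class ToyNamesystem {
        private final ToyImage image = new ToyImage();
        ToyImage getImage() { return image; }
    }

    static final class ToyNode {
        ToyNamesystem namesystem; // assumed to stay null if init fails early

        /** Dereferences namesystem without a check; throws NPE when it is null. */
        ToyImage getFSImage() { return namesystem.getImage(); }

        /** Cleanup that assumes init finished. */
        void stopUnsafe() { getFSImage().close(); }

        /** Cleanup that tolerates a half-initialized node; returns true if an image was closed. */
        boolean stopSafe() {
            if (namesystem == null) {
                return false; // nothing was set up, so nothing to close
            }
            getFSImage().close();
            return true;
        }
    }

    /** Helper: reports whether running r throws a NullPointerException. */
    static boolean throwsNpe(Runnable r) {
        try {
            r.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ToyNode halfInitialized = new ToyNode();
        System.out.println("unsafe stop throws NPE: " + throwsNpe(halfInitialized::stopUnsafe));
        System.out.println("safe stop closed image: " + halfInitialized.stopSafe());
    }
}
```

A guard like this only avoids the secondary NullPointerException; the reason init failed in the first place would still need to be reported.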
[jira] [Commented] (HDFS-3108) [UI] Few Namenode links are not working
    [ https://issues.apache.org/jira/browse/HDFS-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233637#comment-13233637 ]

Brahma Reddy Battula commented on HDFS-3108:

Yes, Scenario 1 is the same as HDFS-2025.

[UI] Few Namenode links are not working
---------------------------------------

                Key: HDFS-3108
                URL: https://issues.apache.org/jira/browse/HDFS-3108
            Project: Hadoop HDFS
         Issue Type: Bug
         Components: name-node
   Affects Versions: 0.23.0, 0.23.1
           Reporter: Brahma Reddy Battula
           Priority: Minor
            Fix For: 0.23.3
        Attachments: Scenario2_Trace.txt

Scenario 1
==========
After tailing a file from the UI and clicking "Go Back to File View", I get HTTP ERROR 404.

Scenario 2
==========
I frequently get the following exception when I click on a link (BrowseFileSystem or any file):
java.lang.IllegalArgumentException: java.net.UnknownHostException: HOST-10-18-40-24
[jira] [Commented] (HDFS-3108) [UI] Few Namenode links are not working
    [ https://issues.apache.org/jira/browse/HDFS-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13232450#comment-13232450 ]

Brahma Reddy Battula commented on HDFS-3108:

Attached the exception trace for Scenario 2.
[jira] [Commented] (HDFS-2892) Some of property descriptions are not given(hdfs-default.xml)
    [ https://issues.apache.org/jira/browse/HDFS-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13232459#comment-13232459 ]

Brahma Reddy Battula commented on HDFS-2892:

Hi Uma, I went through HDFS-273. Thanks for your reply. I hope the remaining queries (the property descriptions and the commented-out property) will be addressed.

Some of property descriptions are not given(hdfs-default.xml)
-------------------------------------------------------------

                Key: HDFS-2892
                URL: https://issues.apache.org/jira/browse/HDFS-2892
            Project: Hadoop HDFS
         Issue Type: Bug
         Components: name-node
   Affects Versions: 0.23.0
           Reporter: Brahma Reddy Battula
           Priority: Trivial

Hi, I took the 0.23.0 release from http://hadoop.apache.org/common/releases.html#11+Nov%2C+2011%3A+release+0.23.0+available

I went through all the properties provided in hdfs-default.xml. Some of the property descriptions are missing. It would be better to give a description and usage (how to configure) for each property. Also, only the MapReduce-related jars are provided. Please check the following two configurations.

*No Description*

{noformat}
<property>
  <name>dfs.datanode.https.address</name>
  <value>0.0.0.0:50475</value>
</property>
<property>
  <name>dfs.namenode.https-address</name>
  <value>0.0.0.0:50470</value>
</property>
{noformat}

It would be better to mention example usage (what to configure, and the format or syntax) in the description. Here I did not understand what "default" means: is it the name of a network interface or something else?

  <property>
    <name>dfs.datanode.dns.interface</name>
    <value>default</value>
    <description>The name of the Network Interface from which a data node should report its IP address.</description>
  </property>

The following property is commented out. If it is not supported, it would be better to remove it.

  <property>
    <name>dfs.cluster.administrators</name>
    <value>ACL for the admins</value>
    <description>This configuration is used to control who can access the default servlets in the namenode, etc.</description>
  </property>

A small clarification on the following property: if some value is configured for it, the NN will be in safe mode for up to that much time. May I know the usage of this property?

  <property>
    <name>dfs.blockreport.initialDelay</name>
    <value>0</value>
    <description>Delay for first block report in seconds.</description>
  </property>
[jira] [Commented] (HDFS-2940) HA: NullPointerException while formatting NameNode(After Configuring HA)
    [ https://issues.apache.org/jira/browse/HDFS-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13206489#comment-13206489 ]

Brahma Reddy Battula commented on HDFS-2940:

Please correct me if I am wrong. I am not able to proceed further.

HA: NullPointerException while formatting NameNode(After Configuring HA)
------------------------------------------------------------------------

                Key: HDFS-2940
                URL: https://issues.apache.org/jira/browse/HDFS-2940
            Project: Hadoop HDFS
         Issue Type: Sub-task
         Components: ha, name-node
   Affects Versions: HA branch (HDFS-1623)
           Reporter: Brahma Reddy Battula
           Assignee: Uma Maheswara Rao G
           Priority: Minor
        Attachments: core-site.xml, hdfs-site.xml

Scenario:
=========
Step 1: I configured all HA-related configurations.
Step 2: I formatted the NN using the format command (./hdfs namenode -format).

Please check the following trace:

{noformat}
Formatting using clusterid: CID-26a49fe8-abed-4d80-b02d-eeceb86dbd53
12/02/12 20:57:46 ERROR namenode.NameNode: Exception in namenode join
java.lang.NullPointerException
    at org.apache.hadoop.net.NetUtils.isLocalAddress(NetUtils.java:640)
    at org.apache.hadoop.hdfs.DFSUtil$4.match(DFSUtil.java:115)
    at org.apache.hadoop.hdfs.DFSUtil.getSuffixIDs(DFSUtil.java:934)
    at org.apache.hadoop.hdfs.DFSUtil.getNameServiceId(DFSUtil.java:890)
    at org.apache.hadoop.hdfs.DFSUtil.getNamenodeNameServiceId(DFSUtil.java:850)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.init(FSImage.java:129)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:689)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:817)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:894)
12/02/12 20:57:46 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at HOST-10-18-40-20/10.18.40.20
{noformat}
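The HDFS-2940 trace ends in a locality check on an address. One plausible cause, offered here as an assumption and not confirmed in the thread, is a configured NameNode host that did not resolve: in the JDK, an unresolved `InetSocketAddress` returns null from `getAddress()`. The sketch below shows that JDK behavior with a hypothetical null-safe check; `UnresolvedAddressDemo`, its `isLocalAddress` method and the port 8020 are illustrative only and are not the Hadoop implementation.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

/** Illustrative only: this is not Hadoop's NetUtils. */
final class UnresolvedAddressDemo {

    /** Hypothetical null-safe "is this address local?" check. */
    static boolean isLocalAddress(InetAddress addr) {
        if (addr == null) {
            return false; // unresolved host: treat as not local instead of throwing NPE
        }
        return addr.isAnyLocalAddress() || addr.isLoopbackAddress();
    }

    public static void main(String[] args) {
        // createUnresolved never performs a DNS lookup, so getAddress() is null.
        InetSocketAddress unresolved =
                InetSocketAddress.createUnresolved("HOST-10-18-40-20", 8020);
        System.out.println("unresolved:  " + unresolved.isUnresolved());
        System.out.println("address:     " + unresolved.getAddress());
        System.out.println("local (null-safe): " + isLocalAddress(unresolved.getAddress()));
        System.out.println("loopback local:    " + isLocalAddress(InetAddress.getLoopbackAddress()));
    }
}
```

If this is indeed the cause, checking the host names in the attached core-site.xml and hdfs-site.xml against DNS or /etc/hosts would be a reasonable first step.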
[jira] [Commented] (HDFS-2906) NullPointerExeception in BlockReceiver on DataNode restart.
    [ https://issues.apache.org/jira/browse/HDFS-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13202168#comment-13202168 ]

Brahma Reddy Battula commented on HDFS-2906:

Hi Uma, thanks for the analysis. The same seems to be happening. I am always able to reproduce it by executing the mentioned scenario.

NullPointerExeception in BlockReceiver on DataNode restart.
-----------------------------------------------------------

                Key: HDFS-2906
                URL: https://issues.apache.org/jira/browse/HDFS-2906
            Project: Hadoop HDFS
         Issue Type: Bug
         Components: data-node
   Affects Versions: 0.23.0
           Reporter: Brahma Reddy Battula
           Assignee: Uma Maheswara Rao G

Scenario:
1) Start the cluster with one NN and two DNs.
2) Keep restarting one DN every 10 minutes.
3) Keep the clients writing data continuously.

Observed the trace below.

{noformat}
2012-02-07 01:03:45,897 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: HOST-10-18-40-23:50010:DataXceiver error processing WRITE_BLOCK operation src: /10.18.40.20:23862 dest: /10.18.40.23:50010
java.lang.NullPointerException
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:151)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:340)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:167)
    at java.lang.Thread.run(Thread.java:619)
2012-02-07 01:03:46,083 INFO org.apache.hadoop.hdfs.server.common.Storage: Locking is disabled
{noformat}
[jira] [Commented] (HDFS-2892) Some of property descriptions are not given(hdfs-default.xml)
    [ https://issues.apache.org/jira/browse/HDFS-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13200700#comment-13200700 ]

Brahma Reddy Battula commented on HDFS-2892:

Hi Uma, thanks for your reply and suggestions.

~This property is to send the block report when starting up the system. If you mention this delay more, then initial block report will take time (the delay you configured).~

Yes, I knew this property is for sending the block report when the DN is started. My query is this: if the initial delay is configured as, say, 5 minutes, the NameNode will be in safe mode for up to that time, because it won't get block reports from the DNs. If the DNs are restarted continuously, the NN will be in safe mode all the time. So my doubt is: why is this initial delay required? Please correct me if I am wrong here.

Since the descriptions of two properties are missing and one property is commented out, I raised this as a suggestion.
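On the question of why dfs.blockreport.initialDelay exists: to the best of my knowledge (this is not stated anywhere in the thread), the DataNode treats the configured value as an upper bound and waits a random time below it before its first block report, so that many DataNodes starting together do not all report to the NameNode at the same moment. The following is a minimal sketch of that idea under that assumption; `InitialDelayDemo` and `pickFirstReportDelay` are hypothetical names, not Hadoop APIs.

```java
import java.util.Random;

/** Illustrative only: a sketch of spreading first block reports over a window. */
final class InitialDelayDemo {

    /**
     * Picks a first-report delay in [0, initialDelaySeconds).
     * A configured value of 0 (the default shown in hdfs-default.xml) means no delay.
     */
    static int pickFirstReportDelay(int initialDelaySeconds, Random rng) {
        if (initialDelaySeconds <= 0) {
            return 0;
        }
        return rng.nextInt(initialDelaySeconds);
    }

    public static void main(String[] args) {
        Random rng = new Random();
        int initialDelaySeconds = 300; // the 5-minute example from the comment above
        for (int dn = 0; dn < 5; dn++) {
            System.out.println("DN" + dn + " sends its first block report after "
                    + pickFirstReportDelay(initialDelaySeconds, rng) + " s");
        }
    }
}
```

Under this reading, a larger value trades a longer worst-case wait in safe mode for a smoother load on the NameNode at cluster startup, and leaving it at 0 removes the wait entirely.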