[jira] [Commented] (HDFS-3092) Enable journal protocol based editlog streaming for standby namenode
[ https://issues.apache.org/jira/browse/HDFS-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259394#comment-13259394 ]

dhruba borthakur commented on HDFS-3092:

I am trying to digest the meat of this approach, but there is one question that I do not have an answer for: is it possible for the journal daemon to write data to disks and nodes that do not share load with other non-journal writers? I feel this requirement will be critical to ensure low variance of write latencies for the journal. My experience is that a 5% increase in the latency of writes to the transaction log causes a 20% degradation of namenode throughput in a large cluster.

Enable journal protocol based editlog streaming for standby namenode

Key: HDFS-3092
URL: https://issues.apache.org/jira/browse/HDFS-3092
Project: Hadoop HDFS
Issue Type: Improvement
Components: ha, name-node
Affects Versions: 0.24.0, 0.23.3
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas
Attachments: ComparisonofApproachesforHAJournals.pdf, MultipleSharedJournals.pdf, MultipleSharedJournals.pdf, MultipleSharedJournals.pdf

Currently the standby namenode relies on reading shared editlogs to stay current with the active namenode for namespace changes. BackupNode used streaming edits from the active namenode to do the same. This jira is to explore using journal protocol based editlog streams for the standby namenode. A daemon in the standby will get the editlogs from the active and write them to local edits. To begin with, the existing standby mechanism of reading from a file will continue to be used, but reading from the local edits instead of from the shared edits.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3307) when save FSImage ,HDFS( or SecondaryNameNode or FSImage)can't handle some file whose file name has some special messy code(乱码)
[ https://issues.apache.org/jira/browse/HDFS-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259396#comment-13259396 ]

yixiaohua commented on HDFS-3307:

Dear Todd: 乱码 ("garbled characters") is not the string that causes the problem; it is Chinese, and I do not know how to describe it. I have placed the string that causes the problem and the test code in the attachments. I look forward to your reply. Best wishes!

when save FSImage ,HDFS( or SecondaryNameNode or FSImage)can't handle some file whose file name has some special messy code(乱码)

Key: HDFS-3307
URL: https://issues.apache.org/jira/browse/HDFS-3307
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.20.1
Environment: SUSE LINUX
Reporter: yixiaohua
Attachments: FSImage.java, ProblemString.txt, TestUTF8AndStringGetBytes.java
Original Estimate: 12h
Remaining Estimate: 12h

This is the log information of the exception from the SecondaryNameNode:

2012-03-28 00:48:42,553 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.io.IOException: Found lease for non-existent file /user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/@??? ??tor.qzone.qq.com/keypart-00174
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFilesUnderConstruction(FSImage.java:1211)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:959)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:589)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$000(SecondaryNameNode.java:473)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:350)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:314)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:225)
    at java.lang.Thread.run(Thread.java:619)

This is the log information about the file from the namenode:

2012-03-28 00:32:26,528 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=boss,boss ip=/10.131.16.34 cmd=create src=/user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/ @?tor.qzone.qq.com/keypart-00174 dst=null perm=boss:boss:rw-r--r--
2012-03-28 00:37:42,387 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/ @?tor.qzone.qq.com/keypart-00174. blk_2751836614265659170_184668759
2012-03-28 00:37:42,696 INFO org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.completeFile: file /user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/ @?tor.qzone.qq.com/keypart-00174 is closed by DFSClient_attempt_201203271849_0016_r_000174_0
2012-03-28 00:37:50,315 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=boss,boss ip=/10.131.16.34 cmd=rename src=/user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/ @?tor.qzone.qq.com/keypart-00174 dst=/user/boss/pgv/fission/task16/split/ @? tor.qzone.qq.com/keypart-00174 perm=boss:boss:rw-r--r--

After checking the code that saves the FSImage, I found a problem that may be a bug in the HDFS code. I paste it below. This is the saveFSImage method in FSImage.java; I have marked the problem code:

  /**
   * Save the contents of the FS image to the file.
   */
  void saveFSImage(File newFile) throws IOException {
    FSNamesystem fsNamesys = FSNamesystem.getFSNamesystem();
    FSDirectory fsDir = fsNamesys.dir;
    long startTime = FSNamesystem.now();
    //
    // Write out data
    //
    DataOutputStream out = new DataOutputStream(
        new BufferedOutputStream(
            new FileOutputStream(newFile)));
    try {
      // ...
      // save the rest of the nodes
      saveImage(strbuf, 0, fsDir.rootDir, out);    // problem
      fsNamesys.saveFilesUnderConstruction(out);   // problem; detail is below
      strbuf = null;
    } finally {
      out.close();
    }
    LOG.info("Image file of size " + newFile.length() + " saved in "
        + (FSNamesystem.now() - startTime)/1000 + " seconds.");
  }

  /**
   * Save file tree image starting from the given root.
   * This is a recursive procedure, which first saves all children of
   * a current directory and then moves inside the sub-directories.
   */
  private static void saveImage(ByteBuffer
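The attachment name TestUTF8AndStringGetBytes.java suggests a comparison between two ways of turning a file name into bytes. The following is only a minimal sketch of how such a mismatch can arise, under an assumption that the thread does not confirm: that one code path uses standard UTF-8 via String.getBytes, while another encodes each Java char on its own, as DataOutputStream.writeUTF does with its "modified UTF-8". The class name Utf8Mismatch and the sample code point U+20000 are illustrative only; the real problem string is in ProblemString.txt and is not reproduced here.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, not part of Hadoop.
final class Utf8Mismatch {

    /** Standard UTF-8: a supplementary code point becomes one 4-byte sequence. */
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /**
     * Java "modified UTF-8" as written by DataOutputStream.writeUTF,
     * with the 2-byte length prefix removed. Each char is encoded on its
     * own, so a surrogate pair becomes two 3-byte sequences.
     */
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    public static void main(String[] args) {
        // Both characters of this string are in the Basic Multilingual Plane.
        String bmp = "\u4e71\u7801";
        // Assumed example of a character outside the BMP (U+20000).
        String supplementary = new String(Character.toChars(0x20000));

        System.out.println(Arrays.equals(standardUtf8(bmp), modifiedUtf8(bmp)));  // true
        System.out.println(standardUtf8(supplementary).length);                  // 4
        System.out.println(modifiedUtf8(supplementary).length);                  // 6
    }
}
```

If a path is written with one encoder in one part of the image and with the other encoder elsewhere, the two byte sequences for the same name would differ, and a lookup by name on load could fail. Whether that is what happens in saveImage and saveFilesUnderConstruction is for the attached test to confirm.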
[jira] [Updated] (HDFS-3307) when save FSImage ,HDFS( or SecondaryNameNode or FSImage)can't handle some file whose file name has some special messy code(乱码)
[ https://issues.apache.org/jira/browse/HDFS-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yixiaohua updated HDFS-3307:

    Attachment: ProblemString.txt
                TestUTF8AndStringGetBytes.java
[jira] [Updated] (HDFS-3307) when save FSImage ,HDFS( or SecondaryNameNode or FSImage)can't handle some file whose file name has some special messy code(乱码)
[ https://issues.apache.org/jira/browse/HDFS-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yixiaohua updated HDFS-3307:

    Attachment: TestUTF8AndStringGetBytes.java
[jira] [Commented] (HDFS-3307) when save FSImage ,HDFS( or SecondaryNameNode or FSImage)can't handle some file whose file name has some special messy code(乱码)
[ https://issues.apache.org/jira/browse/HDFS-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259400#comment-13259400 ]

yixiaohua commented on HDFS-3307:

I am trying to figure out the problem with UTF8 ~_~