[jira] [Commented] (HDFS-3273) Refactor BackupImage and FSEditLog
[ https://issues.apache.org/jira/browse/HDFS-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253869#comment-13253869 ]

Suresh Srinivas commented on HDFS-3273:
----------------------------------------

+1 for the patch.

> Refactor BackupImage and FSEditLog
> ----------------------------------
>
>         Key: HDFS-3273
>         URL: https://issues.apache.org/jira/browse/HDFS-3273
>     Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>    Reporter: Tsz Wo (Nicholas), SZE
>    Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h3273_20120413.patch

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-119) logSync() may block NameNode forever.
[ https://issues.apache.org/jira/browse/HDFS-119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253022#comment-13253022 ]

Suresh Srinivas commented on HDFS-119:
---------------------------------------

+1 for the patch.

> logSync() may block NameNode forever.
> -------------------------------------
>
>         Key: HDFS-119
>         URL: https://issues.apache.org/jira/browse/HDFS-119
>     Project: Hadoop HDFS
>  Issue Type: Bug
>    Reporter: Konstantin Shvachko
>    Assignee: Suresh Srinivas
>     Fix For: 0.21.0
> Attachments: HDFS-119-branch-1.0.patch, HDFS-119.patch, HDFS-119.patch
>
> # {{FSEditLog.logSync()}} first waits until {{isSyncRunning}} is false and then syncs the file streams by calling {{EditLogOutputStream.flush()}}. If an exception is thrown after {{isSyncRunning}} is set to {{true}}, all threads will wait on this condition forever. An {{IOException}} may be thrown by {{EditLogOutputStream.setReadyToFlush()}}, or a {{RuntimeException}} may be thrown by {{EditLogOutputStream.flush()}} or by {{processIOError()}}.
> # The loop that calls {{eStream.flush()}} for multiple {{EditLogOutputStream}}-s is not synchronized, which means another thread may encounter an error and modify {{editStreams}}, say by calling {{processIOError()}}. The iteration in {{logSync()}} will then break with {{IndexOutOfBoundException}}.
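The first hazard described above can be sketched in a few lines. This is a hypothetical simplification (the class and method names here are stand-ins, not the real FSEditLog code): if the flush throws while the `isSyncRunning` flag is set, every later caller blocks forever in the wait loop, so the flag must be cleared and waiters notified in a finally block.

```java
class LogSyncSketch {
    private final Object lock = new Object();
    private boolean isSyncRunning = false;

    void logSync(Runnable flush) {
        synchronized (lock) {
            // Wait until no other thread is mid-sync.
            while (isSyncRunning) {
                try {
                    lock.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
            isSyncRunning = true;
        }
        try {
            flush.run();  // may throw, per the bug report
        } finally {
            synchronized (lock) {
                isSyncRunning = false;  // reset even on error, or waiters block forever
                lock.notifyAll();       // wake threads parked in the wait loop
            }
        }
    }
}
```

Without the finally block, a single failed flush would leave `isSyncRunning` stuck at true and every subsequent `logSync()` call would hang.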
[jira] [Commented] (HDFS-3196) Implement JournalListener for writing journal to local disk
[ https://issues.apache.org/jira/browse/HDFS-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253062#comment-13253062 ]

Suresh Srinivas commented on HDFS-3196:
----------------------------------------

Comments:
# JournalDiskWriter
#* Looks like it is incomplete. Please add TODO items at least.
#* When is the editlog instantiated? You need the directories where editlogs are written, etc. Currently the editlog is null.
#* Please add tests for JournalDiskWriter.
# I had renamed
# Unrelated to this patch: can you remove the unnecessary IOException thrown from FSEditLog#namenodeStartedLogSegment?
# I prefer the name namenodeStartedLogSegment to startLogSegment; indicate in the comments that the remote namenode started a new log segment.

I like the idea of refactoring the code in trunk first and bringing it in.

> Implement JournalListener for writing journal to local disk
> -----------------------------------------------------------
>
>         Key: HDFS-3196
>         URL: https://issues.apache.org/jira/browse/HDFS-3196
>     Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>    Reporter: Tsz Wo (Nicholas), SZE
>    Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h3196_20120412.patch
[jira] [Commented] (HDFS-3256) HDFS considers blocks under-replicated if topology script is configured with only 1 rack
[ https://issues.apache.org/jira/browse/HDFS-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252046#comment-13252046 ]

Suresh Srinivas commented on HDFS-3256:
----------------------------------------

Aaron, could you please describe the solution along with the patch? This helps folks who do not look at the code understand the solution and the behavior of HDFS.

> HDFS considers blocks under-replicated if topology script is configured with only 1 rack
> ----------------------------------------------------------------------------------------
>
>         Key: HDFS-3256
>         URL: https://issues.apache.org/jira/browse/HDFS-3256
>     Project: Hadoop HDFS
>  Issue Type: Bug
> Affects Versions: 2.0.0
>    Reporter: Aaron T. Myers
>    Assignee: Aaron T. Myers
> Attachments: HDFS-3256.patch
>
> HDFS treats the mere presence of a topology script being configured as evidence that there are multiple racks. If there is in fact only a single rack, the NN will try to place the blocks on at least two racks, and thus blocks will be considered to be under-replicated.
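The bug described in the summary can be illustrated with a small sketch. This is hypothetical code, not the NameNode's actual placement logic: using "a topology script is configured" as a proxy for "multiple racks exist" demands two racks even on a one-rack cluster, whereas deciding from the racks actually resolved does not.

```java
import java.util.Set;

class RackPolicySketch {
    // Buggy proxy: config presence is taken as evidence of multiple racks,
    // so a single-rack cluster can never satisfy the placement policy.
    static int requiredRacksByConfig(boolean topologyScriptConfigured) {
        return topologyScriptConfigured ? 2 : 1;
    }

    // Deciding from the racks the NameNode has actually seen avoids the
    // spurious under-replication on a one-rack cluster.
    static int requiredRacksByTopology(Set<String> knownRacks) {
        return knownRacks.size() > 1 ? 2 : 1;
    }
}
```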
[jira] [Commented] (HDFS-3257) Fix synchronization issues with journal service
[ https://issues.apache.org/jira/browse/HDFS-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252079#comment-13252079 ]

Suresh Srinivas commented on HDFS-3257:
----------------------------------------

Hari, the JournalService RPC server runs using a single thread. That means all the RPC method calls end up as corresponding protocol interface calls and are dispatched to listeners. The only synchronization issues I can think of are the ones that manipulate the state, which I think is handled correctly in StateMachine.

Nicholas, any comments?

> Fix synchronization issues with journal service
> -----------------------------------------------
>
>         Key: HDFS-3257
>         URL: https://issues.apache.org/jira/browse/HDFS-3257
>     Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>    Reporter: Hari Mankude
>    Assignee: Hari Mankude
[jira] [Commented] (HDFS-3121) hdfs tests for HADOOP-8014
[ https://issues.apache.org/jira/browse/HDFS-3121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247394#comment-13247394 ]

Suresh Srinivas commented on HDFS-3121:
----------------------------------------

I should wait for HADOOP-8014 to be committed before committing this, right?

> hdfs tests for HADOOP-8014
> --------------------------
>
>         Key: HDFS-3121
>         URL: https://issues.apache.org/jira/browse/HDFS-3121
>     Project: Hadoop HDFS
>  Issue Type: Bug
> Affects Versions: 0.23.2, 0.23.3
>    Reporter: John George
>    Assignee: John George
> Attachments: hdfs-3121.patch, hdfs-3121.patch, hdfs-3121.patch, hdfs-3121.patch, hdfs-3121.patch
>
> This JIRA is to write tests for viewing quota using viewfs.
[jira] [Commented] (HDFS-3121) hdfs tests for HADOOP-8014
[ https://issues.apache.org/jira/browse/HDFS-3121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247391#comment-13247391 ]

Suresh Srinivas commented on HDFS-3121:
----------------------------------------

+1 for the patch. I will commit it soon.

> hdfs tests for HADOOP-8014
> --------------------------
>
>         Key: HDFS-3121
>         URL: https://issues.apache.org/jira/browse/HDFS-3121
>     Project: Hadoop HDFS
>  Issue Type: Bug
> Affects Versions: 0.23.2, 0.23.3
>    Reporter: John George
>    Assignee: John George
> Attachments: hdfs-3121.patch, hdfs-3121.patch, hdfs-3121.patch, hdfs-3121.patch, hdfs-3121.patch
>
> This JIRA is to write tests for viewing quota using viewfs.
[jira] [Commented] (HDFS-3136) Multiple SLF4J binding warning
[ https://issues.apache.org/jira/browse/HDFS-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247397#comment-13247397 ]

Suresh Srinivas commented on HDFS-3136:
----------------------------------------

+1. I will commit the patch soon.

> Multiple SLF4J binding warning
> ------------------------------
>
>         Key: HDFS-3136
>         URL: https://issues.apache.org/jira/browse/HDFS-3136
>     Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
> Affects Versions: 0.23.0
>    Reporter: Jason Lowe
>    Assignee: Jason Lowe
> Attachments: HDFS-3136.patch
>
> This is the HDFS portion of HADOOP-8005. HDFS no longer depends upon slf4j, so removing it from the assembly will eliminate the HDFS portion of the multiple SLF4J warnings.
[jira] [Commented] (HDFS-3202) NamespaceInfo PB translation drops build version
[ https://issues.apache.org/jira/browse/HDFS-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247400#comment-13247400 ]

Suresh Srinivas commented on HDFS-3202:
----------------------------------------

bq. Suresh, if you feel strongly about removing one of the NamespaceInfo constructor variants, I'd be happy to do that in a follow-up JIRA.

Just wanted to avoid the many variants of constructors and methods. Our code is replete with this :-) But no issues with the patch though.

> NamespaceInfo PB translation drops build version
> ------------------------------------------------
>
>         Key: HDFS-3202
>         URL: https://issues.apache.org/jira/browse/HDFS-3202
>     Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
> Affects Versions: 2.0.0
>    Reporter: Aaron T. Myers
>    Assignee: Aaron T. Myers
>     Fix For: 2.0.0
> Attachments: HDFS-3202.patch
>
> The PBHelper#convert(NamespaceInfoProto) function doesn't pass the build version from the NamespaceInfoProto to the created NamespaceInfo object. Instead, the NamespaceInfo constructor gets the build version using the static function Storage#getBuildVersion. DNs also use this static function to determine their own build version. This means that the check the DN does to compare its own build version to that of the NN always passes, regardless of what build version exists on the NN.
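The translation bug described above reduces to a one-line mistake. The following is a hypothetical sketch with stand-in names (the real code goes through PBHelper and the generated protobuf classes): the buggy converter discards the build version carried on the wire and substitutes the local one, so the DataNode ends up comparing its own build version with itself.

```java
class PbConvertSketch {
    // Stand-in for Storage#getBuildVersion on the local (DataNode) side.
    static final String LOCAL_BUILD_VERSION = "dn-build";

    // Buggy convert: ignores the version carried in the proto message and
    // substitutes the local one, so the NN-vs-DN comparison always matches.
    static String convertBuggy(String protoBuildVersion) {
        return LOCAL_BUILD_VERSION;
    }

    // Fixed convert: propagates the value received over the wire.
    static String convertFixed(String protoBuildVersion) {
        return protoBuildVersion;
    }
}
```

With the buggy converter, a NameNode running a different build would still pass the DataNode's version check.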
[jira] [Commented] (HDFS-3185) Setup configuration for Journal Manager and Journal Services
[ https://issues.apache.org/jira/browse/HDFS-3185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247640#comment-13247640 ]

Suresh Srinivas commented on HDFS-3185:
----------------------------------------

+1 for the patch. I will commit this soon.

> Setup configuration for Journal Manager and Journal Services
> ------------------------------------------------------------
>
>         Key: HDFS-3185
>         URL: https://issues.apache.org/jira/browse/HDFS-3185
>     Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>    Reporter: Hari Mankude
>    Assignee: Hari Mankude
> Attachments: hdfs-3185-3.patch
[jira] [Commented] (HDFS-3211) JournalProtocol changes required for introducing epoch and fencing
[ https://issues.apache.org/jira/browse/HDFS-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247798#comment-13247798 ]

Suresh Srinivas commented on HDFS-3211:
----------------------------------------

bq. Hi Suresh. Have you looked at HDFS-3189? Hopefully we can make our protocols similar with the intent of eventually merging the two implementations.

I looked at it briefly. I think this patch should cover the changes you were making. Some differences - newEpoch is called fence. We should finish the HDFS-3077 discussions, and if changes come out of it, we could make further changes to the protocol.

> JournalProtocol changes required for introducing epoch and fencing
> ------------------------------------------------------------------
>
>         Key: HDFS-3211
>         URL: https://issues.apache.org/jira/browse/HDFS-3211
>     Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
> Affects Versions: Shared journals (HDFS-3092)
>    Reporter: Suresh Srinivas
>    Assignee: Suresh Srinivas
> Attachments: HDFS-3211.txt, HDFS-3211.txt
>
> JournalProtocol changes to introduce epoch in every request. Adding a new method, fence, for fencing a JournalService. On BackupNode, fence is a no-op.
[jira] [Commented] (HDFS-3211) JournalProtocol changes required for introducing epoch and fencing
[ https://issues.apache.org/jira/browse/HDFS-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247814#comment-13247814 ]

Suresh Srinivas commented on HDFS-3211:
----------------------------------------

bq. You need to store the epoch persistently on disk to handle the case of journal daemon restarts, I think. HDFS-3190 does a refactor to add a utility class you can use for this

I think that can be done in another jira. I was thinking of doing that in startLogSegment. I think we are still discussing that. Once it is resolved, let's do a separate jira.

Also, can you please look at HDFS-3204? If you are busy, I will see if I can get it reviewed by Nicholas.

> JournalProtocol changes required for introducing epoch and fencing
> ------------------------------------------------------------------
>
>         Key: HDFS-3211
>         URL: https://issues.apache.org/jira/browse/HDFS-3211
>     Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
> Affects Versions: Shared journals (HDFS-3092)
>    Reporter: Suresh Srinivas
>    Assignee: Suresh Srinivas
> Attachments: HDFS-3211.txt, HDFS-3211.txt
>
> JournalProtocol changes to introduce epoch in every request. Adding a new method, fence, for fencing a JournalService. On BackupNode, fence is a no-op.
[jira] [Commented] (HDFS-3212) Persist the epoch received by the JournalService
[ https://issues.apache.org/jira/browse/HDFS-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247836#comment-13247836 ]

Suresh Srinivas commented on HDFS-3212:
----------------------------------------

There is some discussion in HDFS-3077 about this. Currently two alternatives are under consideration:
# Use the record we write at the start of a log segment to record the epoch.
#* On a fence method call, a JournalService promises not to accept any other requests from the old active.
#* After fence, the next call is roll, when a new log segment is created. The JournalService records the epoch in this record.
#* This fits in nicely with the idea that every log segment belongs to a single epoch.
# Use a separate metadata file to record the epoch.

Based on the discussions in HDFS-3077, let's choose one of the options.

> Persist the epoch received by the JournalService
> ------------------------------------------------
>
>         Key: HDFS-3212
>         URL: https://issues.apache.org/jira/browse/HDFS-3212
>     Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
> Affects Versions: Shared journals (HDFS-3092)
>    Reporter: Suresh Srinivas
>
> epoch received over JournalProtocol should be persisted by JournalService.
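The second alternative in the comment above (a separate metadata file) could look something like the following sketch. The file names and layout here are hypothetical, not what HDFS actually does: the epoch is written to a temp file and atomically renamed into place, so a crash mid-update never leaves a torn epoch on disk.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class EpochStoreSketch {
    // Persist the epoch durably: write to a temp file, then atomically
    // rename over the current file so readers never see a partial write.
    static void persistEpoch(Path dir, long epoch) throws IOException {
        Path tmp = dir.resolve("epoch.tmp");
        Files.write(tmp, Long.toString(epoch).getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, dir.resolve("epoch"), StandardCopyOption.ATOMIC_MOVE);
    }

    // Read the persisted epoch back, e.g. after a journal daemon restart.
    static long readEpoch(Path dir) throws IOException {
        byte[] raw = Files.readAllBytes(dir.resolve("epoch"));
        return Long.parseLong(new String(raw, StandardCharsets.UTF_8).trim());
    }
}
```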
[jira] [Commented] (HDFS-3211) JournalProtocol changes required for introducing epoch and fencing
[ https://issues.apache.org/jira/browse/HDFS-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247838#comment-13247838 ]

Suresh Srinivas commented on HDFS-3211:
----------------------------------------

Created HDFS-3212 for persisting epoch.

> JournalProtocol changes required for introducing epoch and fencing
> ------------------------------------------------------------------
>
>         Key: HDFS-3211
>         URL: https://issues.apache.org/jira/browse/HDFS-3211
>     Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
> Affects Versions: Shared journals (HDFS-3092)
>    Reporter: Suresh Srinivas
>    Assignee: Suresh Srinivas
> Attachments: HDFS-3211.txt, HDFS-3211.txt
>
> JournalProtocol changes to introduce epoch in every request. Adding a new method, fence, for fencing a JournalService. On BackupNode, fence is a no-op.
[jira] [Commented] (HDFS-3212) Persist the epoch received by the JournalService
[ https://issues.apache.org/jira/browse/HDFS-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247880#comment-13247880 ]

Suresh Srinivas commented on HDFS-3212:
----------------------------------------

bq. I don't think it's reasonable to put the epoch number inside the START transaction, because that leaks the idea of epochs out of the journal manager layer into the NN layer.

I do not understand what you mean by NN layer. Epoch is a notion from the JournalManager to the JournalNode. Both need to understand this and provide appropriate guarantees.

bq. Also, if the JN restarts, when it comes up, how do you make sure that an old NN doesn't come back to life with a startLogSegment transaction?

Can you give me an example? I am not sure I understand the issue.

> Persist the epoch received by the JournalService
> ------------------------------------------------
>
>         Key: HDFS-3212
>         URL: https://issues.apache.org/jira/browse/HDFS-3212
>     Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
> Affects Versions: Shared journals (HDFS-3092)
>    Reporter: Suresh Srinivas
>
> epoch received over JournalProtocol should be persisted by JournalService.
[jira] [Commented] (HDFS-3178) Add states for journal synchronization in journal daemon
[ https://issues.apache.org/jira/browse/HDFS-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13248068#comment-13248068 ]

Suresh Srinivas commented on HDFS-3178:
----------------------------------------

Yes. The initial JournalService code was already in trunk. This is an additional change to that. What is the concern? Most HDFS-3092 tasks adding other functionality are marked for that branch.

> Add states for journal synchronization in journal daemon
> --------------------------------------------------------
>
>         Key: HDFS-3178
>         URL: https://issues.apache.org/jira/browse/HDFS-3178
>     Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>    Reporter: Tsz Wo (Nicholas), SZE
>    Assignee: Tsz Wo (Nicholas), SZE
>     Fix For: 3.0.0
> Attachments: h3178_20120403_svn_mv.patch, h3178_20120404.patch, h3178_20120404_svn_mv.patch, h3178_20120404b_svn_mv.patch, h3178_20120405.patch, h3178_20120405_svn_mv.patch, svn_mv.sh
>
> Journal in a new daemon has to be synchronized to the current transaction. It requires new states such as WaitingForRoll, Syncing and Synced.
[jira] [Commented] (HDFS-3178) Add states for journal synchronization in journal daemon
[ https://issues.apache.org/jira/browse/HDFS-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246058#comment-13246058 ]

Suresh Srinivas commented on HDFS-3178:
----------------------------------------

Nicholas, if you are going to check for all states in StateMachine, you might as well move that code into JournalService. Rest of the code looks good.

> Add states for journal synchronization in journal daemon
> --------------------------------------------------------
>
>         Key: HDFS-3178
>         URL: https://issues.apache.org/jira/browse/HDFS-3178
>     Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>    Reporter: Tsz Wo (Nicholas), SZE
>    Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h3178_20120403_svn_mv.patch, svn_mv.sh
>
> Journal in a new daemon has to be synchronized to the current transaction. It requires new states such as WaitingForRoll, Syncing and Synced.
[jira] [Commented] (HDFS-3192) Active NN should exit when it has not received a getServiceStatus() rpc from ZKFC for timeout secs
[ https://issues.apache.org/jira/browse/HDFS-3192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246059#comment-13246059 ]

Suresh Srinivas commented on HDFS-3192:
----------------------------------------

Hari, I agree with Aaron that this should not be a subtask of HDFS-3092.

> Active NN should exit when it has not received a getServiceStatus() rpc from ZKFC for timeout secs
> -------------------------------------------------------------------------------------------------
>
>         Key: HDFS-3192
>         URL: https://issues.apache.org/jira/browse/HDFS-3192
>     Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>    Reporter: Hari Mankude
>    Assignee: Hari Mankude
[jira] [Commented] (HDFS-3121) hdfs tests for HADOOP-8014
[ https://issues.apache.org/jira/browse/HDFS-3121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246724#comment-13246724 ]

Suresh Srinivas commented on HDFS-3121:
----------------------------------------

Comments:
# It is not very clear that "This test tests the fix..". Instead of describing running into the serialization problem etc., can you just describe what the test does? Also please move the description into the class-level javadoc.
# minor: defaultBLockSize - defaultBlockSize
# Please ensure line lengths are within the 80 chars limit.
# testGetDefaultBlockSize(): indentation at try is not correct. I am also not clear about the comment createFile... What has createFile got to do with the test?
# Please add a brief description of what the test is testing.
# Please consider: when expecting an exception in tests, you can move fail() within the try clause, after the method that you expect the exception from. This also avoids returns from catch blocks.

> hdfs tests for HADOOP-8014
> --------------------------
>
>         Key: HDFS-3121
>         URL: https://issues.apache.org/jira/browse/HDFS-3121
>     Project: Hadoop HDFS
>  Issue Type: Bug
> Affects Versions: 0.23.2, 0.23.3
>    Reporter: John George
>    Assignee: John George
> Attachments: hdfs-3121.patch, hdfs-3121.patch, hdfs-3121.patch
>
> This JIRA is to write tests for viewing quota using viewfs.
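The last review point (where to place fail() when a test expects an exception) can be sketched as follows. The names here are hypothetical, not from the patch under review: putting the fail-equivalent immediately after the call that should throw keeps the test linear and removes any need for a return in the catch block.

```java
class FailPlacementSketch {
    // Hypothetical method under test that is expected to throw.
    static void mustThrow() {
        throw new IllegalStateException("expected failure");
    }

    static boolean testExpectedException() {
        try {
            mustThrow();
            // fail() equivalent: reached only if no exception was thrown.
            throw new AssertionError("expected IllegalStateException");
        } catch (IllegalStateException e) {
            // Expected path; no early return needed in the catch block.
            return true;
        }
    }
}
```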
[jira] [Commented] (HDFS-3178) Add states for journal synchronization in journal daemon
[ https://issues.apache.org/jira/browse/HDFS-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246832#comment-13246832 ]

Suresh Srinivas commented on HDFS-3178:
----------------------------------------

Adding it to the design document is a good idea.

> Add states for journal synchronization in journal daemon
> --------------------------------------------------------
>
>         Key: HDFS-3178
>         URL: https://issues.apache.org/jira/browse/HDFS-3178
>     Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>    Reporter: Tsz Wo (Nicholas), SZE
>    Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h3178_20120403_svn_mv.patch, h3178_20120404.patch, h3178_20120404_svn_mv.patch, svn_mv.sh
>
> Journal in a new daemon has to be synchronized to the current transaction. It requires new states such as WaitingForRoll, Syncing and Synced.
[jira] [Commented] (HDFS-3202) NamespaceInfo PB translation drops build version
[ https://issues.apache.org/jira/browse/HDFS-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246936#comment-13246936 ]

Suresh Srinivas commented on HDFS-3202:
----------------------------------------

+1 for the patch. I think this is trivial enough that a unit test is overkill.

One quick comment - do we need the second variant of the NamespaceInfo constructor? If it is not used in many places, can we just have one constructor?

> NamespaceInfo PB translation drops build version
> ------------------------------------------------
>
>         Key: HDFS-3202
>         URL: https://issues.apache.org/jira/browse/HDFS-3202
>     Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
> Affects Versions: 2.0.0
>    Reporter: Aaron T. Myers
>    Assignee: Aaron T. Myers
> Attachments: HDFS-3202.patch
>
> The PBHelper#convert(NamespaceInfoProto) function doesn't pass the build version from the NamespaceInfoProto to the created NamespaceInfo object. Instead, the NamespaceInfo constructor gets the build version using the static function Storage#getBuildVersion. DNs also use this static function to determine their own build version. This means that the check the DN does to compare its own build version to that of the NN always passes, regardless of what build version exists on the NN.
[jira] [Commented] (HDFS-3178) Add states for journal synchronization in journal daemon
[ https://issues.apache.org/jira/browse/HDFS-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246969#comment-13246969 ]

Suresh Srinivas commented on HDFS-3178:
----------------------------------------

Minor comments:
# For the started state, "RPC server is started - service ready to receive requests from namenode" is better.
# Instead of "fenced with a namenode", "fenced bya namenode" is better.

+1 for the patch.

> Add states for journal synchronization in journal daemon
> --------------------------------------------------------
>
>         Key: HDFS-3178
>         URL: https://issues.apache.org/jira/browse/HDFS-3178
>     Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>    Reporter: Tsz Wo (Nicholas), SZE
>    Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h3178_20120403_svn_mv.patch, h3178_20120404.patch, h3178_20120404_svn_mv.patch, h3178_20120404b_svn_mv.patch, svn_mv.sh
>
> Journal in a new daemon has to be synchronized to the current transaction. It requires new states such as WaitingForRoll, Syncing and Synced.
[jira] [Commented] (HDFS-3178) Add states for journal synchronization in journal daemon
[ https://issues.apache.org/jira/browse/HDFS-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246995#comment-13246995 ]

Suresh Srinivas commented on HDFS-3178:
----------------------------------------

bya = by a

> Add states for journal synchronization in journal daemon
> --------------------------------------------------------
>
>         Key: HDFS-3178
>         URL: https://issues.apache.org/jira/browse/HDFS-3178
>     Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>    Reporter: Tsz Wo (Nicholas), SZE
>    Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h3178_20120403_svn_mv.patch, h3178_20120404.patch, h3178_20120404_svn_mv.patch, h3178_20120404b_svn_mv.patch, svn_mv.sh
>
> Journal in a new daemon has to be synchronized to the current transaction. It requires new states such as WaitingForRoll, Syncing and Synced.
[jira] [Commented] (HDFS-3199) TestValidateConfigurationSettings is failing
[ https://issues.apache.org/jira/browse/HDFS-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247011#comment-13247011 ]

Suresh Srinivas commented on HDFS-3199:
----------------------------------------

Eli, can you add a link to where this test is failing? I do not see failures of this test, at least in the precommit builds.

> TestValidateConfigurationSettings is failing
> --------------------------------------------
>
>         Key: HDFS-3199
>         URL: https://issues.apache.org/jira/browse/HDFS-3199
>     Project: Hadoop HDFS
>  Issue Type: Bug
> Affects Versions: 2.0.0
>    Reporter: Eli Collins
>    Assignee: Todd Lipcon
>     Fix For: 2.0.0
> Attachments: hdfs-3199.txt
>
> TestValidateConfigurationSettings is failing on every run.
[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname in branch-1
[ https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13247020#comment-13247020 ] Suresh Srinivas commented on HDFS-3150: --- Given there are some discussions happening around +1s from committer, it is probably a good idea to wait for +1. Should we also keep release manager posted about this change? I generally post an email to hdfs/common dev about this kind of changes. Add option for clients to contact DNs via hostname in branch-1 -- Key: HDFS-3150 URL: https://issues.apache.org/jira/browse/HDFS-3150 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node, hdfs client Reporter: Eli Collins Assignee: Eli Collins Fix For: 1.1.0 Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt Per the document attached to HADOOP-8198, this is just for branch-1, and unbreaks DN multihoming. The datanode can be configured to listen on a bond, or all interfaces by specifying the wildcard in the dfs.datanode.*.address configuration options, however per HADOOP-6867 only the source address of the registration is exposed to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup, this had the side effect of breaking DN multihoming. In order to fix it let's add back the option for Datanodes to be accessed by hostname. This can be done by: # Modifying the primary field of the Datanode descriptor to be the hostname, or # Modifying Client/Datanode - Datanode access use the hostname field instead of the IP I'd like to go with approach #2 as it does not require making an incompatible change to the client protocol, and is much less invasive. It minimizes the scope of modification to just places where clients and Datanodes connect, vs changing all uses of Datanode identifiers. 
New client and Datanode configuration options are introduced: - {{dfs.client.use.datanode.hostname}} indicates all client-to-datanode connections should use the datanode hostname (as clients outside the cluster may not be able to route the IP) - {{dfs.datanode.use.datanode.hostname}} indicates whether Datanodes should use hostnames when connecting to other Datanodes for data transfer. If the configuration options are not used, there is no change in the current behavior. I'm doing something similar to #1, btw, in trunk in HDFS-3144 - refactoring the use of DatanodeID to use the right field (IP, IP:xferPort, hostname, etc.) based on the context the ID is being used in, vs always using IP:xferPort as the Datanode's name and using the name everywhere.
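A minimal sketch of approach #2 above, deriving the client's connect address from the hostname field when the flag is set. The class and method names here are hypothetical, for illustration only, and not code from the attached patch:

```java
// Hypothetical sketch: how a client-side flag like
// dfs.client.use.datanode.hostname could drive whether the client dials the
// DN's hostname or its IP. Names are illustrative, not from the patch.
public class DatanodeAddressSelector {

    /** Returns "host:port" for a client-to-DN connection, preferring the
     *  hostname when useHostname is set (e.g. for clients outside the
     *  cluster that cannot route the DN's IP). */
    public static String connectAddress(String ip, String hostname,
                                        int xferPort, boolean useHostname) {
        String host = useHostname ? hostname : ip;
        return host + ":" + xferPort;
    }
}
```

With the flag off, behavior is unchanged (the IP is used), which matches the "no change in the current behavior" guarantee above.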
[jira] [Commented] (HDFS-3199) TestValidateConfigurationSettings is failing
[ https://issues.apache.org/jira/browse/HDFS-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13247038#comment-13247038 ] Suresh Srinivas commented on HDFS-3199: --- Cool. Here is a failing test link: https://builds.apache.org/job/PreCommit-HDFS-Build/2164//testReport/
[jira] [Commented] (HDFS-2998) OfflineImageViewer and ImageVisitor should be annotated public
[ https://issues.apache.org/jira/browse/HDFS-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13247040#comment-13247040 ] Suresh Srinivas commented on HDFS-2998: --- I am not sure why these classes should be public. Backward compatibility is not what makes something public; it is also about the intent to open it up to external developers, and I do not believe these classes need to be. Also, we need to address the previous comment: bq. We could consider making it public. We should also make public all the other classes that are not marked public but are referenced by OIV. OfflineImageViewer and ImageVisitor should be annotated public -- Key: HDFS-2998 URL: https://issues.apache.org/jira/browse/HDFS-2998 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.23.1 Reporter: Aaron T. Myers The OfflineImageViewer is currently annotated as InterfaceAudience.Private. It's intended for subclassing, so it should be annotated as the public API that it is. The ImageVisitor class should similarly be annotated public (evolving is fine). Note that it should also be changed to be public; it's currently package-private, which means that users have to cheat with their subclass package name.
[jira] [Commented] (HDFS-3148) The client should be able to use multiple local interfaces for data transfer
[ https://issues.apache.org/jira/browse/HDFS-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245314#comment-13245314 ] Suresh Srinivas commented on HDFS-3148: --- Hey guys, can you do this work in a separate branch as well? There are too many things going on to catch up on. I have not had time to look into the proposal, and my feeling was: is this complexity worth adding? Though I have not had time to think about how much complexity this feature adds. Also, is Daryn's concern addressed? The client should be able to use multiple local interfaces for data transfer Key: HDFS-3148 URL: https://issues.apache.org/jira/browse/HDFS-3148 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs client Reporter: Eli Collins Assignee: Eli Collins Fix For: 1.1.0, 2.0.0 Attachments: hdfs-3148-b1.txt, hdfs-3148-b1.txt, hdfs-3148.txt, hdfs-3148.txt, hdfs-3148.txt HDFS-3147 covers using multiple interfaces on the server (Datanode) side. Clients should also be able to utilize multiple *local* interfaces for outbound connections instead of always using the interface for the local hostname. This can be accomplished with a new configuration parameter ({{dfs.client.local.interfaces}}) that accepts a list of interfaces the client should use. Acceptable configuration values are the same as for the {{dfs.datanode.available.interfaces}} parameter. The client binds its socket to a specific interface, which enables outbound traffic to use that interface. Binding the client socket to a specific address is not sufficient to ensure egress traffic uses that interface. E.g. if multiple interfaces are on the same subnet, the host requires IP rules that use the source address (which bind sets) to select the destination interface. The SO_BINDTODEVICE socket option could be used to select a specific interface for the connection instead; however, it requires JNI (it is not in Java's SocketOptions) and root access, which we don't want to require clients to have. 
Like HDFS-3147, the client can use multiple local interfaces for data transfer. Since clients already cache their connections to DNs, choosing a local interface at random seems like a good policy. Users can also pin a specific client to a specific interface by specifying just that interface in dfs.client.local.interfaces. This change was discussed in HADOOP-6210 a while back, and is actually useful independently of the other HDFS-3140 changes.
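The random-pick policy described above can be sketched as follows. This is an illustrative sketch only (the real change also resolves interface names like eth0 to bindable addresses before connecting); the class and method names are hypothetical:

```java
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the dfs.client.local.interfaces selection policy:
// pick one configured local address at random per connection. A
// single-element list pins the client to that interface.
public class LocalInterfaceSelector {

    public static String pick(List<String> localAddrs, Random rng) {
        // Uniform random choice over the configured local addresses.
        return localAddrs.get(rng.nextInt(localAddrs.size()));
    }
}
```

The chosen address would then be passed to Socket.bind() before connecting, so egress traffic uses that interface (subject to the IP-rule caveat noted in the description).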
[jira] [Commented] (HDFS-3092) Enable journal protocol based editlog streaming for standby namenode
[ https://issues.apache.org/jira/browse/HDFS-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245346#comment-13245346 ] Suresh Srinivas commented on HDFS-3092: --- bq. Your design does not seem to consider the possibility of multiple concurrent logs, which you may want to have for federation. For HDFS editlogs, my feeling is that there will only be three JDs: one on the active namenode, a second on the standby, and a third JD on one of the other machines. In federation, one has to configure a JD per federated namespace. An alternative is to use BookKeeper, since it could make the deployment simpler for a large federated cluster. bq. There have been comments about comparing the different approaches discussed, and I was wondering what criteria you have been thinking of using to compare them. I think the comment was more about comparing the design and complexity of deployment, and not benchmarks for the two systems. Performance is not the motivation for this jira. bq. I was wondering about how reads to the log are executed if writes only have to reach a majority quorum. Once it is time to read, how does the reader get a consistent view of the log? One JD alone may not have all entries, so I suppose the reader may need to read from multiple JDs to get a consistent view? Do the transaction identifiers establish the order of entries in the log? One quick note is that I don't see why a majority is required; bk does not require a majority. We decided on a majority quorum to keep the design simple, though it is strictly not necessary. A JD in the JournalList is supposed to have all the entries, and any JD from the list can be used to read the journals. bq. Here are some notes I took comparing the bk approach with the one in this jira, in the case you're interested I noticed that as well. After we went through the many issues this solution had to take care of, the solution looks very similar to BK. 
That is comforting :-) Enable journal protocol based editlog streaming for standby namenode Key: HDFS-3092 URL: https://issues.apache.org/jira/browse/HDFS-3092 Project: Hadoop HDFS Issue Type: Improvement Components: ha, name-node Affects Versions: 0.24.0, 0.23.3 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: MultipleSharedJournals.pdf Currently the standby namenode relies on reading shared editlogs to stay current with the active namenode for namespace changes. BackupNode used streaming edits from the active namenode for doing the same. This jira is to explore using journal protocol based editlog streams for the standby namenode. A daemon in the standby will get the editlogs from the active and write them to local edits. To begin with, the existing standby mechanism of reading from a file will continue to be used, but from the local edits instead of the shared edits.
[jira] [Commented] (HDFS-3092) Enable journal protocol based editlog streaming for standby namenode
[ https://issues.apache.org/jira/browse/HDFS-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245523#comment-13245523 ] Suresh Srinivas commented on HDFS-3092: --- bq. When you say three JDs, that's the degree of replication, right? When I said multiple logs, I was referring to multiple namenodes writing to different logs, as with federation. Right, three JDs for the degree of replication. And I do understand multiple logs - that is, a log per namespace. For every namespace in federation, an active and standby namenode + an additional JD are needed. bq. I think my confusion here is that you require a quorum to be able to acknowledge the operation, but in reality you try to write to everyone. If you can't write to everyone, then you induce a view change (change to JournalList). Is this right? Yes. In the first cut we write to all the JDs that are active; at least a quorum must be written to. This can be improved in the future by waiting on only a quorum of JDs.
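The write-to-all, acknowledge-on-majority scheme discussed in the comments above can be sketched as simple quorum accounting. This is an illustrative sketch only; the names are hypothetical and not from either design document:

```java
// Hypothetical sketch of quorum accounting for edit-log writes: the writer
// sends a transaction to every active JD, but considers it durable once a
// majority of the n configured JDs have acknowledged it.
public class QuorumWrite {

    /** Majority size for n daemons, e.g. 2 of 3, 3 of 5. */
    public static int majority(int n) {
        return n / 2 + 1;
    }

    /** True once enough acks have arrived for the write to be committed. */
    public static boolean committed(int acks, int n) {
        return acks >= majority(n);
    }
}
```

With three JDs, a write commits after two acks; the slow third daemon can be failed (or caught up later) without blocking the writer.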
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245588#comment-13245588 ] Suresh Srinivas commented on HDFS-3077: --- Thanks for posting the design. Now I understand your comment that there is a lot in common between this one and the approach in HDFS-3092. Here are some high level comments: # Terminology - JournalDaemon or JournalNode. I prefer JournalDaemon because my plan was to run them in the same process space as the namenode. A JournalDaemon could also be a stand-alone process. # I like the idea of quorum writes and maintaining the queue. The 3092 design currently uses a timeout to declare a JD slow and fail it. We were planning to punt on this until we had a first implementation. # newEpoch() is called fence() in HDFS-3092. My preference is to use the name fence(). I was using a version #, which here is called an epoch; I think the name epoch sounds better. The key difference is that the version # is generated from a znode in HDFS-3092, so two namenodes cannot use the same epoch number. I think there is a bug with the approach you have described, stemming from the fact that two namenodes can use the same epoch and step 3 in 2.4 can be completed independently of the quorum. This is shown in Hari's example. # I prefer to record the epoch in the startLogSegment filler record. The startLogSegment record was never part of the journal; we had added it for structural reasons, so adding epoch info to it should not matter. The way I see it, a journal belongs to a segment, and a segment has a single version # or epoch. # In both proposals the epoch or version # needs to be sent in all journal requests. We could certainly make a list of common work items and create jiras, so that many people can collaborate and wrap it up, like we did in HDFS-1623. 
Quorum-based protocol for reading and writing edit logs --- Key: HDFS-3077 URL: https://issues.apache.org/jira/browse/HDFS-3077 Project: Hadoop HDFS Issue Type: New Feature Components: ha, name-node Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-3077-partial.txt, qjournal-design.pdf Currently, one of the weak points of the HA design is that it relies on shared storage such as an NFS filer for the shared edit log. One alternative that has been proposed is to depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated edit log on commodity hardware. This JIRA is to implement another alternative, based on a quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only by HDFS's needs rather than more generic use cases. More details to follow.
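The epoch/fencing rule being debated above (a new writer must present a strictly higher epoch or version #, so a stale NN gets rejected) might look like the following sketch at a journal daemon. Names and structure are illustrative only, not code from either design:

```java
// Hypothetical sketch of fencing at a journal daemon: a fence()/newEpoch()
// request is accepted only if the proposed epoch is strictly greater than
// any previously promised one, and subsequent writes must carry the
// currently promised epoch.
public class JournalDaemonFence {
    private long promisedEpoch = 0;

    public synchronized boolean fence(long proposedEpoch) {
        if (proposedEpoch <= proposedEpoch(proposedEpoch)) {
            return false; // stale or duplicate writer: reject
        }
        promisedEpoch = proposedEpoch;
        return true;
    }

    // Helper keeping the comparison explicit against the promised value.
    private long proposedEpoch(long ignored) {
        return promisedEpoch;
    }

    /** Writes from a previously fenced-off NN carry an old epoch and fail. */
    public synchronized boolean acceptWrite(long writerEpoch) {
        return writerEpoch == promisedEpoch;
    }
}
```

Generating the epoch from a ZK znode (as in HDFS-3092) guarantees two namenodes can never present the same number, which is the uniqueness property the comment argues for.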
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245771#comment-13245771 ] Suresh Srinivas commented on HDFS-3077: --- bq. Suresh seemed to think doing it on a branch would be counter-productive to code sharing There is a branch already created for 3092. We could use that.
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245847#comment-13245847 ] Suresh Srinivas commented on HDFS-3077: --- bq. How can step 3 in section 2.4 be completed independent of quorum? Step 4 indicates that it requires a quorum of nodes to respond successfully to the newEpoch message. Here's an example: What I meant was that step 3 completes at each JN independently; hence the example Hari was giving.
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245850#comment-13245850 ] Suresh Srinivas commented on HDFS-3077: --- bq. so all nodes are now taking part in the quorum. We could optionally at this point have JN3 copy over the edits_1-120 segment from one of the other nodes, but that copy can be asynchronous. It's a repair operation, but given we already have 2 valid replicas, we aren't in any imminent danger of data loss. The proposal in HDFS-3092 is to make JN3 part of the quorum only when it has caught up with the other JNs. Having this simplifies some boundary conditions.
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245910#comment-13245910 ] Suresh Srinivas commented on HDFS-3077: --- For section 2.5.x, the document posted needs to consider the different sets of quorums that become available during recovery. See the newly added appendix to the design in HDFS-3092.
[jira] [Commented] (HDFS-3148) The client should be able to use multiple local interfaces for data transfer
[ https://issues.apache.org/jira/browse/HDFS-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245954#comment-13245954 ] Suresh Srinivas commented on HDFS-3148: --- Eli, given it might be only a few jiras, I agree it might be overkill. I will try to make time when patches on the other multihoming jiras become available.
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244310#comment-13244310 ] Suresh Srinivas commented on HDFS-3077: --- I have posted a design document to HDFS-3092. The solution uses multiple journal daemons and uses ZooKeeper for co-ordination. I believe it is much simpler than this proposal. Interested folks, please take a look at the document and provide your comments.
[jira] [Commented] (HDFS-3092) Enable journal protocol based editlog streaming for standby namenode
[ https://issues.apache.org/jira/browse/HDFS-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244361#comment-13244361 ] Suresh Srinivas commented on HDFS-3092: --- bq. Hi Suresh. I took a look at the design document, and I think it actually shares a lot with what I'm doing in HDFS-3077. Hopefully we can share some portions of the code and design. Sounds good. Will add more details on how the code will be organized, so we can better reuse the code. bq. fencing command ensure that prior NNs can no longer access the JD after it completes The fence command will include a version number obtained from the JournalList ZK node. The higher number wins at the JD; a fence command with a lower version # is rejected. For the scenario you described, NN2, after it rolls JD2 and JD3, updates the JournalList with JD2 and JD3; JD1 will no longer be used. bq. Here are some points I think need elaboration in the design doc: We decided to keep the document light to ensure the details do not distract from the core mechanism. Will add more details in the next version, including some use cases.
[jira] [Commented] (HDFS-3092) Enable journal protocol based editlog streaming for standby namenode
[ https://issues.apache.org/jira/browse/HDFS-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244405#comment-13244405 ] Suresh Srinivas commented on HDFS-3092: --- bq. The situation here is at the beginning of a segment - for example the very first transaction. So, when NN2 rolls, the starting txid of the next segment is 1. I think you need to add an epoch number which is separate from the txid, to distinguish different startings of the same segment. This is what I was thinking - when roll is called from NN, it includes the version # from the JournalList ZK node. It gets recorded in the start segment transaction in the editlog. Does that work? Did I understand your comment correctly?
[jira] [Commented] (HDFS-3092) Enable journal protocol based editlog streaming for standby namenode
[ https://issues.apache.org/jira/browse/HDFS-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1324#comment-1324 ] Suresh Srinivas commented on HDFS-3092: --- bq. Could I request that this work be done on a feature branch, as there are multiple competing proposals to fill the same need? Then, when the features are complete, we can either choose to merge all, some, or none of them, based on their merits. Sounds good. If we plan to share the code, a separate branch for HDFS-3077 does not make sense either.
[jira] [Commented] (HDFS-3092) Enable journal protocol based editlog streaming for standby namenode
[ https://issues.apache.org/jira/browse/HDFS-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244503#comment-13244503 ] Suresh Srinivas commented on HDFS-3092: --- bq. Could I request that this work be done on a feature branch, as there are multiple competing proposals to fill the same need? Then, when the features are complete, we can either choose to merge all, some, or none of them, based on their merits. The problem with this is that we could end up with unnecessary duplication of effort. Like we did in HDFS-1623, we could divide the implementation into multiple jiras. Different people can pick up jiras and contribute. Right now I see a 3K+ line patch in HDFS-3077. Posting such a huge patch makes review, code reuse, etc. difficult. For now, perhaps development can happen separately, and we can see whether code reuse can happen later. Enable journal protocol based editlog streaming for standby namenode Key: HDFS-3092 URL: https://issues.apache.org/jira/browse/HDFS-3092 Project: Hadoop HDFS Issue Type: Improvement Components: ha, name-node Affects Versions: 0.24.0, 0.23.3 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: MultipleSharedJournals.pdf Currently standby namenode relies on reading shared editlogs to stay current with the active namenode, for namespace changes. BackupNode used streaming edits from active namenode for doing the same. This jira is to explore using journal protocol based editlog streams for the standby namenode. A daemon in standby will get the editlogs from the active and write it to local edits. To begin with, the existing standby mechanism of reading from a file, will continue to be used, instead of from shared edits, from the local edits.
[jira] [Commented] (HDFS-3155) Clean up FSDataset implementation related code.
[ https://issues.apache.org/jira/browse/HDFS-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13240167#comment-13240167 ] Suresh Srinivas commented on HDFS-3155: --- Minor comment - weird formatting in DataStorage.java. Other than that, the patch looks good. +1. Clean up FSDataset implementation related code. -- Key: HDFS-3155 URL: https://issues.apache.org/jira/browse/HDFS-3155 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h3155_20120327.patch
[jira] [Commented] (HDFS-3134) harden edit log loader against malformed or malicious input
[ https://issues.apache.org/jira/browse/HDFS-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13240181#comment-13240181 ] Suresh Srinivas commented on HDFS-3134: --- bq. It's clear that we want these exceptions to be thrown as IOException instead of as unchecked exceptions. We also want to avoid out of memory situations. From which methods? Unchecked exceptions indicate programming errors. Blindly turning them into checked exceptions is not a good idea (as you say so in some of your comments). I am not sure which part of the code you are talking about. harden edit log loader against malformed or malicious input --- Key: HDFS-3134 URL: https://issues.apache.org/jira/browse/HDFS-3134 Project: Hadoop HDFS Issue Type: Bug Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Currently, the edit log loader does not handle bad or malicious input sensibly. We can often cause OutOfMemory exceptions, null pointer exceptions, or other unchecked exceptions to be thrown by feeding the edit log loader bad input. In some environments, an out of memory error can cause the JVM process to be terminated. It's clear that we want these exceptions to be thrown as IOException instead of as unchecked exceptions. We also want to avoid out of memory situations. The main task here is to put a sensible upper limit on the lengths of arrays and strings we allocate on command. The other task is to try to avoid creating unchecked exceptions (by dereferencing potentially-NULL pointers, for example). Instead, we should verify ahead of time and give a more sensible error message that reflects the problem with the input. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
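The "sensible upper limit on the lengths of arrays and strings" idea from the issue description can be sketched as a length-guarded read helper. This is an illustrative sketch only (the class name, cap, and method are hypothetical; the real patch would touch the edit log op deserialization paths):

```java
import java.io.DataInputStream;
import java.io.IOException;

// Sketch: validate an untrusted length field before allocating, so bad
// edit log input surfaces as IOException rather than OutOfMemoryError
// or NegativeArraySizeException.
public class GuardedReader {
  // Illustrative cap; a real limit would be chosen per field.
  static final int MAX_BYTES = 1 << 20; // 1 MB

  public static byte[] readByteArray(DataInputStream in) throws IOException {
    int len = in.readInt();
    if (len < 0 || len > MAX_BYTES) {
      throw new IOException("Invalid array length " + len
          + " in edit log (limit " + MAX_BYTES + ")");
    }
    byte[] buf = new byte[len];
    in.readFully(buf); // throws EOFException on truncated input
    return buf;
  }
}
```

The same pattern (check, then allocate) also covers the null-dereference concern: fields are validated up front so later code never sees a value it cannot handle.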
[jira] [Commented] (HDFS-3125) Add a service that enables JournalDaemon
[ https://issues.apache.org/jira/browse/HDFS-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238611#comment-13238611 ] Suresh Srinivas commented on HDFS-3125: --- bq. The new patch needs to check state in journal(..) and startLogSegment(..), and verify(registration) in startLogSegment(..). Otherwise, it looks good. Currently these states are for the listener side only. The method calls above are received over RPC. Given that the RPC server could be controlled from outside, the expectation is that either Service stop has stopped the RPC server or an external application has stopped the server. The missing functionality currently is that the Service should unregister during stop. Currently an errorReport of fatal is used as unregister. I was planning to add an unregister method to the protocol. If that is not needed, I can call errorReport in the service#stop() method. Add a service that enables JournalDaemon Key: HDFS-3125 URL: https://issues.apache.org/jira/browse/HDFS-3125 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: HDFS-3125.patch, HDFS-3125.patch, HDFS-3125.patch In this subtask, I plan to add JournalService. It will provide the following functionality: # Starts an RPC server with JournalProtocolService, or uses the RPC server provided and adds the JournalProtocol service. # Registers with the namenode. # Receives JournalProtocol related requests and hands them over to a listener.
[jira] [Commented] (HDFS-3125) Add a service that enables JournalDaemon
[ https://issues.apache.org/jira/browse/HDFS-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13236828#comment-13236828 ] Suresh Srinivas commented on HDFS-3125: --- The Findbugs warning is unrelated to this change and is tracked in HDFS-3132. Add a service that enables JournalDaemon Key: HDFS-3125 URL: https://issues.apache.org/jira/browse/HDFS-3125 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: HDFS-3125.patch, HDFS-3125.patch In this subtask, I plan to add JournalService. It will provide the following functionality: # Starts an RPC server with JournalProtocolService, or uses the RPC server provided and adds the JournalProtocol service. # Registers with the namenode. # Receives JournalProtocol related requests and hands them over to a listener.
[jira] [Commented] (HDFS-3125) Add a service that enables JournalDaemon
[ https://issues.apache.org/jira/browse/HDFS-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235018#comment-13235018 ] Suresh Srinivas commented on HDFS-3125: --- Yes. I plan to use this for BackupNode as well. Add a service that enables JournalDaemon --- Key: HDFS-3125 URL: https://issues.apache.org/jira/browse/HDFS-3125 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: HDFS-3125.patch In this subtask, I plan to add JournalService. It will provide the following functionality: # Starts an RPC server with JournalProtocolService, or uses the RPC server provided and adds the JournalProtocol service. # Registers with the namenode. # Receives JournalProtocol related requests and hands them over to a listener.
[jira] [Commented] (HDFS-3107) HDFS truncate
[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233768#comment-13233768 ] Suresh Srinivas commented on HDFS-3107: --- bq. I must have missed a smiley That's okay. You missed the smiley in the tweet too. bq. This is very common. I see. I was not aware it was that common. HDFS truncate - Key: HDFS-3107 URL: https://issues.apache.org/jira/browse/HDFS-3107 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, name-node Reporter: Lei Chang Attachments: HDFS_truncate_semantics_Mar15.pdf Original Estimate: 1,344h Remaining Estimate: 1,344h Systems with transaction support often need to undo changes made to the underlying storage when a transaction is aborted. Currently HDFS does not support truncate (a standard Posix operation) which is a reverse operation of append, which makes upper layer applications use ugly workarounds (such as keeping track of the discarded byte range per file in a separate metadata store, and periodically running a vacuum process to rewrite compacted files) to overcome this limitation of HDFS.
[jira] [Commented] (HDFS-3086) Change Datanode not to send storage list in registration - it will be sent in block report
[ https://issues.apache.org/jira/browse/HDFS-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13234080#comment-13234080 ] Suresh Srinivas commented on HDFS-3086: --- Patch looks good. +1. Change Datanode not to send storage list in registration - it will be sent in block report -- Key: HDFS-3086 URL: https://issues.apache.org/jira/browse/HDFS-3086 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h3086_20120320.patch, h3086_20120320b.patch When a datanode is registered, it also sends the storage list. This is not useful since the storage list is already available in block reports.
[jira] [Commented] (HDFS-3105) Add DatanodeStorage information to block recovery
[ https://issues.apache.org/jira/browse/HDFS-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13232737#comment-13232737 ] Suresh Srinivas commented on HDFS-3105: --- Comments: # Not sure how UpdateReplicaUnderRecoveryResponseProto can have storage instead of block? Also, do you need DatanodeStorage, or is just the storageID sufficient? # Please do not update the service protocol version, as this is within a release. This is not used anymore and we need to clean it up at some point. Add DatanodeStorage information to block recovery - Key: HDFS-3105 URL: https://issues.apache.org/jira/browse/HDFS-3105 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node, hdfs client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h3105_20120315.patch, h3105_20120315b.patch, h3105_20120316.patch, h3105_20120316b.patch When recovering a block, the namenode and client do not have the datanode storage information of the block. So the namenode cannot add the block to the corresponding datanode storage block list.
[jira] [Commented] (HDFS-3105) Add DatanodeStorage information to block recovery
[ https://issues.apache.org/jira/browse/HDFS-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13232873#comment-13232873 ] Suresh Srinivas commented on HDFS-3105: --- +1 for the patch. Add DatanodeStorage information to block recovery - Key: HDFS-3105 URL: https://issues.apache.org/jira/browse/HDFS-3105 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node, hdfs client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h3105_20120315.patch, h3105_20120315b.patch, h3105_20120316.patch, h3105_20120316b.patch, h3105_20120319.patch When recovering a block, the namenode and client do not have the datanode storage information of the block. So the namenode cannot add the block to the corresponding datanode storage block list.
[jira] [Commented] (HDFS-3107) HDFS truncate
[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233134#comment-13233134 ] Suresh Srinivas commented on HDFS-3107: --- bq. if a user mistakenly starts to append data to an existing large file, and discovers the mistake, the only recourse is to recreate that file, by rewriting the contents. This is very inefficient. What if user accidentally truncates a file :-) HDFS truncate - Key: HDFS-3107 URL: https://issues.apache.org/jira/browse/HDFS-3107 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, name-node Reporter: Lei Chang Attachments: HDFS_truncate_semantics_Mar15.pdf Original Estimate: 1,344h Remaining Estimate: 1,344h Systems with transaction support often need to undo changes made to the underlying storage when a transaction is aborted. Currently HDFS does not support truncate (a standard Posix operation) which is a reverse operation of append, which makes upper layer applications use ugly workarounds (such as keeping track of the discarded byte range per file in a separate metadata store, and periodically running a vacuum process to rewrite compacted files) to overcome this limitation of HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3087) Decommissioning on NN restart can complete without blocks being replicated
[ https://issues.apache.org/jira/browse/HDFS-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230357#comment-13230357 ] Suresh Srinivas commented on HDFS-3087: --- Kihwal, this is a good bug find. We should fix this. This problem is not that serious, though. Prior to 0.23, we shut down the datanode once decommissioning completed. After HDFS-1547 we do not shut down the DN anymore. The DN continues to show as decommissioned. The expectation is that an admin can, at a later time, shut down the decommissioned DNs and proceed with maintenance of the node. Given this, the question is: after we mark a DN as decommissioned, what happens when a block report comes in? I suspect we move back to decommission-in-progress. How about using the flag that DatanodeDescriptor has for tracking the first block report? We should not mark a DN as decommissioned if a block report has not been received. I also agree that we should not be marking anything as decommissioned until we come out of safemode. Decommissioning on NN restart can complete without blocks being replicated - Key: HDFS-3087 URL: https://issues.apache.org/jira/browse/HDFS-3087 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0, 0.24.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 0.23.0, 0.24.0, 0.23.2, 0.23.3 If a data node is added to the exclude list and the name node is restarted, the decommissioning happens right away on the data node registration. At this point the initial block report has not been sent, so the name node thinks the node has zero blocks and the decommissioning completes very quickly, without replicating the blocks on that node.
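The fix direction suggested in the comment above (do not declare a node decommissioned before its first block report has been processed, or while the namenode is still in safe mode) could look roughly like the guard below. The types and method names are hypothetical, not the actual DatanodeDescriptor API:

```java
// Sketch: a datanode may only complete decommissioning once the namenode
// has processed its first block report and has left safe mode. All names
// here are illustrative only.
public class DecommissionCheck {
  interface Node {
    boolean firstBlockReportReceived();
    int underReplicatedBlocks();
  }

  /** True only when it is safe to declare decommissioning complete. */
  public static boolean canMarkDecommissioned(Node node, boolean inSafeMode) {
    if (inSafeMode) return false;                       // replication not yet active
    if (!node.firstBlockReportReceived()) return false; // block count unknown
    return node.underReplicatedBlocks() == 0;           // all replicas re-created
  }
}
```

The first two checks are exactly what the restart bug violates: on registration the node has reported zero blocks, so the under-replication count is trivially zero even though nothing has been replicated.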
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230629#comment-13230629 ] Suresh Srinivas commented on HDFS-3077: --- bq. but like Einstein said, no simpler! It's all relative :-) BTW it would be good to write a design for this. That avoids lengthy comments and keeps the summary of what is proposed in one place, instead of scattering it across multiple comments. bq. This is mostly great – so long as you have an external fencing strategy which prevents the old active from attempting to continue to write after the new active is trying to read. External fencing is not needed, given that active daemons have the ability to fence. bq. it gets the loggers to promise not to accept edits from the old active The daemons can stop accepting writes when they realize that the active lock is no longer held by the writer. Clearly an advantage of an active daemon compared to using passive storage. bq. But, we still have one more problem: given some txid N, we might have multiple actives that have tried to write the same transaction ID. Example scenario: The case of writes making it through some daemons can also be solved. The writes that have made it through W daemons win. The others are marked not in sync and need to sync up. Explanation to follow. The solution we are building is specific to namenode editlogs. There is only one active writer (as Ivan brought up earlier). Here is the outline I am thinking of. Let's start with a steady state with K of N journal daemons. When a journal daemon fails, we roll the edits. When a journal daemon joins, we roll the edits. A new journal daemon could start syncing the other finalized edits, while keeping track of edits in progress. We also keep track of the list of the active daemons in ZooKeeper. Rolling gives a logical point for a newly joined daemon to sync up (sort of like a generation stamp). 
During failover, the new active gets, from the actively written journals, the point up to which it has to sync. It then rolls the edits at that point. Rolling also gives you a way to discard extra journal records that made it to W daemons during failover. When there are overlapping records, say e1-105 and e100-200, you read 100-105 from the second editlog and discard them from the first editlog. Again, there are scenarios that are missing here. I plan to post more details in a design on this. Quorum-based protocol for reading and writing edit logs --- Key: HDFS-3077 URL: https://issues.apache.org/jira/browse/HDFS-3077 Project: Hadoop HDFS Issue Type: New Feature Components: ha, name-node Reporter: Todd Lipcon Assignee: Todd Lipcon Currently, one of the weak points of the HA design is that it relies on shared storage such as an NFS filer for the shared edit log. One alternative that has been proposed is to depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated edit log on commodity hardware. This JIRA is to implement another alternative, based on a quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only by HDFS's needs rather than more generic use cases. More details to follow.
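The overlap rule described above (given segments 1-105 and 100-200, serve txids 100-105 from the later-starting log and discard them from the earlier one) can be sketched as follows. This is an illustrative sketch only, not code from any patch:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch: resolve overlapping edit segments after failover. Where two
// segments overlap, the later-starting segment supplies the overlapping
// txids, and the earlier segment is truncated to end just before it.
public class SegmentOverlap {
  static final class Segment {
    final long first, last; // inclusive txid range
    Segment(long first, long last) { this.first = first; this.last = last; }
  }

  /** Truncate earlier segments so that txid ranges no longer overlap. */
  public static List<Segment> resolve(List<Segment> segs) {
    List<Segment> sorted = new ArrayList<>(segs);
    sorted.sort(Comparator.comparingLong(s -> s.first));
    List<Segment> out = new ArrayList<>();
    for (int i = 0; i < sorted.size(); i++) {
      Segment s = sorted.get(i);
      long end = s.last;
      if (i + 1 < sorted.size()) {
        // discard txids that the next (later-starting) segment also carries
        end = Math.min(end, sorted.get(i + 1).first - 1);
      }
      if (end >= s.first) out.add(new Segment(s.first, end));
    }
    return out;
  }
}
```

For the example in the comment, resolve([1-105, 100-200]) yields [1-99, 100-200]: the first log keeps only 1-99, and 100-105 is read from the second.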
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229196#comment-13229196 ] Suresh Srinivas commented on HDFS-3077: --- Todd, as indicated yesterday, I have created HDFS-3092 to contribute the early prototype that uses the journal protocol on the standby. This could be evolved into the standalone daemon you mention in the first part of your description. I agree with most of the description in this jira, as we were thinking along the same lines. However, I am not sure I understand what the benefits of implementing ZAB are. My preference is to keep the editlog path as simple as possible. Currently we have multiple copies of edits. Choosing the right one is an issue. That could be solved in simple ways, instead of adding complexity to this layer. Quorum-based protocol for reading and writing edit logs --- Key: HDFS-3077 URL: https://issues.apache.org/jira/browse/HDFS-3077 Project: Hadoop HDFS Issue Type: New Feature Components: ha, name-node Reporter: Todd Lipcon Assignee: Todd Lipcon Currently, one of the weak points of the HA design is that it relies on shared storage such as an NFS filer for the shared edit log. One alternative that has been proposed is to depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated edit log on commodity hardware. This JIRA is to implement another alternative, based on a quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only by HDFS's needs rather than more generic use cases. More details to follow.
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229325#comment-13229325 ] Suresh Srinivas commented on HDFS-1623: --- Minutes from the HDFS Namenode HA - Next Steps meeting: We had a meeting to discuss the status of Namenode HA and the remaining work items. Attendees included Aaron Myers, Eli Collins, Hairong Kuang, Jitendra Pandey, Hari Mankude, Pritam Damania, Suresh Srinivas, Sanjay Radia, Tomasz Nykiel, Todd Lipcon. The following topics were discussed: *Clientside failover* # DFSClient failover - Currently configuration based failover is available. We decided we will consider including ZK and DNS based failovers, along the lines of configuration based failover. See HDFS-2839 for details. # IP failover was discussed and we decided that it is an option that will be added. # There was discussion around using the NameServiceID as the logical URL. We need to use an appropriate abstraction here. This discussion will continue on HDFS-2839. *Failover Controller* In the HDFS-1623 design, the failover controller is a separate process. We discussed whether we should incorporate it within the NN, for now. The decision was to continue with the design from HDFS-1623 and keep it as a separate process. *Use of BackupNode and Journal protocol* The current HA implementation uses NFS shared storage. In order to eliminate this need, daemons based on the Journal protocol that receive streaming edits from the active namenode could be used. Some activity around using this in the standby namenode, and also running such standalone daemons, is starting. See HDFS-3077 and HDFS-3092 for details. Folks, please add if I missed anything. 
High Availability Framework for HDFS NN --- Key: HDFS-1623 URL: https://issues.apache.org/jira/browse/HDFS-1623 Project: Hadoop HDFS Issue Type: New Feature Reporter: Sanjay Radia Fix For: 0.24.0, 0.23.3 Attachments: HA-tests.pdf, HDFS-1623.rel23.patch, HDFS-1623.trunk.patch, HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, dfsio-results.tsv, ha-testplan.pdf, ha-testplan.tex -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229715#comment-13229715 ] Suresh Srinivas commented on HDFS-1623: --- BTW in the meeting minutes, in list of attendees, I left out Konstantin Shvachko, Colin McCabe and Mayank Bansal. High Availability Framework for HDFS NN --- Key: HDFS-1623 URL: https://issues.apache.org/jira/browse/HDFS-1623 Project: Hadoop HDFS Issue Type: New Feature Reporter: Sanjay Radia Fix For: 0.24.0, 0.23.3 Attachments: HA-tests.pdf, HDFS-1623.rel23.patch, HDFS-1623.trunk.patch, HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, dfsio-results.tsv, ha-testplan.pdf, ha-testplan.tex -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3005) ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..)
[ https://issues.apache.org/jira/browse/HDFS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229811#comment-13229811 ] Suresh Srinivas commented on HDFS-3005: --- FSVolume is synchronized by FSDataset. However FSVolume#checkDirs() is synchronized by FSVolumeset. So either we fix that issue in this jira or leave the TODO back and fix it in another jira. ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..) Key: HDFS-3005 URL: https://issues.apache.org/jira/browse/HDFS-3005 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.24.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: HDFS-3005.patch, h3005_20120312.patch, h3005_20120314.patch Saw this in [build #1888|https://builds.apache.org/job/PreCommit-HDFS-Build/1888//testReport/org.apache.hadoop.hdfs.server.datanode/TestMulitipleNNDataBlockScanner/testBlockScannerAfterRestart/]. {noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.getDfsUsed(FSDataset.java:557) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:809) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.access$1400(FSDataset.java:774) at org.apache.hadoop.hdfs.server.datanode.FSDataset.getDfsUsed(FSDataset.java:1124) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.sendHeartBeat(BPOfferService.java:406) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.offerService(BPOfferService.java:490) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.run(BPOfferService.java:635) at java.lang.Thread.run(Thread.java:662) {noformat} -- This message is automatically generated by JIRA. 
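The race discussed above (FSVolume guarded by FSDataset's lock while checkDirs() runs under FSVolumeSet's) boils down to iterating a HashMap that another thread may mutate mid-iteration. A minimal sketch of the hazard and a snapshot-style fix follows; the field and method names are illustrative, not the real FSVolume code:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: HashMap iterators are fail-fast, so summing values while another
// thread (e.g. a checkDirs()-style pass) mutates the map can throw
// ConcurrentModificationException. Summing over a snapshot taken under the
// writers' lock avoids it. Names here are illustrative only.
public class DfsUsedSketch {
  final Map<String, Long> usagePerBlockPool = new HashMap<>();

  long getDfsUsedUnsafe() {
    long total = 0;
    for (Map.Entry<String, Long> e : usagePerBlockPool.entrySet()) {
      total += e.getValue(); // CME if the map is mutated mid-iteration
    }
    return total;
  }

  long getDfsUsedSafe() {
    // Snapshot under the same lock the writers take (sketched here as
    // synchronizing on the map itself), then sum without holding the lock.
    List<Long> snapshot;
    synchronized (usagePerBlockPool) {
      snapshot = new ArrayList<>(usagePerBlockPool.values());
    }
    long total = 0;
    for (long v : snapshot) total += v;
    return total;
  }
}
```

The key point is that both readers and writers must agree on a single lock; synchronizing the read in one class while the mutation is synchronized elsewhere (as in the FSDataset/FSVolumeSet split above) leaves the iterator unprotected.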
[jira] [Commented] (HDFS-3005) ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..)
[ https://issues.apache.org/jira/browse/HDFS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229838#comment-13229838 ] Suresh Srinivas commented on HDFS-3005: --- +1 for the change ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..) Key: HDFS-3005 URL: https://issues.apache.org/jira/browse/HDFS-3005 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.24.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: HDFS-3005.patch, h3005_20120312.patch, h3005_20120314.patch, h3005_20120314b.patch Saw this in [build #1888|https://builds.apache.org/job/PreCommit-HDFS-Build/1888//testReport/org.apache.hadoop.hdfs.server.datanode/TestMulitipleNNDataBlockScanner/testBlockScannerAfterRestart/]. {noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.getDfsUsed(FSDataset.java:557) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:809) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.access$1400(FSDataset.java:774) at org.apache.hadoop.hdfs.server.datanode.FSDataset.getDfsUsed(FSDataset.java:1124) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.sendHeartBeat(BPOfferService.java:406) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.offerService(BPOfferService.java:490) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.run(BPOfferService.java:635) at java.lang.Thread.run(Thread.java:662) {noformat} -- This message is automatically generated by JIRA. 
[jira] [Commented] (HDFS-3056) Add an interface for DataBlockScanner logging
[ https://issues.apache.org/jira/browse/HDFS-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13226701#comment-13226701 ] Suresh Srinivas commented on HDFS-3056: --- +1 for the patch. Add an interface for DataBlockScanner logging - Key: HDFS-3056 URL: https://issues.apache.org/jira/browse/HDFS-3056 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h3056_20120306.patch, h3056_20120307.patch, h3056_20120307b.patch Some methods in the FSDatasetInterface are used only for logging in DataBlockScanner. These methods should be separated out to an new interface. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3021) Use generic type to declare FSDatasetInterface
[ https://issues.apache.org/jira/browse/HDFS-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220243#comment-13220243 ] Suresh Srinivas commented on HDFS-3021: --- Comments (some unrelated to the changes made by the patch): # In FSDatasetInterface.java, can you please remove the unnecessary IOException thrown by BlockWriteStreams#close() # Remove the TimeoutException thrown from TestDatanodeVolumeFailureToleration#testVolumeConfig # FSDataset.java - Remove the unnecessary cast (BlockVolumeChoosingPolicy<FSVolume>)ReflectionUtils # DatanodeTestUtils - use dn.getFSDataset() instead of directly using dn.data Use generic type to declare FSDatasetInterface -- Key: HDFS-3021 URL: https://issues.apache.org/jira/browse/HDFS-3021 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h3021_20120227.patch, h3021_20120228.patch Currently we have to cast FSVolumeInterface to FSVolume in FSDataset. Using a generic type could avoid it.
[jira] [Commented] (HDFS-3034) Remove the deprecated Syncable.sync() method
[ https://issues.apache.org/jira/browse/HDFS-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220247#comment-13220247 ] Suresh Srinivas commented on HDFS-3034: --- +1 for the patch. Remove the deprecated Syncable.sync() method Key: HDFS-3034 URL: https://issues.apache.org/jira/browse/HDFS-3034 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h3034_20120229.patch This is a part of HADOOP-8124.
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218397#comment-13218397 ] Suresh Srinivas commented on HDFS-1623: --- The difference between HA-on and HA-off is that the HA-on mode actually fsyncs all of these block allocations. Should the benchmark be re-run with HDFS-3020? It might bring HA-on close to HA-off. High Availability Framework for HDFS NN --- Key: HDFS-1623 URL: https://issues.apache.org/jira/browse/HDFS-1623 Project: Hadoop HDFS Issue Type: New Feature Reporter: Sanjay Radia Assignee: Sanjay Radia Attachments: HA-tests.pdf, HDFS-1623.trunk.patch, HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, dfsio-results.tsv, ha-testplan.pdf, ha-testplan.tex
[jira] [Commented] (HDFS-3004) Create Offline NameNode recovery tool
[ https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218416#comment-13218416 ] Suresh Srinivas commented on HDFS-3004: --- Colin, good writeup. Let's consider two scenarios: # The last entry in the editlog is corrupt (most likely because the process was not shut down cleanly). # An editlog entry in the middle is corrupt, followed by clean entries (very unlikely). The first one is easy to handle. The recovery tool prints an error with information about the total size of the editlog and the offset where the error was encountered. If it is close enough to the end of the file, the operator knows only the last few records are invalid. For the second one, one may not be able to read the edits past that point at all. How can we handle this? We could add periodic markers in the editlog and skip to the next marker on this type of error. Still, it is possible that the namespace is so inconsistent that we cannot load the edits. Typically the Namenode is configured to store edits in multiple directories. The tool should handle this: if one of the copies is corrupt and the other is not, it should report that. Create Offline NameNode recovery tool - Key: HDFS-3004 URL: https://issues.apache.org/jira/browse/HDFS-3004 Project: Hadoop HDFS Issue Type: New Feature Components: tools Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-3004__namenode_recovery_tool.txt We've been talking about creating a tool which can process NameNode edit logs and image files offline. This tool would be similar to a fsck for a conventional filesystem. It would detect inconsistencies and malformed data. In cases where it was possible, and the operator asked for it, it would try to correct the inconsistency. It's probably better to call this nameNodeRecovery or similar, rather than fsck, since we already have a separate and unrelated mechanism which we refer to as fsck.
The use case here is that the NameNode data is corrupt for some reason, and we want to fix it. Obviously, we would prefer never to get into this case. In a perfect world, we never would. However, bad data on disk can happen from time to time, because of hardware errors or misconfigurations. In the past we have had to correct it manually, which is time-consuming and which can result in downtime. I would like to reuse as much code as possible from the NameNode in this tool. Hopefully, the effort that is spent developing this will also make the NameNode editLog and image processing even more robust than it already is. Another approach that we have discussed is NOT having an offline tool, but just having a switch supplied to the NameNode, like --auto-fix or --force-fix. In that case, the NameNode would attempt to guess when data was missing or incomplete in the EditLog or Image, rather than aborting as it does now. Like the proposed fsck tool, this switch could be used to get users back on their feet quickly after a problem developed. I am not in favor of this approach, because there is a danger that users could supply this flag in cases where it is not appropriate. This risk does not exist for an offline fsck tool, since it would have to be run explicitly. However, I wanted to mention this proposal here for completeness.
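The "periodic markers" idea in the comment above can be sketched as a resynchronization scan: on hitting a corrupt record, skip forward to the next marker and resume parsing there. The marker value and method below are hypothetical illustrations, not HDFS's actual on-disk editlog format:

```java
public class MarkerScan {
    // Hypothetical 4-byte sync marker; NOT the real editlog layout.
    static final byte[] MARKER = {(byte) 0xDE, (byte) 0xAD, (byte) 0xBE, (byte) 0xEF};

    // Scans 'log' starting at 'from' for the next marker. Returns the offset
    // of the first byte AFTER the marker (where parsing can resume), or -1
    // if no further marker exists.
    public static int skipToNextMarker(byte[] log, int from) {
        outer:
        for (int i = from; i + MARKER.length <= log.length; i++) {
            for (int j = 0; j < MARKER.length; j++) {
                if (log[i + j] != MARKER[j]) {
                    continue outer; // mismatch; try the next starting offset
                }
            }
            return i + MARKER.length;
        }
        return -1;
    }
}
```

The trade-off Suresh notes still applies: even after resynchronizing, the records skipped between the corruption and the marker are lost, so the resulting namespace may be inconsistent.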
[jira] [Commented] (HDFS-3020) Auto-logSync based on edit log buffer size broken
[ https://issues.apache.org/jira/browse/HDFS-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218859#comment-13218859 ] Suresh Srinivas commented on HDFS-3020: --- Sounds good. Auto-logSync based on edit log buffer size broken - Key: HDFS-3020 URL: https://issues.apache.org/jira/browse/HDFS-3020 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.22.0, 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Attachments: hdfs-3020.txt, hdfs-3020.txt HDFS-1112 added a feature whereby the edit log automatically calls logSync() if the buffered data crosses a threshold. However, the code checks {{bufReady.size()}} rather than {{bufCurrent.size()}} -- which is incorrect since the writes themselves go into {{bufCurrent}}.
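The bug description relies on the edit log's double-buffer design: new edits accumulate in {{bufCurrent}} while a previous batch in {{bufReady}} is flushed, and the two swap at sync time. A toy Java sketch (the field names mirror the description; the threshold, types, and class name are invented) shows why the auto-sync check must look at {{bufCurrent}}:

```java
import java.util.ArrayList;
import java.util.List;

public class DoubleBufferSketch {
    static final int AUTO_SYNC_THRESHOLD = 3; // invented for the demo

    final List<String> bufCurrent = new ArrayList<>(); // receives new edits
    final List<String> bufReady = new ArrayList<>();   // batch being flushed

    // Appends an edit; returns true when the writer should trigger an
    // automatic logSync().
    boolean logEdit(String op) {
        bufCurrent.add(op);
        // Correct check: writes land in bufCurrent. The HDFS-3020 bug
        // tested bufReady.size(), which is empty between syncs, so the
        // automatic sync never fired.
        return bufCurrent.size() >= AUTO_SYNC_THRESHOLD;
    }

    // Called at the start of logSync(): swap roles so writers can continue
    // filling bufCurrent while bufReady is written to disk.
    void setReadyToFlush() {
        bufReady.addAll(bufCurrent);
        bufCurrent.clear();
    }
}
```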
[jira] [Commented] (HDFS-3008) Negative caching of local addrs doesn't work
[ https://issues.apache.org/jira/browse/HDFS-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215630#comment-13215630 ] Suresh Srinivas commented on HDFS-3008: --- +1 for the patch. Negative caching of local addrs doesn't work Key: HDFS-3008 URL: https://issues.apache.org/jira/browse/HDFS-3008 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1, 1.1.0 Reporter: Eli Collins Assignee: Eli Collins Attachments: hdfs-3008.txt HDFS-2653 added negative caching of local addrs, however it still goes through the fall-through path every time if the address is non-local.
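The gist of the bug: the HDFS-2653 cache only paid off for local addresses, so every lookup of a non-local address still took the slow path. A hypothetical Java sketch (the real DFSClient code differs; the class, the prefix check, and the counter are illustration only) of caching both outcomes:

```java
import java.util.HashMap;
import java.util.Map;

public class LocalAddrCache {
    private final Map<String, Boolean> cache = new HashMap<>();
    private int slowPathCalls = 0;

    // Hypothetical slow check standing in for the real interface/address scan.
    private boolean slowIsLocal(String addr) {
        slowPathCalls++;
        return addr.startsWith("127.");
    }

    // Caches BOTH outcomes. Caching only 'true' results leaves non-local
    // addresses on the slow path every time -- the HDFS-3008 symptom.
    public boolean isLocal(String addr) {
        Boolean cached = cache.get(addr);
        if (cached != null) {
            return cached; // hit for local and non-local addresses alike
        }
        boolean local = slowIsLocal(addr);
        cache.put(addr, local);
        return local;
    }

    public int getSlowPathCalls() {
        return slowPathCalls;
    }
}
```

With this shape, a repeated lookup of the same non-local address never re-runs the slow check.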
[jira] [Commented] (HDFS-3009) DFSclient islocaladdress() can use similar routine in netutils
[ https://issues.apache.org/jira/browse/HDFS-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215832#comment-13215832 ] Suresh Srinivas commented on HDFS-3009: --- +1. The code in the replacing method is identical to the code it replaces. No need to add unit tests, since it is already covered in TestNetUtils. DFSclient islocaladdress() can use similar routine in netutils -- Key: HDFS-3009 URL: https://issues.apache.org/jira/browse/HDFS-3009 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.0, 0.24.0 Reporter: Hari Mankude Assignee: Hari Mankude Priority: Trivial Attachments: HDFS-3009.patch isLocalAddress() in dfsclient can use the similar function in netutils
[jira] [Commented] (HDFS-2978) The NameNode should expose name dir statuses via JMX
[ https://issues.apache.org/jira/browse/HDFS-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216101#comment-13216101 ] Suresh Srinivas commented on HDFS-2978: --- Minor comment - In NameNodeMXBean.java, can you please elaborate in the javadoc on what name dirs mean. +1 for the patch. The NameNode should expose name dir statuses via JMX Key: HDFS-2978 URL: https://issues.apache.org/jira/browse/HDFS-2978 Project: Hadoop HDFS Issue Type: New Feature Components: name-node Affects Versions: 0.23.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-2978.patch We currently display this info on the NN web UI, so users who wish to monitor this must either do it manually or parse HTML. We should publish this information via JMX.
[jira] [Commented] (HDFS-3006) Webhdfs SETOWNER call returns incorrect content-type
[ https://issues.apache.org/jira/browse/HDFS-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216142#comment-13216142 ] Suresh Srinivas commented on HDFS-3006: --- Minor comment: one line in DatanodeWebHdfsMethods.java and a couple in NamenodeWebHdfsMethods.java exceed 80 chars. +1 for the patch. Webhdfs SETOWNER call returns incorrect content-type -- Key: HDFS-3006 URL: https://issues.apache.org/jira/browse/HDFS-3006 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: bc Wong Assignee: Tsz Wo (Nicholas), SZE Attachments: h3006_20120223.patch, h3006_20120224.patch, h3006_20120224b.patch, h3006_20120224c.patch, h3006_20120224c_branch1.patch The SETOWNER call returns an empty body. But the header has Content-Type: application/json, which is a contradiction (an empty string is not valid json). This appears to happen for SETTIMES and SETPERMISSION as well.
[jira] [Commented] (HDFS-2978) The NameNode should expose name dir statuses via JMX
[ https://issues.apache.org/jira/browse/HDFS-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216221#comment-13216221 ] Suresh Srinivas commented on HDFS-2978: --- Exposing Strings and primitives via JMX is straightforward, but not so for more complicated data structures consumed by non-Java programs. These interfaces were intended as a replacement for screen-scraping the Namenode web UI, to be used in scripts etc. over a local JMX connection. For such usages, the information is exposed in a Java-independent way. The NameNode should expose name dir statuses via JMX Key: HDFS-2978 URL: https://issues.apache.org/jira/browse/HDFS-2978 Project: Hadoop HDFS Issue Type: New Feature Components: name-node Affects Versions: 0.23.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-2978.patch We currently display this info on the NN web UI, so users who wish to monitor this must either do it manually or parse HTML. We should publish this information via JMX.
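The point about Strings and primitives can be illustrated with a standard MXBean. The bean, object name, and JSON payload below are hypothetical (the real NameNodeMXBean differs); the pattern shown is that plain String attributes survive translation to non-Java clients, e.g. via an HTTP /jmx endpoint:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// JMX convention: an interface whose name ends in "MXBean" is treated as an
// MXBean; plain String/primitive attributes map cleanly for non-Java clients.
interface NameDirStatusMXBean {
    String getNameDirStatuses(); // e.g. pre-serialized JSON as a plain String
}

public class NameDirStatus implements NameDirStatusMXBean {
    @Override
    public String getNameDirStatuses() {
        return "{\"active\":[\"/data/nn1\"],\"failed\":[]}";
    }

    // Registers the bean and reads the attribute back through the MBean
    // server -- the same path a JMX (or HTTP /jmx) client would take.
    // Returns null if registration or the read fails.
    public static String registerAndRead() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("Example:name=NameDirStatus");
            server.registerMBean(new NameDirStatus(), name);
            return (String) server.getAttribute(name, "NameDirStatuses");
        } catch (Exception e) {
            return null;
        }
    }
}
```

Serializing the structure to a String on the server side is one way to sidestep the CompositeData machinery that complicated types would otherwise require.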
[jira] [Commented] (HDFS-2995) start-dfs.sh tries to start 2NN everywhere
[ https://issues.apache.org/jira/browse/HDFS-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214801#comment-13214801 ] Suresh Srinivas commented on HDFS-2995: --- Could this be related to HDFS-2893? start-dfs.sh tries to start 2NN everywhere -- Key: HDFS-2995 URL: https://issues.apache.org/jira/browse/HDFS-2995 Project: Hadoop HDFS Issue Type: Sub-task Components: scripts Affects Versions: HA branch (HDFS-1623) Reporter: Todd Lipcon When I run start-dfs.sh it tries to start a 2NN on every node in the cluster. This despite: [todd@c1120 hadoop-active]$ ./bin/hdfs getconf -secondaryNameNodes Incorrect configuration: secondary namenode address dfs.namenode.secondary.http-address is not configured. Thankfully they do not start :)
[jira] [Commented] (HDFS-2998) OfflineImageViewer and ImageVisitor should be annotated public
[ https://issues.apache.org/jira/browse/HDFS-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214830#comment-13214830 ] Suresh Srinivas commented on HDFS-2998: --- bq. It's intended for subclassing, so it should be annotated as the public API that it is. Why are they a public API? When we added these classes, the intent was to use them within HDFS and not make them available publicly. OfflineImageViewer and ImageVisitor should be annotated public -- Key: HDFS-2998 URL: https://issues.apache.org/jira/browse/HDFS-2998 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.23.1 Reporter: Aaron T. Myers The OfflineImageViewer is currently annotated as InterfaceAudience.Private. It's intended for subclassing, so it should be annotated as the public API that it is. The ImageVisitor class should similarly be annotated public (evolving is fine). Note that it should also be changed to be public; it's currently package-private, which means that users have to cheat with their subclass package name.
[jira] [Commented] (HDFS-2998) OfflineImageViewer and ImageVisitor should be annotated public
[ https://issues.apache.org/jira/browse/HDFS-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214885#comment-13214885 ] Suresh Srinivas commented on HDFS-2998: --- We could consider making it public. We should then also make public all the other classes that are referenced by OIV but not currently marked public. OfflineImageViewer and ImageVisitor should be annotated public -- Key: HDFS-2998 URL: https://issues.apache.org/jira/browse/HDFS-2998 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.23.1 Reporter: Aaron T. Myers The OfflineImageViewer is currently annotated as InterfaceAudience.Private. It's intended for subclassing, so it should be annotated as the public API that it is. The ImageVisitor class should similarly be annotated public (evolving is fine). Note that it should also be changed to be public; it's currently package-private, which means that users have to cheat with their subclass package name.
[jira] [Commented] (HDFS-3002) TestNameNodeMetrics need not wait for metrics update with new metrics framework
[ https://issues.apache.org/jira/browse/HDFS-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214890#comment-13214890 ] Suresh Srinivas commented on HDFS-3002: --- In that jira, we are still adding wait time. I agree that our wait times for an event to happen could make the tests flaky. The main point of this jira is to not wait for metrics updates. Perhaps I can add this comment to that jira and close this. TestNameNodeMetrics need not wait for metrics update with new metrics framework --- Key: HDFS-3002 URL: https://issues.apache.org/jira/browse/HDFS-3002 Project: Hadoop HDFS Issue Type: Improvement Components: test Affects Versions: 0.23.0, 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Priority: Trivial Attachments: HDFS-3002.patch With the older metrics framework, the namenode metrics were updated by the replication thread. This required the test to wait for the replication interval. This is no longer necessary with the metrics2 framework.
[jira] [Commented] (HDFS-2966) TestNameNodeMetrics tests can fail under load
[ https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214891#comment-13214891 ] Suresh Srinivas commented on HDFS-2966: --- Steve, please see HDFS-3002 - comment https://issues.apache.org/jira/browse/HDFS-3002?focusedCommentId=13214890&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13214890 TestNameNodeMetrics tests can fail under load - Key: HDFS-2966 URL: https://issues.apache.org/jira/browse/HDFS-2966 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.24.0 Environment: OS/X running intellij IDEA, firefox, winxp in a virtualbox. Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Fix For: 0.24.0, 0.23.2 Attachments: HDFS-2966.patch, HDFS-2966.patch I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of running the HDFS tests on a desktop without enough memory for all the programs trying to run. Things got swapped out and the tests failed as the DN heartbeats didn't come in on time. The tests both rely on {{waitForDeletion()}} to block the tests until the delete operation has completed, but all it does is sleep for the same number of seconds as there are datanodes. This is too brittle: it may work on a lightly-loaded system, but not on a system under heavy load where it is taking longer to replicate than expected. Immediate fix: double or triple the sleep time? Better fix: have the thread block until all the DN heartbeats have finished.
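The "better fix" suggested above is the usual pattern for de-flaking such tests: replace a fixed-length sleep with a bounded polling wait that blocks until the condition holds or a deadline passes. A generic Java sketch (Hadoop's test utilities have their own variants; this class is illustrative only):

```java
import java.util.function.BooleanSupplier;

public class WaitFor {
    // Polls 'condition' every pollMs until it returns true or timeoutMs
    // elapses. Returns whether the condition was observed to be true.
    public static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // timed out: the caller can fail with a diagnostic
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }
}
```

On a loaded machine the wait simply takes longer instead of failing, while on an idle one the test returns as soon as the condition (e.g. all DN heartbeats processed) is satisfied.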
[jira] [Commented] (HDFS-3002) TestNameNodeMetrics need not wait for metrics update with new metrics framework
[ https://issues.apache.org/jira/browse/HDFS-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214894#comment-13214894 ] Suresh Srinivas commented on HDFS-3002: --- After looking at the patch on HDFS-2966, these are unrelated issues. I am removing the updateMetrics() method calls. TestNameNodeMetrics need not wait for metrics update with new metrics framework --- Key: HDFS-3002 URL: https://issues.apache.org/jira/browse/HDFS-3002 Project: Hadoop HDFS Issue Type: Improvement Components: test Affects Versions: 0.23.0, 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Priority: Trivial Attachments: HDFS-3002.patch With the older metrics framework, the namenode metrics were updated by the replication thread. This required the test to wait for the replication interval. This is no longer necessary with the metrics2 framework.
[jira] [Commented] (HDFS-2987) DNA_SHUTDOWN command is never sent by the NN
[ https://issues.apache.org/jira/browse/HDFS-2987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214930#comment-13214930 ] Suresh Srinivas commented on HDFS-2987: --- I did consider removing it earlier but decided to leave it alone. It might come in handy in the future, even though it is not used currently. DNA_SHUTDOWN command is never sent by the NN Key: HDFS-2987 URL: https://issues.apache.org/jira/browse/HDFS-2987 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Todd Lipcon Priority: Trivial Labels: newbie The DataNode has a code path to handle a DNA_SHUTDOWN command, but in fact this command has never been sent by the NN (it was introduced by HADOOP-641 in 0.8.0!)
[jira] [Commented] (HDFS-2978) The NameNode should expose name dir statuses as metrics
[ https://issues.apache.org/jira/browse/HDFS-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214933#comment-13214933 ] Suresh Srinivas commented on HDFS-2978: --- One of the things that was done in 0.23 (available in 1.0) was to add everything exposed in the Namenode web UI to the JMX interfaces on the namenode. This interface is accessible over http. I believe this information was added to the NN web UI after that. We should add it to the JMX interfaces. One thing I notice with some of these jiras is, Aaron, you keep saying metrics. Metrics to me is an interface where we keep track of peg counts, rates, etc., for measurement purposes. Information such as dirs belongs to the management interface, and that is where JMX interfaces come in. So we should add this to the JMX interface. Also, we should enforce that any info added to the NN web UI is also added to the JMX interface. The NameNode should expose name dir statuses as metrics --- Key: HDFS-2978 URL: https://issues.apache.org/jira/browse/HDFS-2978 Project: Hadoop HDFS Issue Type: New Feature Components: name-node Affects Versions: 0.23.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers We currently display this info on the NN web UI, so users who wish to monitor this must either do it manually or parse HTML. We should publish this information as proper metrics.
[jira] [Commented] (HDFS-3005) ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..)
[ https://issues.apache.org/jira/browse/HDFS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215048#comment-13215048 ] Suresh Srinivas commented on HDFS-3005: --- I added that comment when doing the Federation work. The problem was with the existing code. I wanted to clean up that part of the code and did not get to it. Now that it is an issue, let's fix it :-) ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..) Key: HDFS-3005 URL: https://issues.apache.org/jira/browse/HDFS-3005 Project: Hadoop HDFS Issue Type: Bug Components: data-node Reporter: Tsz Wo (Nicholas), SZE Saw this in [build #1888|https://builds.apache.org/job/PreCommit-HDFS-Build/1888//testReport/org.apache.hadoop.hdfs.server.datanode/TestMulitipleNNDataBlockScanner/testBlockScannerAfterRestart/]. {noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.getDfsUsed(FSDataset.java:557) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:809) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.access$1400(FSDataset.java:774) at org.apache.hadoop.hdfs.server.datanode.FSDataset.getDfsUsed(FSDataset.java:1124) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.sendHeartBeat(BPOfferService.java:406) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.offerService(BPOfferService.java:490) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.run(BPOfferService.java:635) at java.lang.Thread.run(Thread.java:662) {noformat}
[jira] [Commented] (HDFS-2971) some improvements to the manual NN metadata recovery tools
[ https://issues.apache.org/jira/browse/HDFS-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215057#comment-13215057 ] Suresh Srinivas commented on HDFS-2971: --- bq. programmer using a hex editor, which is the current situation. OfflineEditsViewer was created for this reason. It can convert the edits to a human-readable file; you can edit that file and convert it back to an editlog. It is not perfect though. Another idea we had considered was ignoring the last corrupt edit (the most common corruption case) if you start the namenode with a flag. some improvements to the manual NN metadata recovery tools -- Key: HDFS-2971 URL: https://issues.apache.org/jira/browse/HDFS-2971 Project: Hadoop HDFS Issue Type: Improvement Components: libhdfs Affects Versions: 1.1.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: 2012-02-17_0001-OEV-enhancements-pt.-2.patch, HDFS-2971__print_highest_generation_stamp.txt Some improvements to the manual NN metadata recovery tools. Specifically, we want the Offline Edit Viewer (oev) tool to print out the highest generation stamp that was encountered when processing the edit log. We also want OEV to look for large gaps in the generation stamp, as these can indicate corruption. The minimum gap to look for should be configurable with -G or --genStampGap.
[jira] [Commented] (HDFS-2987) DNA_SHUTDOWN command is never sent by the NN
[ https://issues.apache.org/jira/browse/HDFS-2987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215059#comment-13215059 ] Suresh Srinivas commented on HDFS-2987: --- It is fairly simple code and should work. The Datanode calls shutdown() on receiving this command, which is how the DN is shut down in many places. DNA_SHUTDOWN command is never sent by the NN Key: HDFS-2987 URL: https://issues.apache.org/jira/browse/HDFS-2987 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Todd Lipcon Priority: Trivial Labels: newbie The DataNode has a code path to handle a DNA_SHUTDOWN command, but in fact this command has never been sent by the NN (it was introduced by HADOOP-641 in 0.8.0!)
[jira] [Commented] (HDFS-3002) TestNameNodeMetrics need not wait for metrics update with new metrics framework
[ https://issues.apache.org/jira/browse/HDFS-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215061#comment-13215061 ] Suresh Srinivas commented on HDFS-3002: --- The Thread.sleep() part and waitForDeletion() are what HDFS-2966 is mainly concerned about. However, metrics are being used to detect the completion of the activity the test is waiting for. I can rebase HDFS-2966 after committing this patch. TestNameNodeMetrics need not wait for metrics update with new metrics framework --- Key: HDFS-3002 URL: https://issues.apache.org/jira/browse/HDFS-3002 Project: Hadoop HDFS Issue Type: Improvement Components: test Affects Versions: 0.23.0, 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Priority: Trivial Attachments: HDFS-3002.patch With the older metrics framework, the namenode metrics were updated by the replication thread. This required the test to wait for the replication interval. This is no longer necessary with the metrics2 framework.
[jira] [Commented] (HDFS-2971) some improvements to the manual NN metadata recovery tools
[ https://issues.apache.org/jira/browse/HDFS-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13213870#comment-13213870 ] Suresh Srinivas commented on HDFS-2971: --- Sorry, I still have a hard time understanding how this is useful. How is printing the highest generation stamp of use? Could you please describe what problem this solves. Large gaps in the generation stamp should not happen. But even detecting that, and the minimum gap to look for, etc., does not seem useful. Do you mean transaction ID instead of generation stamp? some improvements to the manual NN metadata recovery tools -- Key: HDFS-2971 URL: https://issues.apache.org/jira/browse/HDFS-2971 Project: Hadoop HDFS Issue Type: Improvement Components: libhdfs Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: 2012-02-17_0001-OEV-enhancements-pt.-2.patch, HDFS-2971__print_highest_generation_stamp.txt Some improvements to the manual NN metadata recovery tools. Specifically, we want the Offline Edit Viewer (oev) tool to print out the highest generation stamp that was encountered when processing the edit log. We also want OEV to look for large gaps in the generation stamp, as these can indicate corruption. The minimum gap to look for should be configurable with -G or --genStampGap.
[jira] [Commented] (HDFS-2971) some improvements to the manual NN metadata recovery tools
[ https://issues.apache.org/jira/browse/HDFS-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13213943#comment-13213943 ] Suresh Srinivas commented on HDFS-2971: --- bq. The highest gen stamp in an image or log tells you if it's older than another file system. Do you mean to say that, when you have multiple images/edits, the one with the highest gen stamp is the latest? Shouldn't the transaction ID introduced in HDFS-1073 be used for this purpose? Or are you planning to do this in an older release? If so, updating the target version would help. bq. You don't want to update to saved image/logs that are old as that can cause data loss (you start the NN and it starts removing blocks with a new GS). Images are never updated. For the editlog, assuming it is what I said previously, loading an older image + older editlog pair would result in the system coming up with incomplete data, and hence loss of data is expected. So, for the proposed tools, is this something an operator would use to determine the latest edits/image? I fail to understand who will use this and how. some improvements to the manual NN metadata recovery tools -- Key: HDFS-2971 URL: https://issues.apache.org/jira/browse/HDFS-2971 Project: Hadoop HDFS Issue Type: Improvement Components: libhdfs Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: 2012-02-17_0001-OEV-enhancements-pt.-2.patch, HDFS-2971__print_highest_generation_stamp.txt Some improvements to the manual NN metadata recovery tools. Specifically, we want the Offline Edit Viewer (oev) tool to print out the highest generation stamp that was encountered when processing the edit log. We also want OEV to look for large gaps in the generation stamps, as these can indicate corruption. The minimum gap to look for should be configurable with -G or --genStampGap.
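The gap detection discussed above can be sketched in a few lines. This is an illustrative model, not the actual OEV code; the class and method names, and the idea of returning gap positions, are assumptions for the sketch:

```java
import java.util.ArrayList;
import java.util.List;

public class GenStampGapScanner {
    // Scans a sequence of generation stamps (as recorded while processing
    // an edit log), prints the highest stamp seen, and returns the indices
    // where consecutive stamps jump by at least minGap (possible corruption).
    public static List<Integer> findGaps(long[] genStamps, long minGap) {
        List<Integer> gapPositions = new ArrayList<>();
        long highest = Long.MIN_VALUE;
        for (int i = 0; i < genStamps.length; i++) {
            if (i > 0 && genStamps[i] - genStamps[i - 1] >= minGap) {
                gapPositions.add(i);  // large jump between consecutive stamps
            }
            highest = Math.max(highest, genStamps[i]);
        }
        System.out.println("Highest generation stamp: " + highest);
        return gapPositions;
    }

    public static void main(String[] args) {
        long[] stamps = {1001, 1002, 1003, 2000, 2001};
        System.out.println("Gaps at indices: " + findGaps(stamps, 100));
    }
}
```

The minGap parameter corresponds to the -G / --genStampGap threshold the description proposes making configurable.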
[jira] [Commented] (HDFS-2978) The NameNode should expose name dir statuses as metrics
[ https://issues.apache.org/jira/browse/HDFS-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13212680#comment-13212680 ] Suresh Srinivas commented on HDFS-2978: --- There is already JMX- and HTTP-based access to the JMX interfaces available for getting this information. Most of the information in the NN web UI is available through these interfaces. The NameNode should expose name dir statuses as metrics --- Key: HDFS-2978 URL: https://issues.apache.org/jira/browse/HDFS-2978 Project: Hadoop HDFS Issue Type: New Feature Components: name-node Affects Versions: 0.23.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers We currently display this info on the NN web UI, so users who wish to monitor this must either do it manually or parse HTML. We should publish this information as proper metrics.
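The HTTP access to JMX mentioned in the comment is the NameNode's JMX JSON servlet. A minimal one-liner, where the hostname is a placeholder for your deployment and 50070 is the classic NN HTTP port:

```shell
# Fetch NameNode JMX beans as JSON; the optional qry parameter narrows
# the result to a single MBean. Host and port are deployment-specific.
curl -s 'http://namenode.example.com:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo'
```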
[jira] [Commented] (HDFS-2907) Make FSDataset in Datanode Pluggable
[ https://issues.apache.org/jira/browse/HDFS-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13213082#comment-13213082 ] Suresh Srinivas commented on HDFS-2907: --- Comments: # DFSConfigKeys.java - move the newly added key to the section with keys with no defaults # Some lines in Datanode.java are more than 80 columns # I would have preferred to have a method initStorage(factory) and move the code from initBlockPool(). # FSDatasetInterface.java - remove the unnecessary cast to Factory # SimulatedFSDataset.setStimulatedFSDataset - the method name could be setFactory? # Since you are changing SimulatedFSDataset.java - can you please change getBlockReport(), where we check for map != null after using map. # Do not throw IOException from TestSimulatedFSDataset#getSimulatedFSDataset() Make FSDataset in Datanode Pluggable Key: HDFS-2907 URL: https://issues.apache.org/jira/browse/HDFS-2907 Project: Hadoop HDFS Issue Type: Improvement Reporter: Sanjay Radia Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Attachments: h2907_20120216.patch, h2907_20120217.patch
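The pluggability under review hinges on a factory hook: the datanode obtains its dataset through a factory so tests can substitute a simulated implementation. A minimal sketch of that pattern, with illustrative names rather than the actual FSDatasetInterface code:

```java
public class PluggableDatasetSketch {
    public interface Dataset { String name(); }

    // Factory hook: the datanode creates its dataset through this,
    // so configuration can swap in a simulated implementation for tests.
    public interface Factory { Dataset create(); }

    public static class OnDiskDataset implements Dataset {
        public String name() { return "on-disk"; }
    }

    public static class SimulatedDataset implements Dataset {
        public String name() { return "simulated"; }
    }

    // Mirrors the suggested initStorage(factory): the call site works
    // purely against the interface, so no cast to Factory is needed.
    public static Dataset initStorage(Factory factory) {
        return factory.create();
    }
}
```

Centralizing creation in initStorage(factory), as the review comment suggests, keeps initBlockPool() free of construction details.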
[jira] [Commented] (HDFS-2971) some improvements to the manual NN metadata recovery tools
[ https://issues.apache.org/jira/browse/HDFS-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13213362#comment-13213362 ] Suresh Srinivas commented on HDFS-2971: --- Colin, can you please add a little more description of what the enhancements are? some improvements to the manual NN metadata recovery tools -- Key: HDFS-2971 URL: https://issues.apache.org/jira/browse/HDFS-2971 Project: Hadoop HDFS Issue Type: Improvement Components: libhdfs Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: 2012-02-17_0001-OEV-enhancements-pt.-2.patch, HDFS-2971__print_highest_generation_stamp.txt some improvements to the manual NN metadata recovery tools
[jira] [Commented] (HDFS-2955) HA: IllegalStateException during standby startup in getCurSegmentTxId
[ https://issues.apache.org/jira/browse/HDFS-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209555#comment-13209555 ] Suresh Srinivas commented on HDFS-2955: --- @aaron - can you please add the test, which should have been part of HDFS-2943? I believe this patch is good to go without tests, because it is a simple change. Should we open another jira related to 2943, or perhaps reopen 2943? HA: IllegalStateException during standby startup in getCurSegmentTxId - Key: HDFS-2955 URL: https://issues.apache.org/jira/browse/HDFS-2955 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node Affects Versions: HA branch (HDFS-1623) Reporter: Hari Mankude Assignee: Hari Mankude Attachments: HDFS-2955-HDFS-1623.patch, HDFS-2955-HDFS-1623.patch During standby restarts, a new routine getTransactionsSinceLastLogRoll() has been introduced for metrics, which calls getCurSegmentTxId(). The checkState() in getCurSegmentTxId() assumes that the log is open for writing, and this is not the case on the standby.
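The failure mode above is a precondition check firing on a standby where no edit log segment is open for writing. A toy sketch of the shape of the fix, with illustrative names and fields rather than the actual FSEditLog code: the metrics path guards the state itself instead of tripping the checkState:

```java
public class EditLogSketch {
    private boolean openForWrite = false;  // false on a standby at startup
    private long curSegmentTxId = -1;
    private long lastWrittenTxId = 0;

    // Mirrors the checkState assumption: only valid while a segment
    // is open for writing; otherwise throws IllegalStateException.
    public long getCurSegmentTxId() {
        if (!openForWrite) {
            throw new IllegalStateException("log is not open for write");
        }
        return curSegmentTxId;
    }

    // Metrics-friendly variant: safe to call on a standby, where no
    // segment is open, instead of delegating blindly to the method above.
    public long getTransactionsSinceLastLogRoll() {
        if (!openForWrite) {
            return 0;  // nothing written since the last roll
        }
        return lastWrittenTxId - curSegmentTxId + 1;
    }
}
```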
[jira] [Commented] (HDFS-2938) Recursive delete of a large directory makes namenode unresponsive
[ https://issues.apache.org/jira/browse/HDFS-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208827#comment-13208827 ] Suresh Srinivas commented on HDFS-2938: --- Thanks for taking care of the nits from the original code. +1 for the patch. Recursive delete of a large directory makes namenode unresponsive - Key: HDFS-2938 URL: https://issues.apache.org/jira/browse/HDFS-2938 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.22.0 Reporter: Suresh Srinivas Assignee: Hari Mankude Attachments: HDFS-2938.patch, HDFS-2938.patch When deleting a large directory with millions of files, the namenode holds the FSNamesystem lock, making it unresponsive to other requests. For this scenario HDFS-173 added a mechanism to delete blocks in smaller chunks while holding the lock. With the new read/write lock changes, the mechanism from HDFS-173 was lost and needs to be resurrected. Also, a good unit test (or an update to an existing unit test) is needed to catch future errors in this functionality.
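The chunked-deletion mechanism this patch resurrects can be modeled simply: delete in small batches and release the write lock between batches so one huge delete cannot starve every other request. A simplified sketch, not the FSNamesystem code; the batch size and names are illustrative:

```java
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ChunkedBlockDeleter {
    static final int BLOCK_DELETION_INCREMENT = 1000;  // illustrative batch size
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Deletes blocks in batches of BLOCK_DELETION_INCREMENT, dropping the
    // write lock between batches so readers and other writers get a turn.
    public int deleteInChunks(List<Long> blocks) {
        int deleted = 0;
        int i = 0;
        while (i < blocks.size()) {
            lock.writeLock().lock();
            try {
                int end = Math.min(i + BLOCK_DELETION_INCREMENT, blocks.size());
                for (; i < end; i++) {
                    deleted++;  // stand-in for invalidating one block
                }
            } finally {
                lock.writeLock().unlock();  // other requests can run here
            }
        }
        return deleted;
    }
}
```

The design trade-off is latency of the big delete versus responsiveness of everything else; holding the lock for the whole delete is what made the namenode unresponsive.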
[jira] [Commented] (HDFS-2937) HA: TestDFSHAAdmin needs tests with MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13207967#comment-13207967 ] Suresh Srinivas commented on HDFS-2937: --- Brandon, the Mockito tests already test client-side parameter validation. It would be good to only include the end-to-end testing with a real cluster. HA: TestDFSHAAdmin needs tests with MiniDFSCluster -- Key: HDFS-2937 URL: https://issues.apache.org/jira/browse/HDFS-2937 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, test Affects Versions: HA branch (HDFS-1623) Reporter: Suresh Srinivas Assignee: Brandon Li Attachments: HDFS-2937.HDFS-1623.patch TestDFSHAAdmin currently works with a Mockito-based HAServiceProtocol. Tests are needed with real namenodes using MiniDFSCluster.
[jira] [Commented] (HDFS-2815) Namenode is not coming out of safemode when we perform ( NN crash + restart ) . Also FSCK report shows blocks missed.
[ https://issues.apache.org/jira/browse/HDFS-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13207463#comment-13207463 ] Suresh Srinivas commented on HDFS-2815: --- Changes are only required for this issue, though it may require quite a bit of HDFS-173. Namenode is not coming out of safemode when we perform ( NN crash + restart ) . Also FSCK report shows blocks missed. -- Key: HDFS-2815 URL: https://issues.apache.org/jira/browse/HDFS-2815 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.22.0, 0.24.0, 0.23.1, 1.0.0, 1.1.0 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Priority: Critical Fix For: 0.24.0, 0.23.2 Attachments: HDFS-2815.patch, HDFS-2815.patch When testing HA (internal) with continuous switches at roughly 5-minute gaps, we found some *blocks missed* and the namenode went into safemode after the next switch. After analysis, I found that these files were already deleted by clients, but I don't see any delete command logs in the namenode log files. Yet the namenode had added those blocks to invalidateSets and the DNs deleted the blocks. On restart, the namenode went into safemode, expecting more blocks in order to come out of safemode. The reason could be that the file is deleted in memory and the blocks added into invalidates, and only after this does the NN try to sync the edits into the editlog file. By that time the NN has asked the DNs to delete those blocks. Now the namenode shuts down before persisting to the editlogs (log behind). For this reason, we may not get the INFO logs about the delete, and when we restart the namenode (in my scenario it is again a switch), the namenode expects these deleted blocks too, as the delete request was not persisted into the editlog. I reproduced this scenario with debug points. *I feel we should not add the blocks to invalidates before persisting into the editlog*. Note: for the switch, we used kill -9 (force kill). I am currently on version 0.20.2. The same was verified in 0.23 as well, in a normal crash + restart scenario.
[jira] [Commented] (HDFS-2815) Namenode is not coming out of safemode when we perform ( NN crash + restart ) . Also FSCK report shows blocks missed.
[ https://issues.apache.org/jira/browse/HDFS-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13206257#comment-13206257 ] Suresh Srinivas commented on HDFS-2815: --- bq. Linking HDFS-173, the patch that added the problematic code. HDFS-173 is not the cause. Before HDFS-173, the sequence was: # Delete the directory, files, and blocks while holding the lock. This could trigger the deletion of blocks at the datanodes. # Then add the editlog entry outside the lock. As this jira's discussion demonstrates, if the NN crashes between the above steps, there is a possibility of block deletion on the DNs with no record of the deletion in the editlog. With HDFS-173, the behavior changed to: # Delete the directory, files, and blocks while holding the lock. This could trigger the deletion of blocks at the datanodes *if the number of blocks is small*. # Then add the editlog entry outside the lock. # *New change:* delete the blocks if the number of blocks is large. Note that the part Uma is talking about is from step 1 - still the old behavior. The patch now proposes deleting the blocks after recording the delete in the editlog - from step 3 of HDFS-173. I think this sounds fine. Namenode is not coming out of safemode when we perform ( NN crash + restart ) . Also FSCK report shows blocks missed. -- Key: HDFS-2815 URL: https://issues.apache.org/jira/browse/HDFS-2815 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.22.0, 0.24.0, 0.23.1, 1.0.0 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Priority: Critical Attachments: HDFS-2815.patch When testing HA (internal) with continuous switches at roughly 5-minute gaps, we found some *blocks missed* and the namenode went into safemode after the next switch. After analysis, I found that these files were already deleted by clients, but I don't see any delete command logs in the namenode log files. Yet the namenode had added those blocks to invalidateSets and the DNs deleted the blocks. On restart, the namenode went into safemode, expecting more blocks in order to come out of safemode. The reason could be that the file is deleted in memory and the blocks added into invalidates, and only after this does the NN try to sync the edits into the editlog file. By that time the NN has asked the DNs to delete those blocks. Now the namenode shuts down before persisting to the editlogs (log behind). For this reason, we may not get the INFO logs about the delete, and when we restart the namenode (in my scenario it is again a switch), the namenode expects these deleted blocks too, as the delete request was not persisted into the editlog. I reproduced this scenario with debug points. *I feel we should not add the blocks to invalidates before persisting into the editlog*. Note: for the switch, we used kill -9 (force kill). I am currently on version 0.20.2. The same was verified in 0.23 as well, in a normal crash + restart scenario.
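The fix under discussion is purely an ordering change: persist (and sync) the delete edit before scheduling block invalidation, so a crash can never leave the DNs deleting blocks the editlog knows nothing about. A toy model of the corrected sequence, with illustrative event names rather than the actual FSNamesystem calls:

```java
import java.util.ArrayList;
import java.util.List;

public class SafeDeleteOrdering {
    final List<String> events = new ArrayList<>();

    // Corrected ordering: the edit is durable before any datanode
    // can be told to delete a block.
    public void delete(String path) {
        events.add("remove-from-namespace:" + path); // in-memory removal under lock
        events.add("logSync");                       // persist the delete edit first
        events.add("add-to-invalidates:" + path);    // only now may DNs delete blocks
    }
}
```

With the buggy ordering, "add-to-invalidates" preceded "logSync", which is exactly the window this jira's crash scenario exploits.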
[jira] [Commented] (HDFS-2815) Namenode is not coming out of safemode when we perform ( NN crash + restart ) . Also FSCK report shows blocks missed.
[ https://issues.apache.org/jira/browse/HDFS-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13206259#comment-13206259 ] Suresh Srinivas commented on HDFS-2815: --- Uma, +1 for the patch. Please do remove the deleteNow var in the next version of the patch. Namenode is not coming out of safemode when we perform ( NN crash + restart ) . Also FSCK report shows blocks missed. -- Key: HDFS-2815 URL: https://issues.apache.org/jira/browse/HDFS-2815 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.22.0, 0.24.0, 0.23.1, 1.0.0 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Priority: Critical Attachments: HDFS-2815.patch When testing HA (internal) with continuous switches at roughly 5-minute gaps, we found some *blocks missed* and the namenode went into safemode after the next switch. After analysis, I found that these files were already deleted by clients, but I don't see any delete command logs in the namenode log files. Yet the namenode had added those blocks to invalidateSets and the DNs deleted the blocks. On restart, the namenode went into safemode, expecting more blocks in order to come out of safemode. The reason could be that the file is deleted in memory and the blocks added into invalidates, and only after this does the NN try to sync the edits into the editlog file. By that time the NN has asked the DNs to delete those blocks. Now the namenode shuts down before persisting to the editlogs (log behind). For this reason, we may not get the INFO logs about the delete, and when we restart the namenode (in my scenario it is again a switch), the namenode expects these deleted blocks too, as the delete request was not persisted into the editlog. I reproduced this scenario with debug points. *I feel we should not add the blocks to invalidates before persisting into the editlog*. Note: for the switch, we used kill -9 (force kill). I am currently on version 0.20.2. The same was verified in 0.23 as well, in a normal crash + restart scenario.
[jira] [Commented] (HDFS-2586) Add protobuf service and implementation for HAServiceProtocol
[ https://issues.apache.org/jira/browse/HDFS-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13206341#comment-13206341 ] Suresh Srinivas commented on HDFS-2586: --- Created a related jira HDFS-2937. Add protobuf service and implementation for HAServiceProtocol - Key: HDFS-2586 URL: https://issues.apache.org/jira/browse/HDFS-2586 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: HA branch (HDFS-1623) Attachments: HDFS-2586.txt, HDFS-2586.txt When the trunk moves to protobuf based RPC, HAServiceProtocol should have equivalent protobuf implementation.
[jira] [Commented] (HDFS-2586) Add protobuf service and implementation for HAServiceProtocol
[ https://issues.apache.org/jira/browse/HDFS-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205777#comment-13205777 ] Suresh Srinivas commented on HDFS-2586: --- Aaron, if we cannot commit based on unit tests, then the process is broken. I do not want to waste time bringing up a cluster and manually testing it, if possible. By the way, applying the same standard, the client failover code should not get committed at all, because we know it does not work with delegation tokens :-) Add protobuf service and implementation for HAServiceProtocol - Key: HDFS-2586 URL: https://issues.apache.org/jira/browse/HDFS-2586 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: HA branch (HDFS-1623) Attachments: HDFS-2586.txt, HDFS-2586.txt When the trunk moves to protobuf based RPC, HAServiceProtocol should have equivalent protobuf implementation.
[jira] [Commented] (HDFS-2586) Add protobuf service and implementation for HAServiceProtocol
[ https://issues.apache.org/jira/browse/HDFS-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205875#comment-13205875 ] Suresh Srinivas commented on HDFS-2586: --- Was that with security enabled or not? Add protobuf service and implementation for HAServiceProtocol - Key: HDFS-2586 URL: https://issues.apache.org/jira/browse/HDFS-2586 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: HA branch (HDFS-1623) Attachments: HDFS-2586.txt, HDFS-2586.txt When the trunk moves to protobuf based RPC, HAServiceProtocol should have equivalent protobuf implementation.
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13206019#comment-13206019 ] Suresh Srinivas commented on HDFS-2802: --- Jeff, that is the reason why HDFS-233 is marked as related. Can you please stop marking the jira as resolved! Support for RW/RO snapshots in HDFS --- Key: HDFS-2802 URL: https://issues.apache.org/jira/browse/HDFS-2802 Project: Hadoop HDFS Issue Type: New Feature Components: name-node Reporter: Hari Mankude Assignee: Hari Mankude Fix For: 0.24.0 Snapshots are point-in-time images of parts of the filesystem or the entire filesystem. Snapshots can be a read-only or a read-write point-in-time copy of the filesystem. There are several use cases for snapshots in HDFS. I will post a detailed write-up soon with more information.
[jira] [Commented] (HDFS-2887) Define a FSVolume interface
[ https://issues.apache.org/jira/browse/HDFS-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13203941#comment-13203941 ] Suresh Srinivas commented on HDFS-2887: --- The change makes the code much cleaner. Comments: # Currently there are several places in FSVolumeSet where FSVolumeInterface is cast to FSVolume. FSVolumeSet should continue to take FSVolume in the constructor. That way, outside FSDataset the only thing exposed is FSVolumeInterface; inside FSDataset, we could continue to use FSVolume? # In a separate jira perhaps we should add isDataScanSupported() and isDirectoryScanSupported() to FSDatasetInterface. Perhaps we could move starting the scanners into FSDataset altogether and add a method startScanners()? # When committing this change, it should be marked incompatible due to the BlockVolumeChoosingPolicy change Define a FSVolume interface --- Key: HDFS-2887 URL: https://issues.apache.org/jira/browse/HDFS-2887 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h2887_20120203.patch, h2887_20120207.patch FSVolume is an inner class in FSDataset. It is actually a part of the implementation of FSDatasetInterface. It is better to define a new interface, namely FSVolumeInterface, to capture the abstraction.
[jira] [Commented] (HDFS-2887) Define a FSVolume interface
[ https://issues.apache.org/jira/browse/HDFS-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13203953#comment-13203953 ] Suresh Srinivas commented on HDFS-2887: --- +1 for the patch. Define a FSVolume interface --- Key: HDFS-2887 URL: https://issues.apache.org/jira/browse/HDFS-2887 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h2887_20120203.patch, h2887_20120207.patch FSVolume is an inner class in FSDataset. It is actually a part of the implementation of FSDatasetInterface. It is better to define a new interface, namely FSVolumeInterface, to capture the abstraction.
[jira] [Commented] (HDFS-2911) Gracefully handle OutOfMemoryErrors
[ https://issues.apache.org/jira/browse/HDFS-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13204235#comment-13204235 ] Suresh Srinivas commented on HDFS-2911: --- bq. @Eli ... as Todd points out not all OOMs are unrecoverable ... bq. On the NN I'd rather see the critical threads all get uncaughtExceptionHandlers attached which abort the NN if they fail. So if an individual rpc handler OOMEs (eg by an invalid request making it try to allocate a 4G array or something) it won't take down the NN, whereas if the LeaseManager OOMEs it should. I think this may not be a good idea. In fact, I would say it is more important to shut down the NN when an RPC handler gets an OOME. Let's say an RPC handler updated the in-memory namespace and was about to add it to the editlog. The system was indeed running out of memory, and before the editlog could be written the handler got an OOME. If we do not shut down at this time, we could end up with interesting data corruption issues. Instead of trying to categorize which one is safe and not safe, we should use the kill -9 option. In cases where the OOME is caused by the system trying to create a large object, we could add appropriate size/limit checks. Gracefully handle OutOfMemoryErrors --- Key: HDFS-2911 URL: https://issues.apache.org/jira/browse/HDFS-2911 Project: Hadoop HDFS Issue Type: Improvement Components: data-node, name-node Affects Versions: 0.23.0, 1.0.0 Reporter: Eli Collins Assignee: Eli Collins We should gracefully handle j.l.OutOfMemoryError exceptions in the NN or DN. We should catch them in a high-level handler, cleanly fail the RPC (vs sending back the OOM stack trace) or background thread, and shut down the NN or DN. Currently the process is left in a not well-tested state (it continuously fails RPCs and internal threads, may or may not recover, and doesn't shut down gracefully).
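The uncaughtExceptionHandler mechanism debated above can be sketched concretely. This is a toy model of the "shut down on any Error" position, not Hadoop code: the handler and flag names are illustrative, and the process abort is modeled by a flag where a real daemon would do a hard exit:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class AbortOnError {
    static final AtomicBoolean aborted = new AtomicBoolean(false);

    // Default handler for ALL threads: any Error (including OOME)
    // escaping any thread triggers an abort, rather than trying to
    // decide per-thread which failures are recoverable.
    public static void install() {
        Thread.setDefaultUncaughtExceptionHandler((t, e) -> {
            if (e instanceof Error) {
                aborted.set(true);  // stand-in for Runtime.getRuntime().halt(1)
            }
        });
    }

    public static void main(String[] args) throws InterruptedException {
        install();
        // Simulate an RPC handler thread running out of memory.
        Thread worker = new Thread(() -> { throw new OutOfMemoryError("simulated"); });
        worker.start();
        worker.join();
        System.out.println("aborted=" + aborted.get());
    }
}
```

The argument in the comment is that after an OOME the heap state is untrustworthy (a half-applied namespace edit, for example), so a blanket abort is safer than categorizing threads.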
[jira] [Commented] (HDFS-2922) HA: close out operation categories
[ https://issues.apache.org/jira/browse/HDFS-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13204273#comment-13204273 ] Suresh Srinivas commented on HDFS-2922: --- I agree with Todd. We should treat setBalancerBandwidth() the same as refreshNodes(). Why fail over unnecessarily for now? HA: close out operation categories -- Key: HDFS-2922 URL: https://issues.apache.org/jira/browse/HDFS-2922 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: HA branch (HDFS-1623) Reporter: Eli Collins Assignee: Eli Collins Attachments: hdfs-2922.txt We need to close out the NN operation categories. The following operations should be left as is, ie not fail over, as it's reasonable to call these on a standby, and we just need to update the TODO with a comment: - {{setSafeMode}} (Might want to force the standby out of safemode) - {{restoreFailedStorage}} (Might want to tell the standby to restore the shared edits dir) - {{saveNamespace}}, {{metaSave}} (Could imagine calling these on a standby eg in a recovery scenario) - {{refreshNodes}} (Decommissioning needs to refresh the standby) The following operations should be checked for READ, as neither should need to be called on a standby; they will fail over unless stale reads are enabled: - {{getTransactionID}}, {{getEditLogManifest}} (we don't checkpoint the standby) The following operations should be checked for WRITE, as they should not be called on a standby, ie should always fail over: - {{finalizeUpgrade}}, {{distributedUpgradeProgress}} (should not be able to upgrade the standby) - {{setBalancerBandwidth}} (balancer should fail over)