[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions
[ https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042615#comment-13042615 ] Hadoop QA commented on HDFS-988: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481197/988-fixups.txt against trunk revision 1130381. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/684//console This message is automatically generated. > saveNamespace can corrupt edits log, apparently due to race conditions > -- > > Key: HDFS-988 > URL: https://issues.apache.org/jira/browse/HDFS-988 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.20-append, 0.21.0, 0.22.0 >Reporter: dhruba borthakur >Assignee: Eli Collins >Priority: Blocker > Fix For: 0.20-append, 0.22.0 > > Attachments: 988-fixups.txt, HDFS-988_fix_synchs.patch, > hdfs-988-2.patch, hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988-5.patch, > hdfs-988-b22-1.patch, hdfs-988.txt, saveNamespace.txt, > saveNamespace_20-append.patch > > > The administrator puts the namenode in safemode and then issues the > savenamespace command. This can corrupt the edits log. The problem is that > when the NN enters safemode, there could still be pending logSyncs occurring > from other threads. Now, the saveNamespace command, when executed, would save > an edits log with partial writes. I have seen this happen on 0.20. > https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853 -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions
[ https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-988: - Attachment: 988-fixups.txt

Attaching a few fixups on top of hdfs-988-5.patch, related to the comments below.

Regarding the question about computeDatanodeWork/heartbeatCheck: computeDatanodeWork calls blockManager.computeReplicationWork and blockManager.computeInvalidateWork. In the case of computeReplicationWork, it might schedule some replications. This seems OK - worst case we get some extra replicas, which will get fixed up later. In the case of computeInvalidateWork, it calls invalidateWorkForOneNode, which takes the write lock and then checks safe mode before scheduling any deletions. In heartbeatCheck, I think we can simply put another "if (isInSafeMode()) return" in right after it takes the writeLock if it finds a dead node. That way if it races, it still doesn't take any actions based on it. Either way, I don't think this could corrupt anything since it won't write to the edit log.

Some other notes:
- isLockedReadOrWrite should be checking this.fsLock.getReadHoldCount() rather than getReadLockCount().
- FSDirectory#bLock says it protects the block map, but it also protects the directory, right? We should update the comment and perhaps the name.
- Various functions don't take the read lock because they call functions in FSDirectory that take FSDirectory.bLock. This seems incorrect, since, for example, getListing() racing against open() with overwrite=true could return the directory with the old file deleted but the new one not there yet. What's confusing me is that it's not clear why some functions don't need the readLock when they perform read operations. When is just the FSDirectory lock sufficient? It looks like a lot of the test failures above are due to this.
- handleHeartbeat calls getDatanode() while only holding locks on heartbeats and datanodeMap, but registerDatanode mutates datanodeMap without locking either.
- getDataNodeInfo seems like an unused function with no locking - can we remove it?
- Several other places access datanodeMap with synchronization on that object itself. unprotectedAddDatanode should assert it holds that monitor lock.
- When loading the edit log, why doesn't loadFSEdits take a write lock on the namesystem before it starts? Then we could add all of the asserts and not worry about it.
- It looks like saving the image no longer works, since saveFilesUnderConstruction now takes the readLock, but it's being called by a different thread than the one that took the write lock in saveNamespace. So, it deadlocks. At first I thought this could be solved by just making saveNamespace take a read lock instead of a write lock, but that actually doesn't work due to fairness -- what can happen is that saveNamespace takes the readLock, then some other thread comes along and queues up for the write lock. At that point, no further readers are allowed to take the read lock, because it's a fair lock. So, the image-writer thread locks up.

Optimizations to address later:
- When create() is called with the overwrite flag true, it calls delete(), which will logSync() while holding the lock. We can hold off on fixing this since it's a performance problem, not a correctness problem, and the operation is fairly rare.
- getAdditionalBlock doesn't logSync() - I think there's another issue pending about that since it will affect HA. Let's address it later.
- checkFileProgress doesn't really need the write lock.
- It seems like saveNamespace could safely just take the read lock to allow other readers to keep working.

Nits:
- Typo: "Cnnot concat"
- rollEditLog has a comment saying "Checkpoint not created".
- rollFSImage has the same issue, but at least it has to do with checkpoints, so it could be correct.
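The fair-lock starvation described in the saveNamespace deadlock above can be reproduced with a plain fair ReentrantReadWriteLock, outside of any Hadoop code. A minimal illustrative sketch (class and method names are made up for this demo, not taken from the patch):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative sketch (not NameNode code): with a *fair* read-write lock,
// a writer queued behind an active reader blocks all later read-lock
// attempts from other threads -- so a second "image-writer" thread can
// never get the read lock while the first reader holds it.
public class FairLockDemo {
  public static boolean secondReaderCanEnter() throws Exception {
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true); // fair
    lock.readLock().lock(); // the "saveNamespace" thread takes the read lock

    Thread writer = new Thread(() -> lock.writeLock().lock()); // a writer queues up
    writer.setDaemon(true);
    writer.start();
    while (!lock.hasQueuedThreads()) {
      Thread.sleep(1); // wait until the writer is actually parked in the queue
    }

    // A second thread (the "image-writer") now tries to take the read lock.
    // A timed tryLock honors fairness, so it waits behind the queued writer
    // and times out.
    final boolean[] got = new boolean[1];
    Thread imageWriter = new Thread(() -> {
      try {
        got[0] = lock.readLock().tryLock(100, TimeUnit.MILLISECONDS);
      } catch (InterruptedException ignored) {
      }
    });
    imageWriter.start();
    imageWriter.join();
    return got[0]; // false: starved behind the queued writer
  }
}
```

This is why taking the read lock in saveNamespace does not avoid the deadlock: fairness makes late readers queue behind any waiting writer.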
[jira] [Commented] (HDFS-2023) Backport of NPE for File.list and File.listFiles
[ https://issues.apache.org/jira/browse/HDFS-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042604#comment-13042604 ] Eli Collins commented on HDFS-2023: --- I don't feel strongly. It's easier for users if the same issue is represented by a single jira across versions (you'll see a patch for different branches on the same jira) but if the content is different (not the patch but different goal/change) then a new jira makes sense. > Backport of NPE for File.list and File.listFiles > > > Key: HDFS-2023 > URL: https://issues.apache.org/jira/browse/HDFS-2023 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.20.205.0 >Reporter: Bharath Mundlapudi >Assignee: Bharath Mundlapudi > Fix For: 0.20.205.0 > > Attachments: HDFS-2023-1.patch > > > Since we have multiple Jira's in trunk for common and hdfs, I am creating > another jira for this issue. > This patch addresses the following: > 1. Provides FileUtil API for list and listFiles which throws IOException for > null cases. > 2. Replaces most of the code where JDK file API with FileUtil API. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
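The change described in the issue, replacing raw File.list/listFiles calls (which return null on I/O error) with an API that throws, can be sketched as follows. The method name and message here are assumptions for illustration, not necessarily the exact FileUtil signature in the patch:

```java
import java.io.File;
import java.io.IOException;

// Illustrative sketch of the wrapper the patch describes: File.listFiles()
// returns null both for non-directories and on I/O errors, which callers
// routinely forget to check. Converting null into an IOException makes the
// failure explicit instead of surfacing later as an NPE.
public class ListFilesSketch {
  public static File[] listFiles(File dir) throws IOException {
    File[] files = dir.listFiles();
    if (files == null) {
      throw new IOException("Invalid directory or I/O error occurred for dir: " + dir);
    }
    return files;
  }
}
```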
[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions
[ https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042600#comment-13042600 ] Hadoop QA commented on HDFS-988: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481173/hdfs-988-5.patch against trunk revision 1130339. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 18 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.server.namenode.TestCheckPointForSecurityTokens org.apache.hadoop.hdfs.server.namenode.TestCheckpoint org.apache.hadoop.hdfs.server.namenode.TestEditLogRace org.apache.hadoop.hdfs.server.namenode.TestParallelImageWrite org.apache.hadoop.hdfs.server.namenode.TestSaveNamespace org.apache.hadoop.hdfs.server.namenode.TestStartup org.apache.hadoop.hdfs.TestDFSFinalize org.apache.hadoop.hdfs.TestDFSRollback org.apache.hadoop.hdfs.TestDFSStartupVersions org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestDFSUpgradeFromImage org.apache.hadoop.hdfs.TestDFSUpgrade org.apache.hadoop.hdfs.TestListFilesInDFS org.apache.hadoop.hdfs.TestListFilesInFileContext org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. 
Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/679//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/679//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/679//console This message is automatically generated. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1968) Enhance TestWriteRead to support File Append and Position Read
[ https://issues.apache.org/jira/browse/HDFS-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] CW Chung updated HDFS-1968: --- Attachment: TestWriteRead.patch Only one patch is allowed. The formatting part was taken care of by HDFS-2024. This patch contains material changes only. > Enhance TestWriteRead to support File Append and Position Read > --- > > Key: HDFS-1968 > URL: https://issues.apache.org/jira/browse/HDFS-1968 > Project: Hadoop HDFS > Issue Type: Test > Components: test >Affects Versions: 0.23.0 >Reporter: CW Chung >Assignee: CW Chung >Priority: Minor > Attachments: TestWriteRead-1-Format.patch, > TestWriteRead-2-Append.patch, TestWriteRead.patch, TestWriteRead.patch, > TestWriteRead.patch, TestWriteRead.patch > > > Desirable to enhance TestWriteRead to support command line options to do: > (1) File Append > (2) Position Read (currently supporting sequential read). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions
[ https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042591#comment-13042591 ] Hadoop QA commented on HDFS-988: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481191/hdfs-988-b22-1.patch against trunk revision 1130381. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 15 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/683//console This message is automatically generated. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions
[ https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-988: - Attachment: hdfs-988-b22-1.patch Minimal patch for branch 22 with tests attached. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1149) Lease reassignment is not persisted to edit log
[ https://issues.apache.org/jira/browse/HDFS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042578#comment-13042578 ] Todd Lipcon commented on HDFS-1149: --- A few nits:
- For DataNode.setHeartbeatsEnabled, I think it would be better to make it package-private, and then bounce through the "DataNodeAdapter" class to get at it. I also think it would be clearer if we inverted its meaning and renamed it to {{heartbeatsDisabledForTests}} - that way, when reading the code later, it will be clear that this is always false in normal operation.
- Same goes for all of the new public members in LeaseManager/Lease -- I think you can just move the getLeaseByPath function into NameNodeAdapter, then it can all stay package-protected, right?
- In the test case, I think it's better to call {{stm.hflush()}} after the writer has lost its lease -- this is a DN-only operation, which means that it's verifying that the lease recovery has gone all the way through, not just an NN state change. The fact that you check isUnderConstruction should already do that as well, but it's just a double-check. Then you can close the stream as well and check for the same exception.
- I think the new NAMENODE_LEASE_MANAGER_SLEEP_TIME is probably better named NAMENODE_LEASE_RECHECK_INTERVAL (more consistent with other variables like {{heartbeatRecheckInterval}} and {{replicationRecheckInterval}}).

Other concern:
- Does this interact correctly with lease maintenance on rename/delete? I think so, but it would be good to add the following tests:

Test A:
1) client creates file /dir_a/file and leaves it open
2) client renames /dir_a to /dir_b (this calls LeaseManager.changeLease)
3) client dies, so lease recovery happens
4) NN reassigns lease to NN_Recovery
5) NN restarts and loads edits: NN_Recovery should own the lease on the new location of the file
[ this tests that on edit log replay, the lease is properly tracked to the new name of the file ]

Test B:
1) client creates file /file and leaves it open
2) client deletes file /file
3) client dies, so lease recovery happens
4) NN reassigns lease to NN_Recovery
5) NN restarts and loads edits: no NPEs or anything

I'm also wondering if we have an issue with regard to safe mode. In theory we should never write anything to the edit log while in safe mode, but I don't see safe-mode checks in internalReleaseLease. This is similar to the bugs seen in HDFS-988 if you want some background.

> Lease reassignment is not persisted to edit log > --- > > Key: HDFS-1149 > URL: https://issues.apache.org/jira/browse/HDFS-1149 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.21.0, 0.22.0, 0.23.0 >Reporter: Todd Lipcon >Assignee: Aaron T. Myers > Fix For: 0.23.0 > > Attachments: hdfs-1149.0.patch > > > During lease recovery, the lease gets reassigned to a special NN holder. This > is not currently persisted to the edit log, which means that after an NN > restart, the original leaseholder could end up allocating more blocks or > completing a file that has already started recovery. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
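The safe-mode concern above (like the "if (isInSafeMode()) return" recheck discussed on HDFS-988) boils down to re-testing the flag after the lock is held, since the flag can flip between the unlocked check and the lock acquisition. A minimal non-Hadoop sketch of that pattern, with all names assumed:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative sketch (not NameNode code) of the recheck-under-lock pattern:
// a safe-mode flag observed before taking the write lock may have changed by
// the time the lock is held, so the mutation path must test it again under
// the lock before writing anything to the "edit log".
public class SafeModeGuard {
  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock(true);
  private volatile boolean inSafeMode = false;
  private int editsLogged = 0; // stands in for the real edit log

  public void setSafeMode(boolean on) { inSafeMode = on; }

  /** Returns true if an edit was actually logged. */
  public boolean logEditUnlessSafe() {
    fsLock.writeLock().lock();
    try {
      if (inSafeMode) {
        return false; // recheck under the lock: never log edits in safe mode
      }
      editsLogged++; // the actual mutation / edit-log write
      return true;
    } finally {
      fsLock.writeLock().unlock();
    }
  }

  public int editsLogged() { return editsLogged; }
}
```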
[jira] [Commented] (HDFS-2014) RPM packages broke bin/hdfs script
[ https://issues.apache.org/jira/browse/HDFS-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042577#comment-13042577 ] Hadoop QA commented on HDFS-2014: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481182/HDFS-2014-1.patch against trunk revision 1130339. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestDFSUpgradeFromImage org.apache.hadoop.hdfs.TestHDFSTrash +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/682//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/682//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/682//console This message is automatically generated. 
> RPM packages broke bin/hdfs script > -- > > Key: HDFS-2014 > URL: https://issues.apache.org/jira/browse/HDFS-2014 > Project: Hadoop HDFS > Issue Type: Bug > Components: scripts >Affects Versions: 0.23.0 >Reporter: Todd Lipcon >Assignee: Eric Yang >Priority: Critical > Fix For: 0.23.0 > > Attachments: HDFS-2014-1.patch, HDFS-2014.patch > > > bin/hdfs now appears to depend on ../libexec, which doesn't exist inside of a > source checkout: > todd@todd-w510:~/git/hadoop-hdfs$ ./bin/hdfs namenode > ./bin/hdfs: line 22: > /home/todd/git/hadoop-hdfs/bin/../libexec/hdfs-config.sh: No such file or > directory > ./bin/hdfs: line 138: cygpath: command not found > ./bin/hdfs: line 161: exec: : not found -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2024) Eclipse format HDFS Junit test hdfs/TestWriteRead.java
[ https://issues.apache.org/jira/browse/HDFS-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-2024: - Resolution: Fixed Fix Version/s: 0.23.0 Status: Resolved (was: Patch Available) I have committed this. Thanks, CW! > Eclipse format HDFS Junit test hdfs/TestWriteRead.java > --- > > Key: HDFS-2024 > URL: https://issues.apache.org/jira/browse/HDFS-2024 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Reporter: CW Chung >Assignee: CW Chung >Priority: Trivial > Fix For: 0.23.0 > > Attachments: TestWriteRead-2024.patch > > > Eclipse format the file src/test/../hdfs/TestWriteRead.java. This is in > preparation of HDFS-1968. > So the patch should have only formatting changes such as white space. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2024) Eclipse format HDFS Junit test hdfs/TestWriteRead.java
[ https://issues.apache.org/jira/browse/HDFS-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042574#comment-13042574 ] Hadoop QA commented on HDFS-2024: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481175/TestWriteRead-2024.patch against trunk revision 1130339. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestHDFSTrash +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/681//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/681//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/681//console This message is automatically generated. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1149) Lease reassignment is not persisted to edit log
[ https://issues.apache.org/jira/browse/HDFS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042572#comment-13042572 ] Hadoop QA commented on HDFS-1149: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481177/hdfs-1149.0.patch against trunk revision 1130339. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestDFSUpgradeFromImage org.apache.hadoop.hdfs.TestHDFSTrash org.apache.hadoop.hdfs.TestHFlush org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/680//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/680//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/680//console This message is automatically generated. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1923) Intermittent recurring failure in TestFiDataTransferProtocol2.pipeline_Fi_29
[ https://issues.apache.org/jira/browse/HDFS-1923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042567#comment-13042567 ] Tsz Wo (Nicholas), SZE commented on HDFS-1923: -- Todd, so do you think the patch is good? > Intermittent recurring failure in TestFiDataTransferProtocol2.pipeline_Fi_29 > > > Key: HDFS-1923 > URL: https://issues.apache.org/jira/browse/HDFS-1923 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: test >Reporter: Matt Foley >Assignee: Tsz Wo (Nicholas), SZE > Attachments: h1923_20110527.patch > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2024) Eclipse format HDFS Junit test hdfs/TestWriteRead.java
[ https://issues.apache.org/jira/browse/HDFS-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-2024: - Hadoop Flags: [Reviewed] +1 patch looks good. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1966) Encapsulate individual DataTransferProtocol op header
[ https://issues.apache.org/jira/browse/HDFS-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-1966: - Resolution: Fixed Fix Version/s: 0.23.0 Release Note: Added header classes for individual DataTransferProtocol op headers. Hadoop Flags: [Incompatible change, Reviewed] Status: Resolved (was: Patch Available) I have committed this. > Encapsulate individual DataTransferProtocol op header > - > > Key: HDFS-1966 > URL: https://issues.apache.org/jira/browse/HDFS-1966 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: data-node, hdfs client >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Fix For: 0.23.0 > > Attachments: h1966_20110519.patch, h1966_20110524.patch, > h1966_20110526.patch, h1966_20110527b.patch > > > It will make a clear distinction between the variables used in the protocol > and the others. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2020) TestDFSUpgradeFromImage fails
[ https://issues.apache.org/jira/browse/HDFS-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-2020: -- Resolution: Fixed Fix Version/s: 0.23.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) After looping for 15 minutes I saw no failures, where I could get it to fail regularly without the patch. Committed to trunk. Thanks, Suresh! > TestDFSUpgradeFromImage fails > - > > Key: HDFS-2020 > URL: https://issues.apache.org/jira/browse/HDFS-2020 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node, test >Affects Versions: 0.23.0 >Reporter: Suresh Srinivas >Assignee: Suresh Srinivas > Fix For: 0.23.0 > > Attachments: HDFS-2020.patch, log.txt > > > Datanode has a singleton datanodeObject. When running MiniDFSCluster with > multiple datanodes, the singleton can point to only one of the datanodes. > TestDFSUpgradeFromImage fails related to initialization of this singleton. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
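The singleton problem described in HDFS-2020 can be shown with a few lines of plain Java. The class here is hypothetical, standing in for the Datanode's static datanodeObject field, and is not code from the patch:

```java
// Illustrative sketch (names assumed, not Hadoop code): a static "singleton"
// field can only track one instance, so starting several nodes in one JVM
// (as MiniDFSCluster does with multiple datanodes) leaves the field pointing
// at whichever instance was constructed last.
public class SingletonPitfall {
  static SingletonPitfall instance; // the problematic singleton reference
  final String name;

  SingletonPitfall(String name) {
    this.name = name;
    instance = this; // each construction silently overwrites the previous one
  }
}
```

Any code that reaches the node through the static field will therefore observe only the last-started instance, which is consistent with the initialization failures seen in TestDFSUpgradeFromImage.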
[jira] [Updated] (HDFS-2024) Eclipse format HDFS Junit test hdfs/TestWriteRead.java
[ https://issues.apache.org/jira/browse/HDFS-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-2024: - Status: Patch Available (was: Open) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2014) RPM packages broke bin/hdfs script
[ https://issues.apache.org/jira/browse/HDFS-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated HDFS-2014: Attachment: HDFS-2014-1.patch Restore to HADOOP_HDFS_HOME for developer. > RPM packages broke bin/hdfs script > -- > > Key: HDFS-2014 > URL: https://issues.apache.org/jira/browse/HDFS-2014 > Project: Hadoop HDFS > Issue Type: Bug > Components: scripts >Affects Versions: 0.23.0 >Reporter: Todd Lipcon >Assignee: Eric Yang >Priority: Critical > Fix For: 0.23.0 > > Attachments: HDFS-2014-1.patch, HDFS-2014.patch > > > bin/hdfs now appears to depend on ../libexec, which doesn't exist inside of a > source checkout: > todd@todd-w510:~/git/hadoop-hdfs$ ./bin/hdfs namenode > ./bin/hdfs: line 22: > /home/todd/git/hadoop-hdfs/bin/../libexec/hdfs-config.sh: No such file or > directory > ./bin/hdfs: line 138: cygpath: command not found > ./bin/hdfs: line 161: exec: : not found -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2020) TestDFSUpgradeFromImage fails
[ https://issues.apache.org/jira/browse/HDFS-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042539#comment-13042539 ] Todd Lipcon commented on HDFS-2020: --- patch looks pretty good. Let's see what Hudson thinks. > TestDFSUpgradeFromImage fails > - > > Key: HDFS-2020 > URL: https://issues.apache.org/jira/browse/HDFS-2020 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node, test >Affects Versions: 0.23.0 >Reporter: Suresh Srinivas >Assignee: Suresh Srinivas > Attachments: HDFS-2020.patch, log.txt > > > Datanode has a singleton datanodeObject. When running MiniDFSCluster with > multiple datanodes, the singleton can point to only one of the datanodes. > TestDFSUpgradeFromImage fails related to initialization of this singleton. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2020) TestDFSUpgradeFromImage fails
[ https://issues.apache.org/jira/browse/HDFS-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042541#comment-13042541 ] Todd Lipcon commented on HDFS-2020: --- oh.. I was looking at an old tab where Hudson hadn't commented yet :) Hudson says +1, so I agree. Let me loop the test that was failing for a few minutes, then we'll commit if it all looks good. > TestDFSUpgradeFromImage fails > - > > Key: HDFS-2020 > URL: https://issues.apache.org/jira/browse/HDFS-2020 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node, test >Affects Versions: 0.23.0 >Reporter: Suresh Srinivas >Assignee: Suresh Srinivas > Attachments: HDFS-2020.patch, log.txt > > > Datanode has a singleton datanodeObject. When running MiniDFSCluster with > multiple datanodes, the singleton can point to only one of the datanodes. > TestDFSUpgradeFromImage fails related to initialization of this singleton. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1995) Minor modification to both dfsclusterhealth and dfshealth pages for Web UI
[ https://issues.apache.org/jira/browse/HDFS-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanping Wang updated HDFS-1995: --- Attachment: HDFS-1995.3.patch Ran test-patch and findbugs caught one warning. Removing an unread field: org.apache.hadoop.hdfs.server.namenode.ClusterJspHelper$NamenodeStatus.clusterDfsUsed > Minor modification to both dfsclusterhealth and dfshealth pages for Web UI > -- > > Key: HDFS-1995 > URL: https://issues.apache.org/jira/browse/HDFS-1995 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Tanping Wang >Assignee: Tanping Wang >Priority: Minor > Fix For: 0.23.0 > > Attachments: ClusterSummary-2.png, HDFS-1995.2.patch, > HDFS-1995.3.patch, HDFS-1995.patch, OneNN.png > > > Four small modifications/fixes: > on dfshealthpage: > 1) fix remaining% to be remaining / total ( it was mistaken as used / total) > on dfsclusterhealth page: > 1) makes the table header 8em wide > 2) fix the typo(inconsistency) Total Files and Blocks => Total Files and > Directories > 3) make the DFS Used = sum of block pool used space of every name space. And > change the label names accordingly. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1149) Lease reassignment is not persisted to edit log
[ https://issues.apache.org/jira/browse/HDFS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-1149: - Affects Version/s: 0.23.0 0.22.0 Fix Version/s: 0.23.0 > Lease reassignment is not persisted to edit log > --- > > Key: HDFS-1149 > URL: https://issues.apache.org/jira/browse/HDFS-1149 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.21.0, 0.22.0, 0.23.0 >Reporter: Todd Lipcon >Assignee: Aaron T. Myers > Fix For: 0.23.0 > > Attachments: hdfs-1149.0.patch > > > During lease recovery, the lease gets reassigned to a special NN holder. This > is not currently persisted to the edit log, which means that after an NN > restart, the original leaseholder could end up allocating more blocks or > completing a file that has already started recovery. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1149) Lease reassignment is not persisted to edit log
[ https://issues.apache.org/jira/browse/HDFS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-1149: - Status: Patch Available (was: Open) > Lease reassignment is not persisted to edit log > --- > > Key: HDFS-1149 > URL: https://issues.apache.org/jira/browse/HDFS-1149 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.21.0, 0.22.0, 0.23.0 >Reporter: Todd Lipcon >Assignee: Aaron T. Myers > Fix For: 0.23.0 > > Attachments: hdfs-1149.0.patch > > > During lease recovery, the lease gets reassigned to a special NN holder. This > is not currently persisted to the edit log, which means that after an NN > restart, the original leaseholder could end up allocating more blocks or > completing a file that has already started recovery. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1149) Lease reassignment is not persisted to edit log
[ https://issues.apache.org/jira/browse/HDFS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-1149: - Attachment: hdfs-1149.0.patch Patch which addresses the issue. I changed around the {{waitActive}} method of {{MiniDFSCluster}} such that it will work both on fresh NN starts and NN restarts. This consisted of moving some error handling code around the call to {{waitActive}} from {{restartNameNode}} into {{waitActive}} itself. > Lease reassignment is not persisted to edit log > --- > > Key: HDFS-1149 > URL: https://issues.apache.org/jira/browse/HDFS-1149 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.21.0, 0.22.0, 0.23.0 >Reporter: Todd Lipcon >Assignee: Aaron T. Myers > Fix For: 0.23.0 > > Attachments: hdfs-1149.0.patch > > > During lease recovery, the lease gets reassigned to a special NN holder. This > is not currently persisted to the edit log, which means that after an NN > restart, the original leaseholder could end up allocating more blocks or > completing a file that has already started recovery. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2023) Backport of NPE for File.list and File.listFiles
[ https://issues.apache.org/jira/browse/HDFS-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042531#comment-13042531 ] Matt Foley commented on HDFS-2023: -- Eli, I asked Bharath to make a separate Jira, because this set of changes isn't the same content as the previously existing Jiras. Granted he could split this into the same four chunks as represented by HADOOP-7342, HADOOP-7322, HDFS-1934, and HDFS-2019. But it seemed more efficient to do them together for v20, since there is no HADOOP/HDFS split. Do you prefer to have four patches instead of one? > Backport of NPE for File.list and File.listFiles > > > Key: HDFS-2023 > URL: https://issues.apache.org/jira/browse/HDFS-2023 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.20.205.0 >Reporter: Bharath Mundlapudi >Assignee: Bharath Mundlapudi > Fix For: 0.20.205.0 > > Attachments: HDFS-2023-1.patch > > > Since we have multiple Jira's in trunk for common and hdfs, I am creating > another jira for this issue. > This patch addresses the following: > 1. Provides FileUtil API for list and listFiles which throws IOException for > null cases. > 2. Replaces most of the code where JDK file API with FileUtil API. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1966) Encapsulate individual DataTransferProtocol op header
[ https://issues.apache.org/jira/browse/HDFS-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042530#comment-13042530 ] Jitendra Nath Pandey commented on HDFS-1966: I think DataTransferProtocol is getting too cluttered and it might be worthwhile to split it into several classes and interfaces. But that is beyond the scope of this jira. +1 for the patch. > Encapsulate individual DataTransferProtocol op header > - > > Key: HDFS-1966 > URL: https://issues.apache.org/jira/browse/HDFS-1966 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: data-node, hdfs client >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Attachments: h1966_20110519.patch, h1966_20110524.patch, > h1966_20110526.patch, h1966_20110527b.patch > > > It will make a clear distinction between the variables used in the protocol > and the others. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2024) Eclipse format HDFS Junit test hdfs/TestWriteRead.java
[ https://issues.apache.org/jira/browse/HDFS-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] CW Chung updated HDFS-2024: --- Attachment: TestWriteRead-2024.patch This patch contains just formatting changes. No material change here. > Eclipse format HDFS Junit test hdfs/TestWriteRead.java > --- > > Key: HDFS-2024 > URL: https://issues.apache.org/jira/browse/HDFS-2024 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Reporter: CW Chung >Assignee: CW Chung >Priority: Trivial > Attachments: TestWriteRead-2024.patch > > > Eclipse format the file src/test/../hdfs/TestWriteRead.java. This is in > preparation of HDFS-1968. > So the patch should have only formatting changes such as white space. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2020) TestDFSUpgradeFromImage fails
[ https://issues.apache.org/jira/browse/HDFS-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042523#comment-13042523 ] Hadoop QA commented on HDFS-2020: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481163/HDFS-2020.patch against trunk revision 1130339. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/678//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/678//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/678//console This message is automatically generated. > TestDFSUpgradeFromImage fails > - > > Key: HDFS-2020 > URL: https://issues.apache.org/jira/browse/HDFS-2020 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node, test >Affects Versions: 0.23.0 >Reporter: Suresh Srinivas >Assignee: Suresh Srinivas > Attachments: HDFS-2020.patch, log.txt > > > Datanode has a singleton datanodeObject. When running MiniDFSCluster with > multiple datanodes, the singleton can point to only one of the datanodes. > TestDFSUpgradeFromImage fails related to initialization of this singleton. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions
[ https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-988: - Status: Patch Available (was: Open) > saveNamespace can corrupt edits log, apparently due to race conditions > -- > > Key: HDFS-988 > URL: https://issues.apache.org/jira/browse/HDFS-988 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.21.0, 0.20-append, 0.22.0 >Reporter: dhruba borthakur >Assignee: Eli Collins >Priority: Blocker > Fix For: 0.20-append, 0.22.0 > > Attachments: HDFS-988_fix_synchs.patch, hdfs-988-2.patch, > hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988-5.patch, hdfs-988.txt, > saveNamespace.txt, saveNamespace_20-append.patch > > > The adminstrator puts the namenode is safemode and then issues the > savenamespace command. This can corrupt the edits log. The problem is that > when the NN enters safemode, there could still be pending logSycs occuring > from other threads. Now, the saveNamespace command, when executed, would save > a edits log with partial writes. I have seen this happen on 0.20. > https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions
[ https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-988: - Attachment: hdfs-988-5.patch Thanks for taking a look Todd. Updated patch attached. bq. checks for if (auditLog.isInfoEnabled()) should probably now be (auditLog.isInfoEnabled() && isExternalInvocation()) – otherwise we're doing a needless directory traversal for fsck Fixed. bq. The following methods currently do logSync() while holding the writeLock, which is expensive: Fixed. (Only one needed to conditionally call logSync) bq. seems strange that some of the xInternal() methods take the write lock themselves (eg setReplicationInternal) whereas others assume the caller takes the write lock (eg createSymlinkInternal). We should be consistent. Latest patch makes them more consistent, I also refactored out a couple new xInternal methods. In a couple places (eg deleteInternal and getListing) I didn't hoist up the locking because it would make the locking too coarse-grain (eg would result in syncing the log w/ the lock held). bq. for those methods that don't explicitly take the write lock, we should either add an assert hasWriteLock() or a comment explaining why the lock is not necessary (eg internalReleaseLease, reassignLease, finalizeINodeFileUnderConstruction) Done. For FSDirectory I made the unprotectedX methods actually unprotected and moved the locking to the caller (except for FSEditLogLoader which calls the unprotected methods directly on purpose - I doubt this really saves us that much). These methods (per their name) are now intentionally unprotected. bq. comment for endCheckpoint says "not started" but should say "not ended". same with updatePipeline. Both fixed. bq. why doesn't getListing need the read lock? Because its callees (check*, getListing) take the lock. bq. I noticed that nextGenerationStamp() doesn't logSync() – that seems dangerous, since after a restart we might hand out a duplicate genstamp. 
Good catch. I made sure all callers sync the log (this was only missing from the updateBlockForPipeline path). nextGenerationStamp is always called with the lock held so I asserted that and removed the lock acquisition from this method. > saveNamespace can corrupt edits log, apparently due to race conditions > -- > > Key: HDFS-988 > URL: https://issues.apache.org/jira/browse/HDFS-988 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.20-append, 0.21.0, 0.22.0 >Reporter: dhruba borthakur >Assignee: Eli Collins >Priority: Blocker > Fix For: 0.20-append, 0.22.0 > > Attachments: HDFS-988_fix_synchs.patch, hdfs-988-2.patch, > hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988-5.patch, hdfs-988.txt, > saveNamespace.txt, saveNamespace_20-append.patch > > > The adminstrator puts the namenode is safemode and then issues the > savenamespace command. This can corrupt the edits log. The problem is that > when the NN enters safemode, there could still be pending logSycs occuring > from other threads. Now, the saveNamespace command, when executed, would save > a edits log with partial writes. I have seen this happen on 0.20. > https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
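The review comments in this HDFS-988 exchange revolve around one invariant: mutations happen under the namesystem write lock, but the expensive logSync must run after the lock is released. A minimal sketch of that pattern, using a plain ReentrantReadWriteLock and stand-in class names (EditLog, LockThenSync, and the operation string are invented here, not Hadoop's actual FSNamesystem internals):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Stand-in for the edit log: edits queue in memory, logSync makes them durable.
class EditLog {
    final List<String> pending = new ArrayList<>();
    final List<String> durable = new ArrayList<>();

    void logEdit(String op) { pending.add(op); }

    // Expensive in the real NameNode (an fsync); must not run under the write lock.
    void logSync() {
        durable.addAll(pending);
        pending.clear();
    }
}

public class LockThenSync {
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    final EditLog editLog = new EditLog();

    void setReplication(String path) {
        lock.writeLock().lock();
        try {
            // The "assert hasWriteLock()" discipline from the review.
            assert lock.isWriteLockedByCurrentThread();
            editLog.logEdit("OP_SET_REPLICATION " + path);
        } finally {
            lock.writeLock().unlock();
        }
        // Sync outside the lock so other handlers are not blocked on the fsync.
        editLog.logSync();
    }
}
```

Hoisting logSync out of the locked region is exactly why the review flags methods that "logSync() while holding the writeLock" as expensive: every other RPC handler would otherwise stall behind the disk sync.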
[jira] [Updated] (HDFS-1934) Fix NullPointerException when File.listFiles() API returns null
[ https://issues.apache.org/jira/browse/HDFS-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Foley updated HDFS-1934: - Summary: Fix NullPointerException when File.listFiles() API returns null (was: Fix NullPointerException when certain File APIs return null) > Fix NullPointerException when File.listFiles() API returns null > --- > > Key: HDFS-1934 > URL: https://issues.apache.org/jira/browse/HDFS-1934 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.23.0 >Reporter: Bharath Mundlapudi >Assignee: Bharath Mundlapudi > Fix For: 0.23.0 > > Attachments: HDFS-1934-1.patch, HDFS-1934-2.patch, HDFS-1934-3.patch, > HDFS-1934-4.patch, HDFS-1934-5.patch > > > While testing Disk Fail Inplace, We encountered the NPE from this part of the > code. > File[] files = dir.listFiles(); > for (File f : files) { > ... > } > This is kinda of an API issue. When a disk is bad (or name is not a > directory), this API (listFiles, list) return null rather than throwing an > exception. This 'for loop' throws a NPE exception. And same applies to > dir.list() API. > Fix all the places where null condition was not checked. > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
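The fix direction described in HDFS-1934/HDFS-2023 — wrapping File.listFiles() so a null return becomes an IOException instead of a latent NPE — can be sketched as follows. The class name is illustrative; the actual Hadoop FileUtil method signatures may differ:

```java
import java.io.File;
import java.io.IOException;

// Null-safe wrapper around File.listFiles(), which returns null (rather than
// throwing) when the path is not a directory or an I/O error occurs -- the
// source of the NPEs described in the issue.
public class SafeListing {
    public static File[] listFiles(File dir) throws IOException {
        File[] files = dir.listFiles();
        if (files == null) {
            throw new IOException("Invalid directory or I/O error on " + dir);
        }
        return files;
    }
}
```

Callers can then iterate the result directly; a bad disk surfaces as an IOException at the call site instead of a NullPointerException inside the for loop.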
[jira] [Created] (HDFS-2024) Eclipse format HDFS Junit test hdfs/TestWriteRead.java
Eclipse format HDFS Junit test hdfs/TestWriteRead.java --- Key: HDFS-2024 URL: https://issues.apache.org/jira/browse/HDFS-2024 Project: Hadoop HDFS Issue Type: Improvement Components: test Reporter: CW Chung Assignee: CW Chung Priority: Trivial Eclipse format the file src/test/../hdfs/TestWriteRead.java. This is in preparation of HDFS-1968. So the patch should have only formatting changes such as white space. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2023) Backport of NPE for File.list and File.listFiles
[ https://issues.apache.org/jira/browse/HDFS-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042519#comment-13042519 ] Bharath Mundlapudi commented on HDFS-2023: -- Hi Eli, I wanted to have this change in the same Jira as 0.23 but those were reviewed and committed. So I created this one. Also, I could have done multiple patches in those same Jiras, but that would not be good for reviewers. On the positive side, we can have this single Jira for all 0.20.*. But I agree with you on having the same Jira for backporting. > Backport of NPE for File.list and File.listFiles > > > Key: HDFS-2023 > URL: https://issues.apache.org/jira/browse/HDFS-2023 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.20.205.0 >Reporter: Bharath Mundlapudi >Assignee: Bharath Mundlapudi > Fix For: 0.20.205.0 > > Attachments: HDFS-2023-1.patch > > > Since we have multiple Jira's in trunk for common and hdfs, I am creating > another jira for this issue. > This patch addresses the following: > 1. Provides FileUtil API for list and listFiles which throws IOException for > null cases. > 2. Replaces most of the code where JDK file API with FileUtil API. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1966) Encapsulate individual DataTransferProtocol op header
[ https://issues.apache.org/jira/browse/HDFS-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042513#comment-13042513 ] Tsz Wo (Nicholas), SZE commented on HDFS-1966: -- Jitendra, thanks for the review. > 1. What is the reason for defining header classes inside the Op enum? It is because the headers are operation related. There are other classes like {{PacketHeader}} which is nothing to do with operations. > 2. I will recommend adding a factory to create right header object depending > on the opcode. The factory could be useful at the receiving end. We already have {{DataTransferProtocol.Receiver}}. I think it is the factory you mean. > 3. Please add a few unit tests for serialization/de-serialization of the > headers. We have many tests for read, write, fault-inject tests, balancer, etc. These tests cover {{DataTransferProtocol}}. So adding new tests for the header seems redundant. Do you agree? > Encapsulate individual DataTransferProtocol op header > - > > Key: HDFS-1966 > URL: https://issues.apache.org/jira/browse/HDFS-1966 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: data-node, hdfs client >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Attachments: h1966_20110519.patch, h1966_20110524.patch, > h1966_20110526.patch, h1966_20110527b.patch > > > It will make a clear distinction between the variables used in the protocol > and the others. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
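The encapsulation being discussed for HDFS-1966 — giving each DataTransferProtocol op its own header class with symmetric write/read methods — can be sketched with a hypothetical header. The field names and layout below are invented for illustration and are not the real DataTransferProtocol wire format:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical per-op header with symmetric serialization, illustrating the
// encapsulation style under discussion: the protocol's variables live in the
// header class, cleanly separated from everything else.
public class OpReadBlockHeader {
    final long blockId;
    final long offset;
    final long length;

    public OpReadBlockHeader(long blockId, long offset, long length) {
        this.blockId = blockId;
        this.offset = offset;
        this.length = length;
    }

    // Sender side: write the fields in a fixed order.
    public void write(DataOutputStream out) throws IOException {
        out.writeLong(blockId);
        out.writeLong(offset);
        out.writeLong(length);
    }

    // Receiver side: read the fields back in the same order.
    public static OpReadBlockHeader read(DataInputStream in) throws IOException {
        return new OpReadBlockHeader(in.readLong(), in.readLong(), in.readLong());
    }
}
```

A round-trip through a byte array is the natural sanity check for such a header, which is also why the review debates whether dedicated serialization tests add anything beyond the existing end-to-end read/write tests.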
[jira] [Updated] (HDFS-1995) Minor modification to both dfsclusterhealth and dfshealth pages for Web UI
[ https://issues.apache.org/jira/browse/HDFS-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanping Wang updated HDFS-1995: --- Attachment: (was: ClusterSummary-2.png) > Minor modification to both dfsclusterhealth and dfshealth pages for Web UI > -- > > Key: HDFS-1995 > URL: https://issues.apache.org/jira/browse/HDFS-1995 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Tanping Wang >Assignee: Tanping Wang >Priority: Minor > Fix For: 0.23.0 > > Attachments: ClusterSummary-2.png, HDFS-1995.2.patch, > HDFS-1995.patch, OneNN.png > > > Four small modifications/fixes: > on dfshealthpage: > 1) fix remaining% to be remaining / total ( it was mistaken as used / total) > on dfsclusterhealth page: > 1) makes the table header 8em wide > 2) fix the typo(inconsistency) Total Files and Blocks => Total Files and > Directories > 3) make the DFS Used = sum of block pool used space of every name space. And > change the label names accordingly. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1995) Minor modification to both dfsclusterhealth and dfshealth pages for Web UI
[ https://issues.apache.org/jira/browse/HDFS-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanping Wang updated HDFS-1995: --- Attachment: OneNN.png > Minor modification to both dfsclusterhealth and dfshealth pages for Web UI > -- > > Key: HDFS-1995 > URL: https://issues.apache.org/jira/browse/HDFS-1995 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Tanping Wang >Assignee: Tanping Wang >Priority: Minor > Fix For: 0.23.0 > > Attachments: ClusterSummary-2.png, HDFS-1995.2.patch, > HDFS-1995.patch, OneNN.png > > > Four small modifications/fixes: > on dfshealthpage: > 1) fix remaining% to be remaining / total ( it was mistaken as used / total) > on dfsclusterhealth page: > 1) makes the table header 8em wide > 2) fix the typo(inconsistency) Total Files and Blocks => Total Files and > Directories > 3) make the DFS Used = sum of block pool used space of every name space. And > change the label names accordingly. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1995) Minor modification to both dfsclusterhealth and dfshealth pages for Web UI
[ https://issues.apache.org/jira/browse/HDFS-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanping Wang updated HDFS-1995: --- Attachment: (was: OneNN.png) > Minor modification to both dfsclusterhealth and dfshealth pages for Web UI > -- > > Key: HDFS-1995 > URL: https://issues.apache.org/jira/browse/HDFS-1995 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Tanping Wang >Assignee: Tanping Wang >Priority: Minor > Fix For: 0.23.0 > > Attachments: ClusterSummary-2.png, HDFS-1995.2.patch, > HDFS-1995.patch, OneNN.png > > > Four small modifications/fixes: > on dfshealthpage: > 1) fix remaining% to be remaining / total ( it was mistaken as used / total) > on dfsclusterhealth page: > 1) makes the table header 8em wide > 2) fix the typo(inconsistency) Total Files and Blocks => Total Files and > Directories > 3) make the DFS Used = sum of block pool used space of every name space. And > change the label names accordingly. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1995) Minor modification to both dfsclusterhealth and dfshealth pages for Web UI
[ https://issues.apache.org/jira/browse/HDFS-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanping Wang updated HDFS-1995: --- Attachment: ClusterSummary-2.png > Minor modification to both dfsclusterhealth and dfshealth pages for Web UI > -- > > Key: HDFS-1995 > URL: https://issues.apache.org/jira/browse/HDFS-1995 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Tanping Wang >Assignee: Tanping Wang >Priority: Minor > Fix For: 0.23.0 > > Attachments: ClusterSummary-2.png, HDFS-1995.2.patch, > HDFS-1995.patch, OneNN.png > > > Four small modifications/fixes: > on dfshealthpage: > 1) fix remaining% to be remaining / total ( it was mistaken as used / total) > on dfsclusterhealth page: > 1) makes the table header 8em wide > 2) fix the typo(inconsistency) Total Files and Blocks => Total Files and > Directories > 3) make the DFS Used = sum of block pool used space of every name space. And > change the label names accordingly. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2023) Backport of NPE for File.list and File.listFiles
[ https://issues.apache.org/jira/browse/HDFS-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042509#comment-13042509 ] Eli Collins commented on HDFS-2023: --- In the future how about using multiple fix versions on the original jira so we don't have different jira numbers for the same change? Ie we don't have multiple jiras for an issue that goes into both 0.23 and 0.22, so no need for a jira going into 0.23 and 0.20.205. > Backport of NPE for File.list and File.listFiles > > > Key: HDFS-2023 > URL: https://issues.apache.org/jira/browse/HDFS-2023 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.20.205.0 >Reporter: Bharath Mundlapudi >Assignee: Bharath Mundlapudi > Fix For: 0.20.205.0 > > Attachments: HDFS-2023-1.patch > > > Since we have multiple Jira's in trunk for common and hdfs, I am creating > another jira for this issue. > This patch addresses the following: > 1. Provides FileUtil API for list and listFiles which throws IOException for > null cases. > 2. Replaces most of the code where JDK file API with FileUtil API. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1995) Minor modification to both dfsclusterhealth and dfshealth pages for Web UI
[ https://issues.apache.org/jira/browse/HDFS-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanping Wang updated HDFS-1995: --- Attachment: (was: HDFS-1995.2.patch) > Minor modification to both dfsclusterhealth and dfshealth pages for Web UI > -- > > Key: HDFS-1995 > URL: https://issues.apache.org/jira/browse/HDFS-1995 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Tanping Wang >Assignee: Tanping Wang >Priority: Minor > Fix For: 0.23.0 > > Attachments: ClusterSummary-2.png, HDFS-1995.2.patch, > HDFS-1995.patch, OneNN.png > > > Four small modifications/fixes: > on dfshealthpage: > 1) fix remaining% to be remaining / total ( it was mistaken as used / total) > on dfsclusterhealth page: > 1) makes the table header 8em wide > 2) fix the typo(inconsistency) Total Files and Blocks => Total Files and > Directories > 3) make the DFS Used = sum of block pool used space of every name space. And > change the label names accordingly. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1995) Minor modification to both dfsclusterhealth and dfshealth pages for Web UI
[ https://issues.apache.org/jira/browse/HDFS-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanping Wang updated HDFS-1995: --- Attachment: HDFS-1995.2.patch Rename on Cluster Summary page: Remaining => DFS Remaining Remaining% => DFS Remaining% to be consistent with name node UI page > Minor modification to both dfsclusterhealth and dfshealth pages for Web UI > -- > > Key: HDFS-1995 > URL: https://issues.apache.org/jira/browse/HDFS-1995 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Tanping Wang >Assignee: Tanping Wang >Priority: Minor > Fix For: 0.23.0 > > Attachments: ClusterSummary-2.png, HDFS-1995.2.patch, > HDFS-1995.patch, OneNN.png > > > Four small modifications/fixes: > on dfshealthpage: > 1) fix remaining% to be remaining / total ( it was mistaken as used / total) > on dfsclusterhealth page: > 1) makes the table header 8em wide > 2) fix the typo(inconsistency) Total Files and Blocks => Total Files and > Directories > 3) make the DFS Used = sum of block pool used space of every name space. And > change the label names accordingly. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2021) TestWriteRead failed with inconsistent visible length of a file
[ https://issues.apache.org/jira/browse/HDFS-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-2021: - Resolution: Fixed Fix Version/s: 0.23.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) The failure of {{TestDFSUpgradeFromImage}} is not related. Thanks Daryn for reviewing the patches. I have committed this. Thanks, John! > TestWriteRead failed with inconsistent visible length of a file > > > Key: HDFS-2021 > URL: https://issues.apache.org/jira/browse/HDFS-2021 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node > Environment: Linux RHEL5 >Reporter: CW Chung >Assignee: John George > Fix For: 0.23.0 > > Attachments: HDFS-2021-2.patch, HDFS-2021.patch > > > The junit test failed when it iterates a number of times with a larger chunk size > on Linux. Once in a while, the visible number of bytes seen by a reader is > slightly less than what was supposed to be. > When run with the following parameters, it failed more often on Linux ( as > reported by John George) than my Mac: > private static final int WR_NTIMES = 300; > private static final int WR_CHUNK_SIZE = 1; > Adding more debugging output to the source, this is a sample of the output: > Caused by: java.io.IOException: readData mismatch in byte read: > expected=277 ; got 2765312 > at > org.apache.hadoop.hdfs.TestWriteRead.readData(TestWriteRead.java:141) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2023) Backport of NPE for File.list and File.listFiles
[ https://issues.apache.org/jira/browse/HDFS-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharath Mundlapudi updated HDFS-2023: - Attachment: HDFS-2023-1.patch Attaching a patch for this issue. > Backport of NPE for File.list and File.listFiles > > > Key: HDFS-2023 > URL: https://issues.apache.org/jira/browse/HDFS-2023 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.20.205.0 >Reporter: Bharath Mundlapudi >Assignee: Bharath Mundlapudi > Fix For: 0.20.205.0 > > Attachments: HDFS-2023-1.patch > > > Since we have multiple Jira's in trunk for common and hdfs, I am creating > another jira for this issue. > This patch addresses the following: > 1. Provides FileUtil API for list and listFiles which throws IOException for > null cases. > 2. Replaces most of the code where JDK file API with FileUtil API. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-2023) Backport of NPE for File.list and File.listFiles
Backport of NPE for File.list and File.listFiles Key: HDFS-2023 URL: https://issues.apache.org/jira/browse/HDFS-2023 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.20.205.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.20.205.0 Since we have multiple Jira's in trunk for common and hdfs, I am creating another jira for this issue. This patch addresses the following: 1. Provides FileUtil API for list and listFiles which throws IOException for null cases. 2. Replaces most of the code where JDK file API with FileUtil API. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
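The null-handling backport described above can be sketched as a small wrapper. This is an assumption-laden illustration, not the actual FileUtil API from the HDFS-2023 patch: the class and method names here are hypothetical, and the real signatures may differ.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical sketch of the null-safe listing wrappers described in the
// issue summary. File.list()/listFiles() return null when the path is not a
// directory or an I/O error occurs; surfacing that as an IOException at the
// call site avoids a NullPointerException later.
public class NullSafeFileUtil {

    public static String[] list(File dir) throws IOException {
        String[] names = dir.list();
        if (names == null) {
            throw new IOException("Could not list contents of " + dir);
        }
        return names;
    }

    public static File[] listFiles(File dir) throws IOException {
        File[] files = dir.listFiles();
        if (files == null) {
            throw new IOException("Could not list contents of " + dir);
        }
        return files;
    }
}
```

Callers then handle a checked IOException at the listing site rather than tripping over a null return somewhere downstream, which matches the intent stated in the patch summary.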
[jira] [Updated] (HDFS-1995) Minor modification to both dfsclusterhealth and dfshealth pages for Web UI
[ https://issues.apache.org/jira/browse/HDFS-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanping Wang updated HDFS-1995: --- Attachment: OneNN.png Upload the screen shot one of the name nodes UI. Name node UI lay out does not change. (Changed the calculation of remaining%.) Capacity, DFS used, DFS remaining... etc. is consistent with Cluster Summary page. > Minor modification to both dfsclusterhealth and dfshealth pages for Web UI > -- > > Key: HDFS-1995 > URL: https://issues.apache.org/jira/browse/HDFS-1995 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Tanping Wang >Assignee: Tanping Wang >Priority: Minor > Fix For: 0.23.0 > > Attachments: ClusterSummary-2.png, HDFS-1995.2.patch, > HDFS-1995.patch, OneNN.png > > > Four small modifications/fixes: > on dfshealthpage: > 1) fix remaining% to be remaining / total ( it was mistaken as used / total) > on dfsclusterhealth page: > 1) makes the table header 8em wide > 2) fix the typo(inconsistency) Total Files and Blocks => Total Files and > Directories > 3) make the DFS Used = sum of block pool used space of every name space. And > change the label names accordingly. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2020) TestDFSUpgradeFromImage fails
[ https://issues.apache.org/jira/browse/HDFS-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-2020: -- Attachment: HDFS-2020.patch Early version of the patch - gets rid of static DataNode#datanodeObject. > TestDFSUpgradeFromImage fails > - > > Key: HDFS-2020 > URL: https://issues.apache.org/jira/browse/HDFS-2020 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node, test >Affects Versions: 0.23.0 >Reporter: Suresh Srinivas >Assignee: Suresh Srinivas > Attachments: HDFS-2020.patch, log.txt > > > Datanode has a singleton datanodeObject. When running MiniDFSCluster with > multiple datanodes, the singleton can point to only one of the datanodes. > TestDFSUpgradeFromImage fails related to initialization of this singleton. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2020) TestDFSUpgradeFromImage fails
[ https://issues.apache.org/jira/browse/HDFS-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-2020: -- Status: Patch Available (was: Open) > TestDFSUpgradeFromImage fails > - > > Key: HDFS-2020 > URL: https://issues.apache.org/jira/browse/HDFS-2020 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node, test >Affects Versions: 0.23.0 >Reporter: Suresh Srinivas >Assignee: Suresh Srinivas > Attachments: HDFS-2020.patch, log.txt > > > Datanode has a singleton datanodeObject. When running MiniDFSCluster with > multiple datanodes, the singleton can point to only one of the datanodes. > TestDFSUpgradeFromImage fails related to initialization of this singleton. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1995) Minor modification to both dfsclusterhealth and dfshealth pages for Web UI
[ https://issues.apache.org/jira/browse/HDFS-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanping Wang updated HDFS-1995: --- Attachment: HDFS-1995.2.patch > Minor modification to both dfsclusterhealth and dfshealth pages for Web UI > -- > > Key: HDFS-1995 > URL: https://issues.apache.org/jira/browse/HDFS-1995 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Tanping Wang >Assignee: Tanping Wang >Priority: Minor > Fix For: 0.23.0 > > Attachments: ClusterSummary-2.png, HDFS-1995.2.patch, HDFS-1995.patch > > > Four small modifications/fixes: > on dfshealthpage: > 1) fix remaining% to be remaining / total ( it was mistaken as used / total) > on dfsclusterhealth page: > 1) makes the table header 8em wide > 2) fix the typo(inconsistency) Total Files and Blocks => Total Files and > Directories > 3) make the DFS Used = sum of block pool used space of every name space. And > change the label names accordingly. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1995) Minor modification to both dfsclusterhealth and dfshealth pages for Web UI
[ https://issues.apache.org/jira/browse/HDFS-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanping Wang updated HDFS-1995: --- Attachment: ClusterSummary-2.png Upload a screen shot after the fixes. > Minor modification to both dfsclusterhealth and dfshealth pages for Web UI > -- > > Key: HDFS-1995 > URL: https://issues.apache.org/jira/browse/HDFS-1995 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Tanping Wang >Assignee: Tanping Wang >Priority: Minor > Fix For: 0.23.0 > > Attachments: ClusterSummary-2.png, HDFS-1995.2.patch, HDFS-1995.patch > > > Four small modifications/fixes: > on dfshealthpage: > 1) fix remaining% to be remaining / total ( it was mistaken as used / total) > on dfsclusterhealth page: > 1) makes the table header 8em wide > 2) fix the typo(inconsistency) Total Files and Blocks => Total Files and > Directories > 3) make the DFS Used = sum of block pool used space of every name space. And > change the label names accordingly. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1966) Encapsulate individual DataTransferProtocol op header
[ https://issues.apache.org/jira/browse/HDFS-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042493#comment-13042493 ] Jitendra Nath Pandey commented on HDFS-1966: 1. What is the reason for defining header classes inside the Op enum? 2. I will recommend adding a factory to create right header object depending on the opcode. The factory could be useful at the receiving end. 3. Please add a few unit tests for serialization/de-serialization of the headers. > Encapsulate individual DataTransferProtocol op header > - > > Key: HDFS-1966 > URL: https://issues.apache.org/jira/browse/HDFS-1966 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: data-node, hdfs client >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Attachments: h1966_20110519.patch, h1966_20110524.patch, > h1966_20110526.patch, h1966_20110527b.patch > > > It will make a clear distinction between the variables used in the protocol > and the others. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1907) BlockMissingException upon concurrent read and write: reader was doing file position read while writer is doing write without hflush
[ https://issues.apache.org/jira/browse/HDFS-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042491#comment-13042491 ] Hadoop QA commented on HDFS-1907: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481145/HDFS-1907-2.patch against trunk revision 1130262. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestDFSUpgradeFromImage +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/677//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/677//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/677//console This message is automatically generated. > BlockMissingException upon concurrent read and write: reader was doing file > position read while writer is doing write without hflush > > > Key: HDFS-1907 > URL: https://issues.apache.org/jira/browse/HDFS-1907 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs client >Affects Versions: 0.23.0 > Environment: Run on a real cluster. Using the latest 0.23 build. 
>Reporter: CW Chung >Assignee: John George > Attachments: HDFS-1907-2.patch, HDFS-1907.patch > > > BlockMissingException is thrown under this test scenario: > Two different processes doing concurrent file r/w: one read and the other > write on the same file > - writer keep doing file write > - reader doing position file read from beginning of the file to the visible > end of file, repeatedly > The reader is basically doing: > byteRead = in.read(currentPosition, buffer, 0, byteToReadThisRound); > where CurrentPostion=0, buffer is a byte array buffer, byteToReadThisRound = > 1024*1; > Usually it does not fail right away. I have to read, close file, re-open the > same file a few times to create the problem. I'll pose a test program to > repro this problem after I've cleaned up a bit my current test program. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
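The reader side of the scenario above can be sketched with a local FileChannel standing in for HDFS's FSDataInputStream, since FileChannel.read(dst, position) is a positional read in the same spirit as in.read(position, buffer, 0, len). This is purely illustrative of the access pattern that races with an un-hflushed writer; it is not the reporter's actual test code, and the names are assumptions.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Illustrative positional-read loop: read from offset 0 up to the length the
// reader believes is visible, without moving the channel's own position.
public class PositionalReader {

    public static long readAll(FileChannel ch, long visibleLen) throws IOException {
        long currentPosition = 0;
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        while (currentPosition < visibleLen) {
            buffer.clear();
            // Positional read: does not advance the channel's seek pointer,
            // so repeated calls like this can race with a concurrent writer.
            int byteRead = ch.read(buffer, currentPosition);
            if (byteRead < 0) {
                break; // reached end of what is currently visible
            }
            currentPosition += byteRead;
        }
        return currentPosition; // total bytes this reader could see
    }
}
```

In the HDFS case the visible length can lag the writer's progress when the writer has not called hflush(), which is why the reader occasionally sees fewer bytes than expected.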
[jira] [Updated] (HDFS-1968) Enhance TestWriteRead to support File Append and Position Read
[ https://issues.apache.org/jira/browse/HDFS-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] CW Chung updated HDFS-1968: --- Attachment: TestWriteRead-2-Append.patch This is part 2 of the patch. This is the material change portion of code from svn. It is a diff from part 1 (which consists of only the formatting change). So to review / commit, apply the 2 patches in this order: a. Apply patch TestWriteRead-1-Format.patch to get to the version with better formatting b. Apply patch TestWriteRead-2-Append.patch to get to the version with material changes. (Sorry for the formatting trouble. Next time I'll either do eclipse format right from the start, or never do it! ) > Enhance TestWriteRead to support File Append and Position Read > --- > > Key: HDFS-1968 > URL: https://issues.apache.org/jira/browse/HDFS-1968 > Project: Hadoop HDFS > Issue Type: Test > Components: test >Affects Versions: 0.23.0 >Reporter: CW Chung >Assignee: CW Chung >Priority: Minor > Attachments: TestWriteRead-1-Format.patch, > TestWriteRead-2-Append.patch, TestWriteRead.patch, TestWriteRead.patch, > TestWriteRead.patch > > > Desirable to enhance TestWriteRead to support command line options to do: > (1) File Append > (2) Position Read (currently supporting sequential read). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1919) Upgrade to federated namespace fails
[ https://issues.apache.org/jira/browse/HDFS-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1919: -- Resolution: Duplicate Status: Resolved (was: Patch Available) Cannot reproduce after HDFS-1936 fixed layout version issues. > Upgrade to federated namespace fails > > > Key: HDFS-1919 > URL: https://issues.apache.org/jira/browse/HDFS-1919 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.23.0 >Reporter: Todd Lipcon >Assignee: Suresh Srinivas >Priority: Blocker > Fix For: 0.23.0 > > Attachments: hdfs-1919.txt > > > I formatted a namenode running off 0.22 branch, and trying to start it on > trunk yields: > org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory > /tmp/name1 is in an inconsistent state: file VERSION has clusterID mising. > It looks like 0.22 has LAYOUT_VERSION -33, but trunk has > LAST_PRE_FEDERATION_LAYOUT_VERSION = -30, which is incorrect. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2014) RPM packages broke bin/hdfs script
[ https://issues.apache.org/jira/browse/HDFS-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042481#comment-13042481 ] Todd Lipcon commented on HDFS-2014: --- actually, this still has an issue in that webapps are not located correctly. bin/hdfs is looking at $HADOOP_PREFIX/build/webapps, which is pointing to COMMON_HOME/build/webapps, rather than HDFS_HOME/build/webapps. > RPM packages broke bin/hdfs script > -- > > Key: HDFS-2014 > URL: https://issues.apache.org/jira/browse/HDFS-2014 > Project: Hadoop HDFS > Issue Type: Bug > Components: scripts >Affects Versions: 0.23.0 >Reporter: Todd Lipcon >Assignee: Eric Yang >Priority: Critical > Fix For: 0.23.0 > > Attachments: HDFS-2014.patch > > > bin/hdfs now appears to depend on ../libexec, which doesn't exist inside of a > source checkout: > todd@todd-w510:~/git/hadoop-hdfs$ ./bin/hdfs namenode > ./bin/hdfs: line 22: > /home/todd/git/hadoop-hdfs/bin/../libexec/hdfs-config.sh: No such file or > directory > ./bin/hdfs: line 138: cygpath: command not found > ./bin/hdfs: line 161: exec: : not found -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1636) If dfs.name.dir points to an empty dir, namenode format shouldn't require confirmation
[ https://issues.apache.org/jira/browse/HDFS-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1636: -- Resolution: Fixed Release Note: If dfs.name.dir points to an empty dir, namenode -format no longer requires confirmation. (was: If dfs.name.dir points to an empty dir, namenode format shouldn't require confirmation.) Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Harsh! > If dfs.name.dir points to an empty dir, namenode format shouldn't require > confirmation > -- > > Key: HDFS-1636 > URL: https://issues.apache.org/jira/browse/HDFS-1636 > Project: Hadoop HDFS > Issue Type: Improvement > Components: name-node >Affects Versions: 0.23.0 >Reporter: Todd Lipcon >Assignee: Harsh J Chouraria >Priority: Minor > Fix For: 0.23.0 > > Attachments: HDFS-1636.r1.diff, HDFS-1636.r2.diff, HDFS-1636.r3.diff > > > Right now, running namenode -format when dfs.name.dir is configured to a dir > which exists but is empty still asks for confirmation. This is unnecessary > since it isn't blowing away any real data. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
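The HDFS-1636 behavior above amounts to a pre-format check: only prompt when the configured directory actually holds data. The sketch below is hypothetical; the real NameNode format logic is more involved, and the class and method names here are illustrative only.

```java
import java.io.File;

// Hypothetical sketch of the confirmation decision described in HDFS-1636:
// a missing or empty dfs.name.dir holds no real data, so formatting it
// should not require interactive confirmation.
public class FormatCheck {

    public static boolean needsConfirmation(File nameDir) {
        if (!nameDir.exists()) {
            return false; // nothing to blow away
        }
        String[] contents = nameDir.list();
        // A null listing (I/O error or not a directory) is treated
        // conservatively as "may contain data".
        return contents == null || contents.length > 0;
    }
}
```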
[jira] [Commented] (HDFS-1636) If dfs.name.dir points to an empty dir, namenode format shouldn't require confirmation
[ https://issues.apache.org/jira/browse/HDFS-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042470#comment-13042470 ] Todd Lipcon commented on HDFS-1636: --- +1. I manually tested this patch and it works great. > If dfs.name.dir points to an empty dir, namenode format shouldn't require > confirmation > -- > > Key: HDFS-1636 > URL: https://issues.apache.org/jira/browse/HDFS-1636 > Project: Hadoop HDFS > Issue Type: Improvement > Components: name-node >Affects Versions: 0.23.0 >Reporter: Todd Lipcon >Assignee: Harsh J Chouraria >Priority: Minor > Fix For: 0.23.0 > > Attachments: HDFS-1636.r1.diff, HDFS-1636.r2.diff, HDFS-1636.r3.diff > > > Right now, running namenode -format when dfs.name.dir is configured to a dir > which exists but is empty still asks for confirmation. This is unnecessary > since it isn't blowing away any real data. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1936) Updating the layout version from HDFS-1822 causes upgrade problems.
[ https://issues.apache.org/jira/browse/HDFS-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042464#comment-13042464 ] Todd Lipcon commented on HDFS-1936: --- +1 on the 0.22 patch. We should probably add an 0.20.0 and 0.20.203 image tarball to these tests, too, given we have the infrastructure, but we can do that separately for sure. > Updating the layout version from HDFS-1822 causes upgrade problems. > --- > > Key: HDFS-1936 > URL: https://issues.apache.org/jira/browse/HDFS-1936 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.22.0, 0.23.0 >Reporter: Suresh Srinivas >Assignee: Suresh Srinivas >Priority: Blocker > Fix For: 0.22.0, 0.23.0 > > Attachments: HDFS-1936.3.patch, HDFS-1936.4.patch, HDFS-1936.6.patch, > HDFS-1936.6.patch, HDFS-1936.7.patch, HDFS-1936.8.patch, HDFS-1936.9.patch, > HDFS-1936.rel22.patch, HDFS-1936.trunk.patch, hadoop-22-dfs-dir.tgz, > hdfs-1936-with-testcase.txt > > > In HDFS-1822 and HDFS-1842, the layout versions for 203, 204, 22 and trunk > were changed. Some of the namenode logic that depends on layout version is > broken because of this. Read the comment for more description. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1968) Enhance TestWriteRead to support File Append and Position Read
[ https://issues.apache.org/jira/browse/HDFS-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] CW Chung updated HDFS-1968: --- Attachment: TestWriteRead-1-Format.patch I have a patch available to address Comment # 1 by John. To address the comment of Cos, I am dividing the patch into two parts: a. A patch (this file) to just re-format the existing svn copy (basically eclipse format + some manual fix up). Since there is no material code change here, the hope is to get this committed quickly, then step b can be started. b. I then generate a patch on top of (a) (basically a diff of my latest version against the newly formatted version). This patch would require real review. > Enhance TestWriteRead to support File Append and Position Read > --- > > Key: HDFS-1968 > URL: https://issues.apache.org/jira/browse/HDFS-1968 > Project: Hadoop HDFS > Issue Type: Test > Components: test >Affects Versions: 0.23.0 >Reporter: CW Chung >Assignee: CW Chung >Priority: Minor > Attachments: TestWriteRead-1-Format.patch, TestWriteRead.patch, > TestWriteRead.patch, TestWriteRead.patch > > > Desirable to enhance TestWriteRead to support command line options to do: > (1) File Append > (2) Position Read (currently supporting sequential read). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions
[ https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042445#comment-13042445 ] Eli Collins commented on HDFS-988: -- ELOS#flush calls ELFOS#flushAndSync which does a force on the underlying file channel. > saveNamespace can corrupt edits log, apparently due to race conditions > -- > > Key: HDFS-988 > URL: https://issues.apache.org/jira/browse/HDFS-988 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.20-append, 0.21.0, 0.22.0 >Reporter: dhruba borthakur >Assignee: Eli Collins >Priority: Blocker > Fix For: 0.20-append, 0.22.0 > > Attachments: HDFS-988_fix_synchs.patch, hdfs-988-2.patch, > hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988.txt, saveNamespace.txt, > saveNamespace_20-append.patch > > > The administrator puts the namenode in safemode and then issues the > savenamespace command. This can corrupt the edits log. The problem is that > when the NN enters safemode, there could still be pending logSyncs occurring > from other threads. Now, the saveNamespace command, when executed, would save > an edits log with partial writes. I have seen this happen on 0.20. > https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-1401) TestFileConcurrentReader test case is still timing out / failing
[ https://issues.apache.org/jira/browse/HDFS-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Foley resolved HDFS-1401. -- Resolution: Cannot Reproduce TestFileConcurrentReader has not had a failure in the last 60+ builds over 9 days. I think the underlying cause has been fixed around build 601/605. Closing this ticket. > TestFileConcurrentReader test case is still timing out / failing > > > Key: HDFS-1401 > URL: https://issues.apache.org/jira/browse/HDFS-1401 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs client >Affects Versions: 0.22.0 >Reporter: Tanping Wang >Priority: Critical > Attachments: HDFS-1401.patch > > > The unit test case, TestFileConcurrentReader after its most recent fix in > HDFS-1310 still times out when using java 1.6.0_07. When using java > 1.6.0_07, the test case simply hangs. On apache Hudson build ( which > possibly is using a higher sub-version of java) this test case has presented > an inconsistent test result that it sometimes passes, some times fails. For > example, between the most recent build 423, 424 and build 425, there is no > effective change, however, the test case failed on build 424 and passed on > build 425 > build 424 test failed > https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk/424/testReport/org.apache.hadoop.hdfs/TestFileConcurrentReader/ > build 425 test passed > https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk/425/testReport/org.apache.hadoop.hdfs/TestFileConcurrentReader/ -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1907) BlockMissingException upon concurrent read and write: reader was doing file position read while writer is doing write without hflush
[ https://issues.apache.org/jira/browse/HDFS-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042433#comment-13042433 ] John George commented on HDFS-1907: --- Thanks Daryn. Attaching another patch taking Daryn's comments and also enabling position-based testing in TestWriteRead.java > BlockMissingException upon concurrent read and write: reader was doing file > position read while writer is doing write without hflush > > > Key: HDFS-1907 > URL: https://issues.apache.org/jira/browse/HDFS-1907 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs client >Affects Versions: 0.23.0 > Environment: Run on a real cluster. Using the latest 0.23 build. >Reporter: CW Chung >Assignee: John George > Attachments: HDFS-1907-2.patch, HDFS-1907.patch > > > BlockMissingException is thrown under this test scenario: > Two different processes doing concurrent file r/w: one read and the other > write on the same file > - writer keep doing file write > - reader doing position file read from beginning of the file to the visible > end of file, repeatedly > The reader is basically doing: > byteRead = in.read(currentPosition, buffer, 0, byteToReadThisRound); > where CurrentPostion=0, buffer is a byte array buffer, byteToReadThisRound = > 1024*1; > Usually it does not fail right away. I have to read, close file, re-open the > same file a few times to create the problem. I'll pose a test program to > repro this problem after I've cleaned up a bit my current test program. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1907) BlockMissingException upon concurrent read and write: reader was doing file position read while writer is doing write without hflush
[ https://issues.apache.org/jira/browse/HDFS-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John George updated HDFS-1907: -- Attachment: HDFS-1907-2.patch > BlockMissingException upon concurrent read and write: reader was doing file > position read while writer is doing write without hflush > > > Key: HDFS-1907 > URL: https://issues.apache.org/jira/browse/HDFS-1907 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs client >Affects Versions: 0.23.0 > Environment: Run on a real cluster. Using the latest 0.23 build. >Reporter: CW Chung >Assignee: John George > Attachments: HDFS-1907-2.patch, HDFS-1907.patch > > > BlockMissingException is thrown under this test scenario: > Two different processes doing concurrent file r/w: one read and the other > write on the same file > - writer keep doing file write > - reader doing position file read from beginning of the file to the visible > end of file, repeatedly > The reader is basically doing: > byteRead = in.read(currentPosition, buffer, 0, byteToReadThisRound); > where CurrentPostion=0, buffer is a byte array buffer, byteToReadThisRound = > 1024*1; > Usually it does not fail right away. I have to read, close file, re-open the > same file a few times to create the problem. I'll pose a test program to > repro this problem after I've cleaned up a bit my current test program. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions
[ https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042432#comment-13042432 ] Bharath Mundlapudi commented on HDFS-988: - I am just wondering if we are calling OS sync at all on this code path. All I see is a flush call, which flushes from EditLogOutputStream (Java buffers) to kernel buffers. Shouldn't we be doing the following? eStream.flush(); eStream.getFileOutputStream().getFD().sync(); This will make sure the edits are actually written to disk. Is there any reason for not doing this? > saveNamespace can corrupt edits log, apparently due to race conditions > -- > > Key: HDFS-988 > URL: https://issues.apache.org/jira/browse/HDFS-988 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.20-append, 0.21.0, 0.22.0 >Reporter: dhruba borthakur >Assignee: Eli Collins >Priority: Blocker > Fix For: 0.20-append, 0.22.0 > > Attachments: HDFS-988_fix_synchs.patch, hdfs-988-2.patch, > hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988.txt, saveNamespace.txt, > saveNamespace_20-append.patch > > > The administrator puts the namenode in safemode and then issues the > savenamespace command. This can corrupt the edits log. The problem is that > when the NN enters safemode, there could still be pending logSyncs occurring > from other threads. Now, the saveNamespace command, when executed, would save > an edits log with partial writes. I have seen this happen on 0.20. > https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
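The distinction raised in this exchange, flushing Java buffers into kernel buffers versus forcing the data onto disk, can be illustrated with plain java.io. This is a sketch of the general pattern under discussion, not EditLogFileOutputStream's actual flushAndSync() implementation; the class and method names below are assumptions.

```java
import java.io.FileOutputStream;
import java.io.IOException;

// Illustrative flush-then-force pattern: flush() alone only moves bytes from
// Java-level buffers into the kernel; forcing the channel (equivalent in
// effect to FileDescriptor.sync()) is what makes the write durable on disk.
public class DurableWriter {

    public static void writeDurably(FileOutputStream out, byte[] data) throws IOException {
        out.write(data);
        out.flush();                  // Java buffers -> kernel buffers
        out.getChannel().force(true); // kernel buffers -> disk, metadata included
    }
}
```

Per the follow-up comment on this issue, the edit log code already gets this effect because ELOS#flush calls ELFOS#flushAndSync, which forces the underlying file channel.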
[jira] [Commented] (HDFS-1968) Enhance TestWriteRead to support File Append and Position Read
[ https://issues.apache.org/jira/browse/HDFS-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042431#comment-13042431 ] John George commented on HDFS-1968: --- If you like you can do #2 as another JIRA. #1 should actually be part of the corresponding JIRAs that you filed and hence you can ignore that too. > Enhance TestWriteRead to support File Append and Position Read > --- > > Key: HDFS-1968 > URL: https://issues.apache.org/jira/browse/HDFS-1968 > Project: Hadoop HDFS > Issue Type: Test > Components: test >Affects Versions: 0.23.0 >Reporter: CW Chung >Assignee: CW Chung >Priority: Minor > Attachments: TestWriteRead.patch, TestWriteRead.patch, > TestWriteRead.patch > > > Desirable to enhance TestWriteRead to support command line options to do: > (1) File Append > (2) Position Read (currently supporting sequential read). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1998) make refresh-namodenodes.sh refreshing all namenodes
[ https://issues.apache.org/jira/browse/HDFS-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042409#comment-13042409 ] Suresh Srinivas commented on HDFS-1998: --- # Could you please add a unit test for the new method. # Why are you printing empty string as error in NNRpcAddressesCommandHandler? # Command description "name node" to "namenode" # In the script you are setting errorFlag before for loop. But you are not using that value and still enter for loop? > make refresh-namodenodes.sh refreshing all namenodes > > > Key: HDFS-1998 > URL: https://issues.apache.org/jira/browse/HDFS-1998 > Project: Hadoop HDFS > Issue Type: Bug > Components: scripts >Affects Versions: 0.23.0 >Reporter: Tanping Wang >Assignee: Tanping Wang >Priority: Minor > Fix For: 0.23.0 > > Attachments: HDFS-1998.patch > > > refresh-namenodes.sh is used to refresh name nodes in the cluster to check > for updates of include/exclude list. It is used when decommissioning or > adding a data node. Currently it only refreshes the name node who serves the > defaultFs, if there is defaultFs defined. Fix it by refreshing all the name > nodes in the cluster. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2021) TestWriteRead failed with inconsistent visible length of a file
[ https://issues.apache.org/jira/browse/HDFS-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042404#comment-13042404 ] Hadoop QA commented on HDFS-2021: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481127/HDFS-2021-2.patch against trunk revision 1130262. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestDFSUpgradeFromImage +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/675//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/675//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/675//console This message is automatically generated. > TestWriteRead failed with inconsistent visible length of a file > > > Key: HDFS-2021 > URL: https://issues.apache.org/jira/browse/HDFS-2021 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node > Environment: Linux RHEL5 >Reporter: CW Chung >Assignee: John George > Attachments: HDFS-2021-2.patch, HDFS-2021.patch > > > The junit test failed when iterates a number of times with larger chunk size > on Linux. Once a while, the visible number of bytes seen by a reader is > slightly less than what was supposed to be. 
> When run with the following parameter, it failed more often on Linux ( as > reported by John George) than my Mac: > private static final int WR_NTIMES = 300; > private static final int WR_CHUNK_SIZE = 1; > Adding more debugging output to the source, this is a sample of the output: > Caused by: java.io.IOException: readData mismatch in byte read: > expected=277 ; got 2765312 > at > org.apache.hadoop.hdfs.TestWriteRead.readData(TestWriteRead.java:141) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2011) Removal and restoration of storage directories on checkpointing failure doesn't work properly
[ https://issues.apache.org/jira/browse/HDFS-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042402#comment-13042402 ] Hadoop QA commented on HDFS-2011: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481129/HDFS-2011.3.patch against trunk revision 1130262. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 8 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/676//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/676//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/676//console This message is automatically generated. > Removal and restoration of storage directories on checkpointing failure > doesn't work properly > - > > Key: HDFS-2011 > URL: https://issues.apache.org/jira/browse/HDFS-2011 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.23.0 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Attachments: HDFS-2011.3.patch, HDFS-2011.patch, HDFS-2011.patch, > HDFS-2011.patch > > > Removal and restoration of storage directories on checkpointing failure > doesn't work properly. 
Sometimes it throws a NullPointerException and > sometimes it doesn't take off a failed storage directory -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2021) TestWriteRead failed with inconsistent visible length of a file
[ https://issues.apache.org/jira/browse/HDFS-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042384#comment-13042384 ] Daryn Sharp commented on HDFS-2021: --- +1 Looks good. Presumably increasing the number of writes and the chunk size is to more easily induce the problem. I hope it doesn't add much runtime to the test suite... > TestWriteRead failed with inconsistent visible length of a file > > > Key: HDFS-2021 > URL: https://issues.apache.org/jira/browse/HDFS-2021 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node > Environment: Linux RHEL5 >Reporter: CW Chung >Assignee: John George > Attachments: HDFS-2021-2.patch, HDFS-2021.patch > > > The junit test failed when iterates a number of times with larger chunk size > on Linux. Once a while, the visible number of bytes seen by a reader is > slightly less than what was supposed to be. > When run with the following parameter, it failed more often on Linux ( as > reported by John George) than my Mac: > private static final int WR_NTIMES = 300; > private static final int WR_CHUNK_SIZE = 1; > Adding more debugging output to the source, this is a sample of the output: > Caused by: java.io.IOException: readData mismatch in byte read: > expected=277 ; got 2765312 > at > org.apache.hadoop.hdfs.TestWriteRead.readData(TestWriteRead.java:141) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2021) TestWriteRead failed with inconsistent visible length of a file
[ https://issues.apache.org/jira/browse/HDFS-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-2021: - Component/s: data-node Priority: Major (was: Minor) Summary: TestWriteRead failed with inconsistent visible length of a file (was: HDFS Junit test TestWriteRead failed with inconsistent visible length of a file ) > TestWriteRead failed with inconsistent visible length of a file > > > Key: HDFS-2021 > URL: https://issues.apache.org/jira/browse/HDFS-2021 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node > Environment: Linux RHEL5 >Reporter: CW Chung >Assignee: John George > Attachments: HDFS-2021-2.patch, HDFS-2021.patch > > > The junit test failed when iterates a number of times with larger chunk size > on Linux. Once a while, the visible number of bytes seen by a reader is > slightly less than what was supposed to be. > When run with the following parameter, it failed more often on Linux ( as > reported by John George) than my Mac: > private static final int WR_NTIMES = 300; > private static final int WR_CHUNK_SIZE = 1; > Adding more debugging output to the source, this is a sample of the output: > Caused by: java.io.IOException: readData mismatch in byte read: > expected=277 ; got 2765312 > at > org.apache.hadoop.hdfs.TestWriteRead.readData(TestWriteRead.java:141) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2011) Removal and restoration of storage directories on checkpointing failure doesn't work properly
[ https://issues.apache.org/jira/browse/HDFS-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated HDFS-2011: --- Attachment: HDFS-2011.3.patch Updated patch. Fixed some things I looked over. > Removal and restoration of storage directories on checkpointing failure > doesn't work properly > - > > Key: HDFS-2011 > URL: https://issues.apache.org/jira/browse/HDFS-2011 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.23.0 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Attachments: HDFS-2011.3.patch, HDFS-2011.patch, HDFS-2011.patch, > HDFS-2011.patch > > > Removal and restoration of storage directories on checkpointing failure > doesn't work properly. Sometimes it throws a NullPointerException and > sometimes it doesn't take off a failed storage directory -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2011) Removal and restoration of storage directories on checkpointing failure doesn't work properly
[ https://issues.apache.org/jira/browse/HDFS-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042365#comment-13042365 ] Hadoop QA commented on HDFS-2011: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481119/HDFS-2011.patch against trunk revision 1129942. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 8 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/674//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/674//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/674//console This message is automatically generated. > Removal and restoration of storage directories on checkpointing failure > doesn't work properly > - > > Key: HDFS-2011 > URL: https://issues.apache.org/jira/browse/HDFS-2011 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.23.0 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Attachments: HDFS-2011.patch, HDFS-2011.patch, HDFS-2011.patch > > > Removal and restoration of storage directories on checkpointing failure > doesn't work properly. 
Sometimes it throws a NullPointerException and > sometimes it doesn't take off a failed storage directory -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1954) Improve corrupt files warning message
[ https://issues.apache.org/jira/browse/HDFS-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042364#comment-13042364 ] Konstantin Shvachko commented on HDFS-1954: --- Yes, that sounds good. > Improve corrupt files warning message > - > > Key: HDFS-1954 > URL: https://issues.apache.org/jira/browse/HDFS-1954 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: philo vivero >Assignee: Patrick Hunt > Fix For: 0.22.0 > > Attachments: HDFS-1954.patch, HDFS-1954.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > On NameNode web interface, you may get this warning: > WARNING : There are about 32 missing blocks. Please check the log or run > fsck. > If the cluster was started less than 14 days before, it would be great to > add: "Is dfs.data.dir defined?" > If at the point of that error message, that parameter could be checked, and > error made "OMG dfs.data.dir isn't defined!" that'd be even better. As is, > troubleshooting undefined parameters is a difficult proposition. > I suspect this is an easy fix. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2021) HDFS Junit test TestWriteRead failed with inconsistent visible length of a file
[ https://issues.apache.org/jira/browse/HDFS-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John George updated HDFS-2021: -- Attachment: HDFS-2021-2.patch attached a newer patch with the comment from Daryn and also modified TestWriteRead.java to add the unit test for this. > HDFS Junit test TestWriteRead failed with inconsistent visible length of a > file > > > Key: HDFS-2021 > URL: https://issues.apache.org/jira/browse/HDFS-2021 > Project: Hadoop HDFS > Issue Type: Bug > Environment: Linux RHEL5 >Reporter: CW Chung >Assignee: John George >Priority: Minor > Attachments: HDFS-2021-2.patch, HDFS-2021.patch > > > The junit test failed when iterates a number of times with larger chunk size > on Linux. Once a while, the visible number of bytes seen by a reader is > slightly less than what was supposed to be. > When run with the following parameter, it failed more often on Linux ( as > reported by John George) than my Mac: > private static final int WR_NTIMES = 300; > private static final int WR_CHUNK_SIZE = 1; > Adding more debugging output to the source, this is a sample of the output: > Caused by: java.io.IOException: readData mismatch in byte read: > expected=277 ; got 2765312 > at > org.apache.hadoop.hdfs.TestWriteRead.readData(TestWriteRead.java:141) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1934) Fix NullPointerException when certain File APIs return null
[ https://issues.apache.org/jira/browse/HDFS-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042361#comment-13042361 ] Matt Foley commented on HDFS-1934: -- The test failures are unrelated. +1. Committed to trunk. Thanks Bharath! And thanks to Jakob for reviewing. > Fix NullPointerException when certain File APIs return null > --- > > Key: HDFS-1934 > URL: https://issues.apache.org/jira/browse/HDFS-1934 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.23.0 >Reporter: Bharath Mundlapudi >Assignee: Bharath Mundlapudi > Fix For: 0.23.0 > > Attachments: HDFS-1934-1.patch, HDFS-1934-2.patch, HDFS-1934-3.patch, > HDFS-1934-4.patch, HDFS-1934-5.patch > > > While testing Disk Fail Inplace, We encountered the NPE from this part of the > code. > File[] files = dir.listFiles(); > for (File f : files) { > ... > } > This is kinda of an API issue. When a disk is bad (or name is not a > directory), this API (listFiles, list) return null rather than throwing an > exception. This 'for loop' throws a NPE exception. And same applies to > dir.list() API. > Fix all the places where null condition was not checked. > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
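The null-return behavior behind HDFS-1934 can be demonstrated in isolation: File.listFiles() signals a bad disk or a non-directory by returning null rather than throwing, so an unguarded for-each loop NPEs. The countEntries helper below is illustrative, not HDFS code:

```java
import java.io.File;
import java.io.IOException;

public class SafeList {
    // File.listFiles() returns null (rather than throwing) when the path
    // is not a directory or the underlying read fails, so callers must
    // check before iterating -- the fix applied across HDFS-1934.
    static int countEntries(File dir) throws IOException {
        File[] files = dir.listFiles();
        if (files == null) {
            throw new IOException("Cannot list " + dir
                    + ": not a directory or I/O error");
        }
        return files.length;
    }

    public static void main(String[] args) throws IOException {
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        System.out.println(countEntries(tmp) >= 0); // tmpdir is listable
        try {
            countEntries(new File(tmp, "no-such-dir-xyz"));
        } catch (IOException expected) {
            System.out.println("caught"); // null was converted to an exception
        }
    }
}
```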
[jira] [Commented] (HDFS-1986) Add an option for user to return http or https ports regardless of security is on/off in DFSUtil.getInfoServer()
[ https://issues.apache.org/jira/browse/HDFS-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042358#comment-13042358 ] Hadoop QA commented on HDFS-1986: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12480182/HDFS-1986.patch against trunk revision 1129942. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestDFSUpgradeFromImage +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/673//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/673//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/673//console This message is automatically generated. > Add an option for user to return http or https ports regardless of security > is on/off in DFSUtil.getInfoServer() > > > Key: HDFS-1986 > URL: https://issues.apache.org/jira/browse/HDFS-1986 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 0.23.0 >Reporter: Tanping Wang >Assignee: Tanping Wang >Priority: Minor > Fix For: 0.23.0 > > Attachments: HDFS-1986.patch > > > Currently DFSUtil.getInfoServer gets http port with security off and httpS > port with security on. 
However, we want to return http port regardless of > security on/off for Cluster UI to use. Add in a third Boolean parameter for > user to decide whether to check security or not. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2011) Removal and restoration of storage directories on checkpointing failure doesn't work properly
[ https://issues.apache.org/jira/browse/HDFS-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042356#comment-13042356 ] Hadoop QA commented on HDFS-2011: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481110/HDFS-2011.patch against trunk revision 1129942. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 8 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestDFSUpgradeFromImage org.apache.hadoop.hdfs.TestHFlush +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/672//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/672//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/672//console This message is automatically generated. > Removal and restoration of storage directories on checkpointing failure > doesn't work properly > - > > Key: HDFS-2011 > URL: https://issues.apache.org/jira/browse/HDFS-2011 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.23.0 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Attachments: HDFS-2011.patch, HDFS-2011.patch, HDFS-2011.patch > > > Removal and restoration of storage directories on checkpointing failure > doesn't work properly. 
Sometimes it throws a NullPointerException and > sometimes it doesn't take off a failed storage directory -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions
[ https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042351#comment-13042351 ] Eli Collins commented on HDFS-988: -- It looks like most of the unprotected* methods take the rwlock but don't need to, either because their caller takes the lock or because they are called from loading the edit log (which is why we originally had unprotected versions). Do people mind if I fix that up (remove the locking from these methods, make sure the unprotected versions are only called when loading the log) in this change, or would people prefer that done in a separate change? > saveNamespace can corrupt edits log, apparently due to race conditions > -- > > Key: HDFS-988 > URL: https://issues.apache.org/jira/browse/HDFS-988 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.20-append, 0.21.0, 0.22.0 >Reporter: dhruba borthakur >Assignee: Eli Collins >Priority: Blocker > Fix For: 0.20-append, 0.22.0 > > Attachments: HDFS-988_fix_synchs.patch, hdfs-988-2.patch, > hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988.txt, saveNamespace.txt, > saveNamespace_20-append.patch > > > The administrator puts the namenode in safemode and then issues the > savenamespace command. This can corrupt the edits log. The problem is that > when the NN enters safemode, there could still be pending logSyncs occurring > from other threads. Now, the saveNamespace command, when executed, would save > an edits log with partial writes. I have seen this happen on 0.20. > https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
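The convention Eli describes is the lock-in-caller pattern: a public method acquires the lock and delegates to an unprotected* helper that assumes exclusivity (e.g. single-threaded edit-log replay). The sketch below is generic, with hypothetical names, not actual FSNamesystem code:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class Namespace {
    private final ReentrantReadWriteLock rwlock = new ReentrantReadWriteLock();
    private int fileCount = 0;

    // Public entry point: acquires the write lock, then delegates.
    public void addFile() {
        rwlock.writeLock().lock();
        try {
            unprotectedAddFile();
        } finally {
            rwlock.writeLock().unlock();
        }
    }

    // Lock-free variant: callers must already hold the lock, or be in a
    // context (like replaying the edit log at startup) where no other
    // thread can mutate state. Taking the lock again here is the
    // redundancy the comment proposes removing.
    void unprotectedAddFile() {
        fileCount++;
    }

    public static void main(String[] args) {
        Namespace ns = new Namespace();
        ns.addFile();            // normal locked path
        ns.unprotectedAddFile(); // replay path: exclusivity assumed
        System.out.println(ns.fileCount); // 2
    }
}
```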
[jira] [Commented] (HDFS-2022) ant binary fails due to missing c++ lib dir
[ https://issues.apache.org/jira/browse/HDFS-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042344#comment-13042344 ] Owen O'Malley commented on HDFS-2022: - It sounds reasonable for the bin-package depend on the compile-c++-libhdfs. > ant binary fails due to missing c++ lib dir > --- > > Key: HDFS-2022 > URL: https://issues.apache.org/jira/browse/HDFS-2022 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 0.23.0 >Reporter: Eli Collins > Fix For: 0.23.0 > > > Post HDFS-1963 ant binary fails w/ the following. The bin-package is trying > to copy from the c++ lib dir which doesn't exist yet. The binary target > should check for the existence of this dir or would also be reasonable to > depend on the compile-c++-libhdfs (since this is the binary target). > {noformat} > /home/eli/src/hdfs4/build.xml:1115: > /home/eli/src/hdfs4/build/c++/Linux-amd64-64/lib not found. > {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-2022) ant binary fails due to missing c++ lib dir
ant binary fails due to missing c++ lib dir --- Key: HDFS-2022 URL: https://issues.apache.org/jira/browse/HDFS-2022 Project: Hadoop HDFS Issue Type: Bug Components: build Affects Versions: 0.23.0 Reporter: Eli Collins Fix For: 0.23.0 Post HDFS-1963 ant binary fails w/ the following. The bin-package is trying to copy from the c++ lib dir which doesn't exist yet. The binary target should check for the existence of this dir or would also be reasonable to depend on the compile-c++-libhdfs (since this is the binary target). {noformat} /home/eli/src/hdfs4/build.xml:1115: /home/eli/src/hdfs4/build/c++/Linux-amd64-64/lib not found. {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2011) Removal and restoration of storage directories on checkpointing failure doesn't work properly
[ https://issues.apache.org/jira/browse/HDFS-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated HDFS-2011: --- Attachment: HDFS-2011.patch Granting license to ASF. > Removal and restoration of storage directories on checkpointing failure > doesn't work properly > - > > Key: HDFS-2011 > URL: https://issues.apache.org/jira/browse/HDFS-2011 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.23.0 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Attachments: HDFS-2011.patch, HDFS-2011.patch, HDFS-2011.patch > > > Removal and restoration of storage directories on checkpointing failure > doesn't work properly. Sometimes it throws a NullPointerException and > sometimes it doesn't take off a failed storage directory -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1580) Add interface for generic Write Ahead Logging mechanisms
[ https://issues.apache.org/jira/browse/HDFS-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042313#comment-13042313 ] Ivan Kelly commented on HDFS-1580: -- @Jitendra (1) should work for checkpointing: if journal A has more edits than journal B counting the in_progress file, it will have more or an equal number not counting the in_progress file. More in the case that B has gaps (in which case it throws an exception), equal otherwise. So we finalise an inprogress file when we open for writing and spot one. I guess this should only happen on startup after a crash. The writer shouldn't finalise an inprogress file if something else is writing to it. We have nothing to prevent this now, but if it happens, your system is broken. Fencing could be implemented later to explicitly exclude this possibility. > Add interface for generic Write Ahead Logging mechanisms > > > Key: HDFS-1580 > URL: https://issues.apache.org/jira/browse/HDFS-1580 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ivan Kelly > Fix For: Edit log branch (HDFS-1073) > > Attachments: EditlogInterface.1.pdf, EditlogInterface.2.pdf, > HDFS-1580+1521.diff, HDFS-1580.diff, HDFS-1580.diff, HDFS-1580.diff, > generic_wal_iface.pdf, generic_wal_iface.pdf, generic_wal_iface.pdf, > generic_wal_iface.txt > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2021) HDFS Junit test TestWriteRead failed with inconsistent visible length of a file
[ https://issues.apache.org/jira/browse/HDFS-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042298#comment-13042298 ] Tsz Wo (Nicholas), SZE commented on HDFS-2021: -- Daryn, I agree that we need {{replyAck.isSuccess()}}. {quote} That said, I'm a bit confused about why a datanode updates its bytesAcked iff all downstreams are successful. ... bytesAcked is intended to track exactly how many bytes were written throughout the entire pipeline ... {quote} You are totally correct that it is the intention; see Section 3.3 in the [Append Design Doc|https://issues.apache.org/jira/secure/attachment/12445209/appendDesign3.pdf] in HDFS-265. > HDFS Junit test TestWriteRead failed with inconsistent visible length of a > file > > > Key: HDFS-2021 > URL: https://issues.apache.org/jira/browse/HDFS-2021 > Project: Hadoop HDFS > Issue Type: Bug > Environment: Linux RHEL5 >Reporter: CW Chung >Assignee: John George >Priority: Minor > Attachments: HDFS-2021.patch > > > The junit test failed when iterates a number of times with larger chunk size > on Linux. Once a while, the visible number of bytes seen by a reader is > slightly less than what was supposed to be. > When run with the following parameter, it failed more often on Linux ( as > reported by John George) than my Mac: > private static final int WR_NTIMES = 300; > private static final int WR_CHUNK_SIZE = 1; > Adding more debugging output to the source, this is a sample of the output: > Caused by: java.io.IOException: readData mismatch in byte read: > expected=277 ; got 2765312 > at > org.apache.hadoop.hdfs.TestWriteRead.readData(TestWriteRead.java:141) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
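The ack semantics Nicholas confirms can be sketched as follows: a datanode advances bytesAcked only when the reply ack reports success from every downstream node, so bytesAcked tracks bytes durable through the whole pipeline. The names mirror the discussion (replyAck.isSuccess(), bytesAcked) but the logic is illustrative, not actual DataNode code:

```java
public class AckSketch {
    static long bytesAcked = 0;

    // downstreamOk[i] is the ack status from the i-th downstream datanode.
    // Only when every replica acknowledged (the replyAck.isSuccess() check)
    // is it safe to say these bytes exist on all nodes in the pipeline.
    static void processAck(long ackedEndOffset, boolean[] downstreamOk) {
        boolean isSuccess = true;
        for (boolean ok : downstreamOk) {
            isSuccess &= ok;
        }
        if (isSuccess && ackedEndOffset > bytesAcked) {
            bytesAcked = ackedEndOffset; // all replicas hold these bytes
        }
    }

    public static void main(String[] args) {
        processAck(512, new boolean[]{true, true});   // full pipeline success
        processAck(1024, new boolean[]{true, false}); // one downstream failed
        System.out.println(bytesAcked); // 512: the failed ack did not advance it
    }
}
```

This is why a reader can briefly see a visible length slightly behind the writer's position: bytes written but not yet acknowledged by the full pipeline are not counted.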
[jira] [Commented] (HDFS-1986) Add an option for user to return http or https ports regardless of security is on/off in DFSUtil.getInfoServer()
[ https://issues.apache.org/jira/browse/HDFS-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042295#comment-13042295 ] Suresh Srinivas commented on HDFS-1986: --- Comments: # TestDFSUtil.java - {{InetSocketAddress is = new InetSocketAddress(1234);}} I am not clear how this maps to namenode address? # DFSUtil.java - {{checkSecurity}} could be named {{httpsAddress}}. @param checkSecurity needs to be reworded. The method returns an address and not port. @return needs to be reworded too. > Add an option for user to return http or https ports regardless of security > is on/off in DFSUtil.getInfoServer() > > > Key: HDFS-1986 > URL: https://issues.apache.org/jira/browse/HDFS-1986 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 0.23.0 >Reporter: Tanping Wang >Assignee: Tanping Wang >Priority: Minor > Fix For: 0.23.0 > > Attachments: HDFS-1986.patch > > > Currently DFSUtil.getInfoServer gets http port with security off and httpS > port with security on. However, we want to return http port regardless of > security on/off for Cluster UI to use. Add in a third Boolean parameter for > user to decide whether to check security or not. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
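The three-parameter selection under review in HDFS-1986 amounts to a small decision table. The sketch below is hypothetical (the real DFSUtil.getInfoServer resolves configured addresses, and the parameter naming is exactly what Suresh's comment is debating); it only shows the control flow the patch description implies:

```java
public class InfoServerSelect {
    // Hypothetical sketch: a third boolean lets callers force the http
    // address even when security is enabled (the Cluster UI case from the
    // issue description). Not the actual DFSUtil signature.
    static String getInfoServer(boolean securityOn, boolean checkSecurity,
                                String httpAddr, String httpsAddr) {
        // Only consult the security setting when the caller asks for it;
        // otherwise always return the http address.
        if (checkSecurity && securityOn) {
            return httpsAddr;
        }
        return httpAddr;
    }

    public static void main(String[] args) {
        // Security on, caller respects it -> https address.
        System.out.println(getInfoServer(true, true, "nn:50070", "nn:50470"));
        // Security on, caller opts out -> http address regardless.
        System.out.println(getInfoServer(true, false, "nn:50070", "nn:50470"));
    }
}
```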
[jira] [Commented] (HDFS-2013) Recurring failure of TestMissingBlocksAlert on branch-0.22
[ https://issues.apache.org/jira/browse/HDFS-2013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042290#comment-13042290 ] Suresh Srinivas commented on HDFS-2013: --- This could be related to HDFS-1954 change? > Recurring failure of TestMissingBlocksAlert on branch-0.22 > -- > > Key: HDFS-2013 > URL: https://issues.apache.org/jira/browse/HDFS-2013 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node, test >Affects Versions: 0.22.0 >Reporter: Aaron T. Myers > Fix For: 0.22.0 > > > This has been failing on Hudson for the last two builds and fails on my local > box as well. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1986) Add an option for user to return http or https ports regardless of security is on/off in DFSUtil.getInfoServer()
[ https://issues.apache.org/jira/browse/HDFS-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-1986: -- Status: Patch Available (was: Open) > Add an option for user to return http or https ports regardless of security > is on/off in DFSUtil.getInfoServer() > > > Key: HDFS-1986 > URL: https://issues.apache.org/jira/browse/HDFS-1986 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 0.23.0 >Reporter: Tanping Wang >Assignee: Tanping Wang >Priority: Minor > Fix For: 0.23.0 > > Attachments: HDFS-1986.patch > > > Currently DFSUtil.getInfoServer gets http port with security off and httpS > port with security on. However, we want to return http port regardless of > security on/off for Cluster UI to use. Add in a third Boolean parameter for > user to decide whether to check security or not. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2017) A partial rollback cause the new changes done after upgrade to be visible after rollback
[ https://issues.apache.org/jira/browse/HDFS-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042289#comment-13042289 ]

Suresh Srinivas commented on HDFS-2017:
---

I am not clear about the problem you are describing.

> 2) Namenode starts and new files written ..

Do you mean that the upgrade failed but the Namenode started functioning?

> But if a ROLLBACK is done , the 1st dir will be rolled back (the older copy
> becomes current and its checkpointtime is now LESS than other dirs ..) and
> others left behind since they dont contain previous

How is this possible? The directory that is rolled back will be consistent with the directories that were not upgraded previously and hence are not rolled back.

> New changes lost after rollback

During rollback new changes are indeed lost.

> A partial rollback cause the new changes done after upgrade to be visible
> after rollback
>
> Key: HDFS-2017
> URL: https://issues.apache.org/jira/browse/HDFS-2017
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 0.20.1
> Reporter: HariSree
> Priority: Minor
> Labels: rollback, upgrade
>
> This is the scenario :
> Namenode has 3 name dirs configured ..
> 1) Namenode upgrade starts - Upgrade fails after 1st directory is upgraded
> (2nd and 3rd dir is left unchanged ..) { like , Namenode process down }
> 2) Namenode starts and new files written ..
> 3) Namenode shutdown and rollbacked
> Since Namenode is saving the latest image dir (the upgraded 1st dir, since
> checkpointtime is incremented during upgrade for this dir) will be loaded and
> saved to all dirs during loadfsimage ..
> But if a ROLLBACK is done , the 1st dir will be rolled back (the older copy
> becomes current and its checkpointtime is now LESS than other dirs ..) and
> others left behind since they dont contain previous ..
> Now during loadfsimage, the 2nd dir will be selected since it has the highest
> checkpoint time and saved to all dirs (including 1st) .. Now due to this, the
> new changes b/w UPGRADE and ROLLBACK present in 2nd dir gets reflected even
> after ROLLBACK ..
>
> This is not the case with a SUCCESSFUL Upgrade/Rollback (New changes lost
> after rollback) ..

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
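The failure mode reported here hinges on the load-time rule of picking the storage directory with the highest checkpoint time. A minimal model of that rule, with illustrative names (not the actual FSImage code), shows why a directory that was never rolled back can win after a partial rollback:

```java
import java.util.Arrays;
import java.util.List;

// Illustrative model of the selection rule discussed in HDFS-2017: the
// directory with the highest checkpoint time is loaded and then saved to all
// directories. After a partial rollback, the rolled-back dir's checkpoint
// time is lower, so another dir (carrying post-upgrade changes) wins.
public class StorageDirSelect {
    public static final class Dir {
        public final String name;
        public final long checkpointTime;
        public Dir(String name, long checkpointTime) {
            this.name = name;
            this.checkpointTime = checkpointTime;
        }
    }

    public static String latest(List<Dir> dirs) {
        Dir best = dirs.get(0);
        for (Dir d : dirs) {
            if (d.checkpointTime > best.checkpointTime) best = d;
        }
        return best.name;
    }

    public static void main(String[] args) {
        // dir1 was rolled back (old checkpoint time); dir2/dir3 were left
        // behind by the failed upgrade and later writes bumped their times.
        List<Dir> dirs = Arrays.asList(
            new Dir("dir1-rolledback", 10),
            new Dir("dir2", 20),
            new Dir("dir3", 20));
        System.out.println(latest(dirs)); // a non-rolled-back dir is chosen
    }
}
```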
[jira] [Commented] (HDFS-2011) Removal and restoration of storage directories on checkpointing failure doesn't work properly
[ https://issues.apache.org/jira/browse/HDFS-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042285#comment-13042285 ]

Ravi Prakash commented on HDFS-2011:

I ran test-patch. I also ran the ant tests, and no new test failures were introduced. Could someone please review and commit the patch?

> Removal and restoration of storage directories on checkpointing failure
> doesn't work properly
>
> Key: HDFS-2011
> URL: https://issues.apache.org/jira/browse/HDFS-2011
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 0.23.0
> Reporter: Ravi Prakash
> Assignee: Ravi Prakash
> Attachments: HDFS-2011.patch, HDFS-2011.patch
>
> Removal and restoration of storage directories on checkpointing failure
> doesn't work properly. Sometimes it throws a NullPointerException and
> sometimes it doesn't take off a failed storage directory

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2011) Removal and restoration of storage directories on checkpointing failure doesn't work properly
[ https://issues.apache.org/jira/browse/HDFS-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated HDFS-2011: --- Attachment: HDFS-2011.patch HDFS-2011.patch > Removal and restoration of storage directories on checkpointing failure > doesn't work properly > - > > Key: HDFS-2011 > URL: https://issues.apache.org/jira/browse/HDFS-2011 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.23.0 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Attachments: HDFS-2011.patch, HDFS-2011.patch > > > Removal and restoration of storage directories on checkpointing failure > doesn't work properly. Sometimes it throws a NullPointerException and > sometimes it doesn't take off a failed storage directory -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1936) Updating the layout version from HDFS-1822 causes upgrade problems.
[ https://issues.apache.org/jira/browse/HDFS-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042272#comment-13042272 ]

Suresh Srinivas commented on HDFS-1936:
---

I ran the unit tests on the 0.22 patch. The following tests fail: TestHDFSTrash and TestMissingBlocksAlert. The first is a known failure; the second could be caused by HDFS-1954. Todd, can you please review the 0.22 patch?

> Updating the layout version from HDFS-1822 causes upgrade problems.
>
> Key: HDFS-1936
> URL: https://issues.apache.org/jira/browse/HDFS-1936
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 0.22.0, 0.23.0
> Reporter: Suresh Srinivas
> Assignee: Suresh Srinivas
> Priority: Blocker
> Fix For: 0.22.0, 0.23.0
>
> Attachments: HDFS-1936.3.patch, HDFS-1936.4.patch, HDFS-1936.6.patch,
> HDFS-1936.6.patch, HDFS-1936.7.patch, HDFS-1936.8.patch, HDFS-1936.9.patch,
> HDFS-1936.rel22.patch, HDFS-1936.trunk.patch, hadoop-22-dfs-dir.tgz,
> hdfs-1936-with-testcase.txt
>
> In HDFS-1822 and HDFS-1842, the layout versions for 203, 204, 22 and trunk
> were changed. Some of the namenode logic that depends on layout version is
> broken because of this. Read the comment for more description.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1968) Enhance TestWriteRead to support File Append and Position Read
[ https://issues.apache.org/jira/browse/HDFS-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042246#comment-13042246 ]

John George commented on HDFS-1968:
---

CW, ignore my comment #3. It will be better if those changes are made in the corresponding JIRAs themselves, to avoid any dependency between these JIRAs.

> Enhance TestWriteRead to support File Append and Position Read
>
> Key: HDFS-1968
> URL: https://issues.apache.org/jira/browse/HDFS-1968
> Project: Hadoop HDFS
> Issue Type: Test
> Components: test
> Affects Versions: 0.23.0
> Reporter: CW Chung
> Assignee: CW Chung
> Priority: Minor
> Attachments: TestWriteRead.patch, TestWriteRead.patch,
> TestWriteRead.patch
>
> Desirable to enhance TestWriteRead to support command line options to do:
> (1) File Append
> (2) Position Read (currently supporting sequential read).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1907) BlockMissingException upon concurrent read and write: reader was doing file position read while writer is doing write without hflush
[ https://issues.apache.org/jira/browse/HDFS-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042220#comment-13042220 ]

Daryn Sharp commented on HDFS-1907:
---

I'd suggest a temporary boolean for whether the read went past the end of the finalized block. You might even consider simplifying all the logic to:

{code}
-final List<LocatedBlock> blocks;
-if (locatedBlocks.isLastBlockComplete()) {
-  blocks = getFinalizedBlockRange(offset, length);
-}
-else {
-  if (length + offset > locatedBlocks.getFileLength()) {
-    length = locatedBlocks.getFileLength() - offset;
-  }
-  blocks = getFinalizedBlockRange(offset, length);
+boolean readPastEnd = (offset + length > locatedBlocks.getFileLength());
+if (readPastEnd) length = locatedBlocks.getFileLength() - offset;
+
+final List<LocatedBlock> blocks = getFinalizedBlockRange(offset, length);
+if (readPastEnd && !locatedBlocks.isLastBlockComplete()) {
   blocks.add(locatedBlocks.getLastLocatedBlock());
 }
{code}

> BlockMissingException upon concurrent read and write: reader was doing file
> position read while writer is doing write without hflush
>
> Key: HDFS-1907
> URL: https://issues.apache.org/jira/browse/HDFS-1907
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs client
> Affects Versions: 0.23.0
> Environment: Run on a real cluster. Using the latest 0.23 build.
> Reporter: CW Chung
> Assignee: John George
> Attachments: HDFS-1907.patch
>
> BlockMissingException is thrown under this test scenario:
> Two different processes doing concurrent file r/w: one read and the other
> write on the same file
> - writer keep doing file write
> - reader doing position file read from beginning of the file to the visible
> end of file, repeatedly
> The reader is basically doing:
> byteRead = in.read(currentPosition, buffer, 0, byteToReadThisRound);
> where CurrentPostion=0, buffer is a byte array buffer, byteToReadThisRound =
> 1024*1;
> Usually it does not fail right away. I have to read, close file, re-open the
> same file a few times to create the problem. I'll post a test program to
> repro this problem after I've cleaned up a bit my current test program.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
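The simplification Daryn suggests can be restated as a self-contained sketch. Plain longs and strings stand in for the real LocatedBlock types, and the class and method names are mine, not DFSClient code:

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained restatement of the review suggestion: clamp a read that
// runs past the visible file length, fetch the finalized range, and append
// the last (incomplete) block only when the read went past the end and the
// last block is not complete. Strings stand in for LocatedBlock here.
public class BlockRangeSketch {
    public static List<String> blockRange(long offset, long length,
                                          long fileLength,
                                          boolean lastBlockComplete) {
        boolean readPastEnd = offset + length > fileLength;
        if (readPastEnd) length = fileLength - offset;

        // Stand-in for getFinalizedBlockRange(offset, length).
        List<String> blocks = new ArrayList<>();
        blocks.add("finalized[" + offset + "," + (offset + length) + ")");

        if (readPastEnd && !lastBlockComplete) {
            blocks.add("lastUnderConstruction");
        }
        return blocks;
    }
}
```

Note how the temporary boolean collapses the two branches of the original if/else into one call to the finalized-range helper, with the under-construction block appended only in the past-end, incomplete-last-block case.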
[jira] [Commented] (HDFS-2021) HDFS Junit test TestWriteRead failed with inconsistent visible length of a file
[ https://issues.apache.org/jira/browse/HDFS-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042208#comment-13042208 ]

Daryn Sharp commented on HDFS-2021:
---

I noticed that you omitted the conditional {{replyAck.isSuccess()}} when you moved the code block that updates {{bytesAcked}}. The {{isSuccess()}} isn't tied to whether the ack was successfully sent upstream, but rather to whether the downstreams were all successful, so it seems the conditional should be reinserted to preserve the current behavior. Changing the overall logic seems fraught with peril...

That said, I'm a bit confused about why a datanode updates its {{bytesAcked}} iff all downstreams are successful. The datanode received and wrote those bytes, so it seems the conditional isn't needed in either case. Unless... {{bytesAcked}} is intended to track exactly how many bytes were written throughout the entire pipeline. I'd think that a pipeline should write as much as it can even if downstreams are lost, then backfill the under-replicated blocks. To satisfy curiosity, perhaps someone with more knowledge of the code will comment.

> HDFS Junit test TestWriteRead failed with inconsistent visible length of a
> file
>
> Key: HDFS-2021
> URL: https://issues.apache.org/jira/browse/HDFS-2021
> Project: Hadoop HDFS
> Issue Type: Bug
> Environment: Linux RHEL5
> Reporter: CW Chung
> Assignee: John George
> Priority: Minor
> Attachments: HDFS-2021.patch
>
> The junit test fails when it iterates a number of times with a larger chunk
> size on Linux. Once in a while, the visible number of bytes seen by a reader
> is slightly less than what it was supposed to be.
> When run with the following parameters, it failed more often on Linux (as
> reported by John George) than on my Mac:
> private static final int WR_NTIMES = 300;
> private static final int WR_CHUNK_SIZE = 1;
> Adding more debugging output to the source, this is a sample of the output:
> Caused by: java.io.IOException: readData mismatch in byte read:
> expected=277 ; got 2765312
> at
> org.apache.hadoop.hdfs.TestWriteRead.readData(TestWriteRead.java:141)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
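The behavior Daryn describes, advancing {{bytesAcked}} only when the pipeline ack reports success, can be sketched with hypothetical names; this is an illustrative model, not the actual BlockReceiver/PipelineAck code:

```java
// Illustrative model of the conditional discussed above: the acked watermark
// moves forward only when all downstream datanodes reported success for the
// ack. AckTracker/onAck are hypothetical names for this sketch.
public class AckTracker {
    private long bytesAcked;

    public void onAck(boolean downstreamsSucceeded, long offsetInBlock) {
        // Preserving the existing behavior: an ack from a pipeline with a
        // failed downstream does not advance bytesAcked.
        if (downstreamsSucceeded && offsetInBlock > bytesAcked) {
            bytesAcked = offsetInBlock;
        }
    }

    public long getBytesAcked() { return bytesAcked; }
}
```

Under this reading, dropping the success check would let a reader's visible length run ahead of what the whole pipeline has durably written, which is consistent with the inconsistent-visible-length symptom this JIRA reports.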
[jira] [Commented] (HDFS-1907) BlockMissingException upon concurrent read and write: reader was doing file position read while writer is doing write without hflush
[ https://issues.apache.org/jira/browse/HDFS-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042180#comment-13042180 ]

John George commented on HDFS-1907:
---

The test in HDFS-1968 is the one that caused this bug to show up, so I am hoping to use it as the unit test; hence no additional tests were added. I don't think the TestDFSUpgradeFromImage failure was caused by this patch, since it fails in build #669 as well.

> BlockMissingException upon concurrent read and write: reader was doing file
> position read while writer is doing write without hflush
>
> Key: HDFS-1907
> URL: https://issues.apache.org/jira/browse/HDFS-1907
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs client
> Affects Versions: 0.23.0
> Environment: Run on a real cluster. Using the latest 0.23 build.
> Reporter: CW Chung
> Assignee: John George
> Attachments: HDFS-1907.patch
>
> BlockMissingException is thrown under this test scenario:
> Two different processes doing concurrent file r/w: one read and the other
> write on the same file
> - writer keep doing file write
> - reader doing position file read from beginning of the file to the visible
> end of file, repeatedly
> The reader is basically doing:
> byteRead = in.read(currentPosition, buffer, 0, byteToReadThisRound);
> where CurrentPostion=0, buffer is a byte array buffer, byteToReadThisRound =
> 1024*1;
> Usually it does not fail right away. I have to read, close file, re-open the
> same file a few times to create the problem. I'll post a test program to
> repro this problem after I've cleaned up a bit my current test program.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira