[jira] [Commented] (HDFS-5919) FileJournalManager doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071400#comment-14071400 ] Vinayakumar B commented on HDFS-5919: - Hi [~umamaheswararao], Can you please take a look at the patch? Thanks FileJournalManager doesn't purge empty and corrupt inprogress edits files - Key: HDFS-5919 URL: https://issues.apache.org/jira/browse/HDFS-5919 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-5919.patch, HDFS-5919.patch FileJournalManager doesn't purge empty and corrupt inprogress edit files. These stale files accumulate over time and should be cleared along with the purging of other edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
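The purge described above can be sketched as a small utility. This is a hypothetical illustration of the idea only, not FileJournalManager code; the class name, the `edits_inprogress_` prefix convention, and the "empty file is safe to drop" rule are assumptions for this sketch.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the HDFS-5919 idea: delete zero-length
// "edits_inprogress_*" files left behind in a journal directory.
public class StaleEditsPurger {
  public static List<File> purgeEmptyInProgress(File journalDir) {
    List<File> purged = new ArrayList<>();
    File[] files = journalDir.listFiles();
    if (files == null) {
      return purged;
    }
    for (File f : files) {
      // An empty in-progress segment holds no transactions, so in this
      // sketch we treat it as safe to remove.
      if (f.getName().startsWith("edits_inprogress_") && f.length() == 0) {
        if (f.delete()) {
          purged.add(f);
        }
      }
    }
    return purged;
  }
}
```

In the real fix this would run alongside the existing purging of finalized edit logs, so stale segments cannot accumulate indefinitely.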
[jira] [Updated] (HDFS-6719) org.apache.hadoop.fs.TestDFVariations.testMount failed intermittently because specified path doesn't exist
[ https://issues.apache.org/jira/browse/HDFS-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-6719: Attachment: HDFS-6719.005.patch org.apache.hadoop.fs.TestDFVariations.testMount failed intermittently because specified path doesn't exist -- Key: HDFS-6719 URL: https://issues.apache.org/jira/browse/HDFS-6719 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6719.001.dbg.patch, HDFS-6719.002.dbg.patch, HDFS-6719.003.dbg.patch, HDFS-6719.004.dbg.patch, HDFS-6719.005.patch Failure message:
{code}
Error Message
Specified path /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/hadoop-common-project/hadoop-common/target/test/data does not exist
Stacktrace
java.io.FileNotFoundException: Specified path /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/hadoop-common-project/hadoop-common/target/test/data does not exist
 at org.apache.hadoop.fs.DF.getMount(DF.java:109)
 at org.apache.hadoop.fs.TestDFVariations.testMount(TestDFVariations.java:54)
Standard Output
java.io.IOException: Fewer lines of output than expected: Filesystem 1K-blocks Used Available Use% Mounted on
java.io.IOException: Unexpected empty line
java.io.IOException: Could not parse line: 19222656 10597036 7649060 59% /
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6719) org.apache.hadoop.fs.TestDFVariations.testMount failed intermittently because specified path doesn't exist
[ https://issues.apache.org/jira/browse/HDFS-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071405#comment-14071405 ] Yongjun Zhang commented on HDFS-6719: - I think I found the root cause of this intermittent failure: the test root dir, set with the system property test.build.data, is not created by this test, as it is supposed to be. The test often succeeds because, when all tests are run together, the test root dir is actually created by another test. It fails sometimes because the test root dir can be removed by other tests (depending on the order in which the tests are run). So the solution is for this test to create the test root itself. Posted patch 004. org.apache.hadoop.fs.TestDFVariations.testMount failed intermittently because specified path doesn't exist -- Key: HDFS-6719 URL: https://issues.apache.org/jira/browse/HDFS-6719 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6719.001.dbg.patch, HDFS-6719.002.dbg.patch, HDFS-6719.003.dbg.patch, HDFS-6719.004.dbg.patch, HDFS-6719.005.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
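The fix idea in the comment above, ensuring the test root exists before the test touches it instead of relying on other tests to have created it, can be sketched as follows. The class name and default path are hypothetical; only the `test.build.data` property name comes from the source.

```java
import java.io.File;

// Minimal sketch of the HDFS-6719 fix idea: create the test root dir
// (system property "test.build.data") up front so DF.getMount() never
// sees a missing path. TestRootSetup is an illustrative name.
public class TestRootSetup {
  public static File ensureTestRoot() {
    String root = System.getProperty("test.build.data", "/tmp/test-data");
    File dir = new File(root);
    // mkdirs() is a no-op if the directory already exists, so this is
    // safe regardless of test ordering.
    dir.mkdirs();
    return dir;
  }
}
```

Calling something like this in the test's setup removes the dependency on the order in which the test suite runs.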
[jira] [Commented] (HDFS-6719) org.apache.hadoop.fs.TestDFVariations.testMount failed intermittently because specified path doesn't exist
[ https://issues.apache.org/jira/browse/HDFS-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071408#comment-14071408 ] Yongjun Zhang commented on HDFS-6719: - BTW, this should be moved to a Hadoop Common JIRA, and thanks for reviewing. org.apache.hadoop.fs.TestDFVariations.testMount failed intermittently because specified path doesn't exist -- Key: HDFS-6719 URL: https://issues.apache.org/jira/browse/HDFS-6719 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6719.001.dbg.patch, HDFS-6719.002.dbg.patch, HDFS-6719.003.dbg.patch, HDFS-6719.004.dbg.patch, HDFS-6719.005.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071409#comment-14071409 ] Liang Xie commented on HDFS-6735: - bq. We'd not check in the test since it does not assert anything? We'd just check it in as a utility testing concurrent pread throughput? In the last part of the testing code snippet there are assertions, see:
{code}
assertTrue(readLatency.readMs > readLatency.preadMs);
// because we issued a pread already, the second one should not hit
// disk; even considering running on a slow VM, 1 second should be fine
assertTrue(readLatency.preadMs < 1000);
{code}
Per assertTrue(readLatency.preadMs < 1000); we can tell whether the pread() is blocked by read() or not :) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream - Key: HDFS-6735 URL: https://issues.apache.org/jira/browse/HDFS-6735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6735.txt In the current DFSInputStream impl, there are a couple of coarse-grained locks in the read/pread path, and it has become an HBase read latency pain point. In HDFS-6698, I made a minor patch against the first encountered lock, around getFileLength; indeed, after reading code and testing, there are still other locks we could improve. In this jira, I'll make a patch against the other locks, and a simple test case to show the issue and the improved result. This is important for the HBase application, since in the current HFile read path, we issue all read()/pread() requests on the same DFSInputStream for one HFile. (A multi-stream solution is another story I plan to do, but it will probably take more time than I expected.) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6737) DFSClient should use IV generated based on the configured CipherSuite with codecs used
Uma Maheswara Rao G created HDFS-6737: - Summary: DFSClient should use IV generated based on the configured CipherSuite with codecs used Key: HDFS-6737 URL: https://issues.apache.org/jira/browse/HDFS-6737 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Seems like we are using the IV as the encrypted data encryption key IV. But the underlying codec's cipher suite may expect a different IV length. So, we should generate the IV from the codec's configured cipher suite.
{code}
final CryptoInputStream cryptoIn = new CryptoInputStream(dfsis,
    CryptoCodec.getInstance(conf, feInfo.getCipherSuite()),
    feInfo.getEncryptedDataEncryptionKey(), feInfo.getIV());
{code}
So, instead of using feInfo.getIV(), we should generate it like
{code}
byte[] iv = new byte[codec.getCipherSuite().getAlgorithmBlockSize()];
codec.generateSecureRandom(iv);
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
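The snippet above sizes the IV to the cipher suite's algorithm block size rather than reusing an IV of a possibly different length. A self-contained stand-in for that idea, using plain JDK `SecureRandom` in place of Hadoop's `codec.generateSecureRandom()`, might look like this; the class name and the `AES_BLOCK_SIZE` constant are assumptions for the sketch (16 bytes is the AES block size used by AES/CTR suites).

```java
import java.security.SecureRandom;

// Illustrative stand-in for the HDFS-6737 snippet: allocate an IV of
// exactly the cipher suite's block size and fill it from a secure RNG.
public class IvGenerator {
  static final int AES_BLOCK_SIZE = 16; // AES/CTR block size, in bytes

  public static byte[] generateIv(int blockSize) {
    byte[] iv = new byte[blockSize];
    new SecureRandom().nextBytes(iv);
    return iv;
  }
}
```

The point of the JIRA is the sizing, not the RNG: the IV length must come from the configured cipher suite, not from whatever length `feInfo.getIV()` happens to carry.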
[jira] [Created] (HDFS-6738) Remove unnecessary getEncryptionZoneForPath call in EZManager#createEncryptionZone
Uma Maheswara Rao G created HDFS-6738: - Summary: Remove unnecessary getEncryptionZoneForPath call in EZManager#createEncryptionZone Key: HDFS-6738 URL: https://issues.apache.org/jira/browse/HDFS-6738 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G We can remove the getEncryptionZoneForPath call below.
{code}
// done this way to handle edit log loading
dir.unprotectedSetXAttrs(src, xattrs, EnumSet.of(XAttrSetFlag.CREATE));
ezi = getEncryptionZoneForPath(srcIIP);
return ezXAttr;
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6738) Remove unnecessary getEncryptionZoneForPath call in EZManager#createEncryptionZone
[ https://issues.apache.org/jira/browse/HDFS-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-6738: -- Priority: Minor (was: Major) Remove unnecessary getEncryptionZoneForPath call in EZManager#createEncryptionZone -- Key: HDFS-6738 URL: https://issues.apache.org/jira/browse/HDFS-6738 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Priority: Minor We can remove the getEncryptionZoneForPath call below.
{code}
// done this way to handle edit log loading
dir.unprotectedSetXAttrs(src, xattrs, EnumSet.of(XAttrSetFlag.CREATE));
ezi = getEncryptionZoneForPath(srcIIP);
return ezXAttr;
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6686) Archival Storage: Use fallback storage types
[ https://issues.apache.org/jira/browse/HDFS-6686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071430#comment-14071430 ] Tsz Wo Nicholas Sze commented on HDFS-6686: --- Vinay, # Good catch. We should pass excludedNodes. # For getAdditionalDatanode(), since the write has already started, there is already some data in the block, so using the replication fallback is correct. The creation fallback usually is a subset of the replication fallback since, if there are not enough storages available, we may fail creation. However, if we fail replication, it may result in data loss. Arpit, # Good catch. We need to pass newBlock as a parameter. # numOfResults indeed is the number of existing replicas (numOfResults is used to determine whether local host/local rack/remote rack should be chosen from). numOfReplicas (call it n) is the number of replicas to be chosen. In our case, we try to select n storage types but we may only be able to get m < n, where m = storageTypes.size(). So we should update numOfReplicas. Thanks both of you for the careful reviews! Archival Storage: Use fallback storage types Key: HDFS-6686 URL: https://issues.apache.org/jira/browse/HDFS-6686 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h6686_20140721.patch, h6686_20140721c.patch HDFS-6671 changes replication monitor to use block storage policy for replication. It should also use the fallback storage types when a particular type of storage is full. -- This message was sent by Atlassian JIRA (v6.2#6252)
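The fallback behavior discussed above, trying the preferred storage type and dropping to a fallback type when the preferred one is full, possibly ending with m < n choices, can be sketched as follows. This is a hypothetical illustration, not BlockPlacementPolicy code; the enum, class, and method names are invented for the sketch.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;

// Hypothetical sketch of fallback storage-type selection: for each of
// the n requested replicas, use the preferred type if it has space,
// else the fallback type; if neither has space we select fewer than n
// (m < n), mirroring the numOfReplicas adjustment described above.
public class StorageTypeChooser {
  enum StorageType { SSD, DISK, ARCHIVE }

  public static List<StorageType> choose(int n, StorageType preferred,
      StorageType fallback, EnumSet<StorageType> typesWithSpace) {
    List<StorageType> chosen = new ArrayList<>();
    for (int i = 0; i < n; i++) {
      if (typesWithSpace.contains(preferred)) {
        chosen.add(preferred);
      } else if (typesWithSpace.contains(fallback)) {
        chosen.add(fallback);
      }
      // else: neither type has space, so this replica is skipped.
    }
    return chosen;
  }
}
```

The caller would then compare the result size m against n and update its remaining replica count accordingly.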
[jira] [Updated] (HDFS-6686) Archival Storage: Use fallback storage types
[ https://issues.apache.org/jira/browse/HDFS-6686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6686: -- Attachment: h6686_20140723.patch h6686_20140723.patch: incorporated the comments from Vinay and Arpit. Archival Storage: Use fallback storage types Key: HDFS-6686 URL: https://issues.apache.org/jira/browse/HDFS-6686 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h6686_20140721.patch, h6686_20140721c.patch, h6686_20140723.patch HDFS-6671 changes replication monitor to use block storage policy for replication. It should also use the fallback storage types when a particular type of storage is full. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071435#comment-14071435 ] Hadoop QA commented on HDFS-6735: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657271/HDFS-6735.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDatanodeConfig org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.TestPread org.apache.hadoop.hdfs.TestDataTransferKeepalive {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7431//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7431//console This message is automatically generated. 
A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream - Key: HDFS-6735 URL: https://issues.apache.org/jira/browse/HDFS-6735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6735.txt -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071440#comment-14071440 ] Liang Xie commented on HDFS-6735: - bq. I'd think you'd want a comment at least in LocatedBlocks#underConstruction warning an upper layer is dependent on it being final in case LocatedBlocks changes and starts to allow blocks complete under a stream. done. bq. locatedBlocks.insertRange(targetBlockIdx, newBlocks.getLocatedBlocks()); ... be inside a synchronization too? Could two threads be updating block locations at same time? Yes, it's possible, but we could not put a synchronization there. It's different from
{code}
synchronized (this) {
  pos = offset;
  blockEnd = blk.getStartOffset() + blk.getBlockSize() - 1;
  currentLocatedBlock = blk;
}
{code}
because in the pread scenario updatePosition is false, so execution will never enter that synchronized block. And if we put a synchronization there, a pread reaching it would still be blocked by another monitor holder, e.g. read() :) But we can have a synchronized block or rwLock in the LocatedBlocks class; let me try. A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream - Key: HDFS-6735 URL: https://issues.apache.org/jira/browse/HDFS-6735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6735-v2.txt, HDFS-6735.txt -- This message was sent by Atlassian JIRA (v6.2#6252)
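The contention being removed here can be demonstrated in isolation: a `synchronized` read() holds the stream's monitor, so a pread() that avoids the monitor can complete while read() is still running. The following is a minimal, hypothetical model (not DFSInputStream code) using latches to make the interleaving deterministic.

```java
import java.util.concurrent.CountDownLatch;

// Minimal model of the HDFS-6735 idea: slowRead() holds the object
// monitor; lockFreePread() takes no monitor, so it finishes while the
// reader thread is still inside slowRead().
public class LockDemo {
  private final CountDownLatch readStarted = new CountDownLatch(1);
  private final CountDownLatch preadDone = new CountDownLatch(1);
  volatile boolean preadFinishedWhileReadRunning = false;

  public synchronized void slowRead() throws InterruptedException {
    readStarted.countDown();
    // Hold the monitor until the lock-free pread has completed.
    preadDone.await();
  }

  public void lockFreePread() throws InterruptedException {
    readStarted.await();               // wait until read() owns the monitor
    preadFinishedWhileReadRunning = true;
    preadDone.countDown();             // let read() finish
  }
}
```

If `lockFreePread()` were declared `synchronized` instead, this program would deadlock, which is exactly the blocking the patch avoids.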
[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6735: Attachment: HDFS-6735-v2.txt A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream - Key: HDFS-6735 URL: https://issues.apache.org/jira/browse/HDFS-6735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6735-v2.txt, HDFS-6735.txt In current DFSInputStream impl, there're a couple of coarser-grained locks in read/pread path, and it has became a HBase read latency pain point so far. In HDFS-6698, i made a minor patch against the first encourtered lock, around getFileLength, in deed, after reading code and testing, it shows still other locks we could improve. In this jira, i'll make a patch against other locks, and a simple test case to show the issue and the improved result. This is important for HBase application, since in current HFile read path, we issue all read()/pread() requests in the same DFSInputStream for one HFile. (Multi streams solution is another story i had a plan to do, but probably will take more time than i expected) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6698) try to optimize DFSInputStream.getFileLength()
[ https://issues.apache.org/jira/browse/HDFS-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071472#comment-14071472 ] Hadoop QA commented on HDFS-6698: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657281/HDFS-6698.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7433//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7433//console This message is automatically generated. 
try to optimize DFSInputStream.getFileLength() -- Key: HDFS-6698 URL: https://issues.apache.org/jira/browse/HDFS-6698 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6698.txt, HDFS-6698.txt HBase prefers to invoke read() to serve scan requests, and pread() to serve get requests, because pread() holds almost no locks. Let's imagine there's a read() running; because the definition is:
{code}
public synchronized int read
{code}
no other read() request could run concurrently, this is known, but pread() also could not run... because:
{code}
public int read(long position, byte[] buffer, int offset, int length)
    throws IOException {
  // sanity checks
  dfsClient.checkOpen();
  if (closed) {
    throw new IOException("Stream closed");
  }
  failures = 0;
  long filelen = getFileLength();
{code}
getFileLength() also needs the lock, so we need to figure out a lock-free impl for getFileLength() before the HBase multi-stream feature is done. [~saint@gmail.com] -- This message was sent by Atlassian JIRA (v6.2#6252)
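One common way to get a lock-free getFileLength(), sketched below under stated assumptions, is to publish the length through a `volatile` reference to an immutable snapshot: writers swap in a new snapshot, readers see a consistent value without taking the stream's monitor. This is an illustration of the technique, not the actual HDFS-6698 patch; all names are invented.

```java
// Hypothetical sketch: an immutable state object published via a
// volatile field gives readers a lock-free, consistent view.
public class LocatedBlocksSnapshot {
  private static final class State {
    final long fileLength;
    State(long fileLength) { this.fileLength = fileLength; }
  }

  private volatile State state = new State(0);

  // Writers (e.g. a block-location refresh) swap in a new snapshot.
  public void updateLength(long newLength) {
    state = new State(newLength);
  }

  // Readers never block: a volatile read of an immutable object.
  public long getFileLength() {
    return state.fileLength;
  }
}
```

Because the snapshot is immutable, a reader can never observe a half-updated value, which is what the `synchronized` in the original code was protecting against.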
[jira] [Commented] (HDFS-6698) try to optimize DFSInputStream.getFileLength()
[ https://issues.apache.org/jira/browse/HDFS-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071475#comment-14071475 ] Liang Xie commented on HDFS-6698: - TestPipelinesFailover is probably due to the ulimit setting; I had seen several recent reports fail on it with "too many open files". try to optimize DFSInputStream.getFileLength() -- Key: HDFS-6698 URL: https://issues.apache.org/jira/browse/HDFS-6698 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6698.txt, HDFS-6698.txt -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5723) Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction
[ https://issues.apache.org/jira/browse/HDFS-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-5723: Attachment: HDFS-5723.patch Attached rebased patch. Hi [~umamaheswararao], can you please take a look at the patch? Thanks in advance. Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction Key: HDFS-5723 URL: https://issues.apache.org/jira/browse/HDFS-5723 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.2.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-5723.patch, HDFS-5723.patch, HDFS-5723.patch Scenario:
1. 3 node cluster with dfs.client.block.write.replace-datanode-on-failure.enable set to false.
2. One file is written with 3 replicas, blk_id_gs1.
3. One of the datanodes, DN1, is down.
4. The file is opened with append and some more data is added to the file and synced (to only the 2 live nodes, DN2 and DN3) -- blk_id_gs2.
5. Now DN1 is restarted.
6. In its block report, DN1 reports FINALIZED block blk_id_gs1; this should be marked corrupt, but since the NN has the appended block's state as UnderConstruction, it does not detect this block as corrupt at this time and adds it to the valid block locations.
As long as the namenode is alive, this datanode will also be considered a valid replica, and read/append will fail on that datanode. -- This message was sent by Atlassian JIRA (v6.2#6252)
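The core of the scenario above is a generation-stamp comparison: the restarted datanode reports the block with the old generation stamp (gs1) while the namenode's current stamp is gs2 after the append. A minimal, hypothetical sketch of the missing check (names invented, not NameNode code):

```java
// Hypothetical sketch of the HDFS-5723 check: a reported FINALIZED
// replica whose generation stamp is older than the block's current
// generation stamp should be treated as corrupt, not as valid.
public class ReplicaCheck {
  public static boolean isCorrupt(long reportedGenStamp, long currentGenStamp) {
    return reportedGenStamp < currentGenStamp;
  }
}
```

In the bug, this comparison is effectively skipped while the block is under construction, so the stale gs1 replica slips into the valid locations.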
[jira] [Commented] (HDFS-6703) NFS: Files can be deleted from a read-only mount
[ https://issues.apache.org/jira/browse/HDFS-6703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071501#comment-14071501 ] Srikanth Upputuri commented on HDFS-6703: - Thanks [~brandonli] and [~abutala] for your quick responses and support! NFS: Files can be deleted from a read-only mount Key: HDFS-6703 URL: https://issues.apache.org/jira/browse/HDFS-6703 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Abhiraj Butala Assignee: Srikanth Upputuri Fix For: 2.5.0 Attachments: HDFS-6703.patch As reported by bigdatagroup bigdatagr...@itecons.it on the hadoop-users mailing list: {code} We exported our distributed filesystem with the following configuration (Managed by Cloudera Manager over CDH 5.0.1):
<property>
  <name>dfs.nfs.exports.allowed.hosts</name>
  <value>192.168.0.153 ro</value>
</property>
As you can see, we expect the exported FS to be read-only, but in fact we are able to delete files and folders stored on it (where the user has the correct permissions), from the client machine that mounted the FS. Other writing operations are correctly blocked. Hadoop Version in use: 2.3.0+cdh5.0.1+567 {code} I was able to reproduce the issue on latest hadoop trunk.
Though I could only delete files; deleting directories was correctly blocked:
{code}
abutala@abutala-vBox:/mnt/hdfs$ mount | grep 127
127.0.1.1:/ on /mnt/hdfs type nfs (rw,vers=3,proto=tcp,nolock,addr=127.0.1.1)
abutala@abutala-vBox:/mnt/hdfs$ ls -lh
total 512
-rw-r--r-- 1 abutala supergroup 0 Jul 17 18:51 abc.txt
drwxr-xr-x 2 abutala supergroup 64 Jul 17 18:31 temp
abutala@abutala-vBox:/mnt/hdfs$ rm abc.txt
abutala@abutala-vBox:/mnt/hdfs$ ls
temp
abutala@abutala-vBox:/mnt/hdfs$ rm -r temp
rm: cannot remove `temp': Permission denied
abutala@abutala-vBox:/mnt/hdfs$ ls
temp
abutala@abutala-vBox:/mnt/hdfs$
{code}
Contents of hdfs-site.xml:
{code}
<configuration>
  <property>
    <name>dfs.nfs3.dump.dir</name>
    <value>/tmp/.hdfs-nfs3</value>
  </property>
  <property>
    <name>dfs.nfs.exports.allowed.hosts</name>
    <value>localhost ro</value>
  </property>
</configuration>
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6657) Remove link to 'Legacy UI' in trunk's Namenode UI
[ https://issues.apache.org/jira/browse/HDFS-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071500#comment-14071500 ] Hadoop QA commented on HDFS-6657: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657283/HDFS-6657.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7434//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7434//console This message is automatically generated. Remove link to 'Legacy UI' in trunk's Namenode UI - Key: HDFS-6657 URL: https://issues.apache.org/jira/browse/HDFS-6657 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Minor Attachments: HDFS-6657.patch Link to 'Legacy UI' provided on namenode's UI. 
Since all JSP pages are removed in trunk, these links will not work and can be removed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6686) Archival Storage: Use fallback storage types
[ https://issues.apache.org/jira/browse/HDFS-6686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071502#comment-14071502 ] Vinayakumar B commented on HDFS-6686: - +1 for the latest patch, LGTM. Thanks [~szetszwo]. Archival Storage: Use fallback storage types Key: HDFS-6686 URL: https://issues.apache.org/jira/browse/HDFS-6686 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h6686_20140721.patch, h6686_20140721c.patch, h6686_20140723.patch HDFS-6671 changes replication monitor to use block storage policy for replication. It should also use the fallback storage types when a particular type of storage is full. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6719) org.apache.hadoop.fs.TestDFVariations.testMount failed intermittently because specified path doesn't exist
[ https://issues.apache.org/jira/browse/HDFS-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071504#comment-14071504 ] Hadoop QA commented on HDFS-6719: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657287/HDFS-6719.005.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.ipc.TestIPC {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7437//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7437//console This message is automatically generated. 
org.apache.hadoop.fs.TestDFVariations.testMount failed intermittently because specified path doesn't exist -- Key: HDFS-6719 URL: https://issues.apache.org/jira/browse/HDFS-6719 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6719.001.dbg.patch, HDFS-6719.002.dbg.patch, HDFS-6719.003.dbg.patch, HDFS-6719.004.dbg.patch, HDFS-6719.005.patch Failure message: {code} Error Message Specified path /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/hadoop-common-project/hadoop-common/target/test/datadoes not exist Stacktrace java.io.FileNotFoundException: Specified path /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/hadoop-common-project/hadoop-common/target/test/datadoes not exist at org.apache.hadoop.fs.DF.getMount(DF.java:109) at org.apache.hadoop.fs.TestDFVariations.testMount(TestDFVariations.java:54) Standard Output java.io.IOException: Fewer lines of output than expected: Filesystem 1K-blocks Used Available Use% Mounted on java.io.IOException: Unexpected empty line java.io.IOException: Could not parse line:19222656 10597036 7649060 59% / {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
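The parse errors quoted above come from DF reading `df -k` output line by line: `df` prints a header line, may emit blank lines, and wraps long device names onto a continuation line. A minimal sketch of a more tolerant parse (a hypothetical helper, not the actual org.apache.hadoop.fs.DF code):

```java
// Illustrative sketch only: shows why naive line-by-line df parsing hits
// "Fewer lines of output than expected", "Unexpected empty line", and
// "Could not parse line", and how a tolerant parse can cope.
public class DfParse {
    // Returns the mount point (last whitespace-separated field) of the data
    // portion of `df -k` output: skips the header line, tolerates blank
    // lines, and joins a data line wrapped by a long device name.
    static String mountOf(String[] lines) {
        StringBuilder joined = new StringBuilder();
        for (int i = 1; i < lines.length; i++) { // index 0 is the header
            if (lines[i].isEmpty()) continue;    // tolerate blank lines
            joined.append(lines[i]).append(' '); // re-join wrapped lines
        }
        String[] fields = joined.toString().trim().split("\\s+");
        return fields[fields.length - 1];
    }
}
```

The second test case below mimics the wrapped-line layout that trips the strict parser.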
[jira] [Commented] (HDFS-6247) Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer
[ https://issues.apache.org/jira/browse/HDFS-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071510#comment-14071510 ] Hadoop QA commented on HDFS-6247: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657285/HDFS-6247.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7435//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7435//console This message is automatically generated. Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer Key: HDFS-6247 URL: https://issues.apache.org/jira/browse/HDFS-6247 Project: Hadoop HDFS Issue Type: Bug Components: balancer, datanode Affects Versions: 2.4.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch Currently there is no response sent from target Datanode to Balancer for the replaceBlock() calls. 
Since block movement for balancing is throttled, a complete block movement will take time, and this can result in a timeout at the Balancer, which will be waiting to read the status message. To avoid this, while the replaceBlock() call is in progress the Datanode can send IN_PROGRESS status messages to the Balancer; otherwise the Balancer times out and treats the block movement as failed. -- This message was sent by Atlassian JIRA (v6.2#6252)
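The keepalive idea can be sketched abstractly: as long as the sender emits a status marker more often than the receiver's read timeout, a throttled transfer of any length completes without the receiver giving up. The class below is an illustrative model only; the IN_PROGRESS name mirrors the proposal, but this is not the actual DataTransferProtocol code.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the HDFS-6247 proposal: during a slow, throttled block move,
// the target periodically responds IN_PROGRESS so the Balancer's socket read
// never waits longer than its timeout for the final status.
public class KeepaliveSketch {
    // Status markers the Balancer would observe for a move that takes
    // moveSeconds, with a keepalive sent every intervalSeconds.
    static List<String> responses(int moveSeconds, int intervalSeconds) {
        List<String> out = new ArrayList<>();
        for (int t = intervalSeconds; t < moveSeconds; t += intervalSeconds) {
            out.add("IN_PROGRESS");
        }
        out.add("SUCCESS"); // final status once the throttled copy completes
        return out;
    }

    // The reader times out only if the gap between consecutive responses
    // exceeds its read timeout.
    static boolean readerTimesOut(int intervalSeconds, int timeoutSeconds) {
        return intervalSeconds > timeoutSeconds;
    }
}
```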
[jira] [Commented] (HDFS-6247) Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer
[ https://issues.apache.org/jira/browse/HDFS-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071555#comment-14071555 ] Vinayakumar B commented on HDFS-6247: - The failure, even though related to balancing, is not caused by this patch. In fact, it failed because a block belonging to /system/balancer.id, which has the default replication (3), was selected for movement; after the move it is not detected as excess, while all other blocks in the test have replication 1. So the calculation in TestBalancer#waitForBalancer(..) is never satisfied and the test times out. I think this can be fixed in a separate jira if observed again. Anyway, triggering the QA again. Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer Key: HDFS-6247 URL: https://issues.apache.org/jira/browse/HDFS-6247 Project: Hadoop HDFS Issue Type: Bug Components: balancer, datanode Affects Versions: 2.4.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch Currently there is no response sent from target Datanode to Balancer for the replaceBlock() calls. Since block movement for balancing is throttled, a complete block movement will take time, and this can result in a timeout at the Balancer, which will be waiting to read the status message. To avoid this, while the replaceBlock() call is in progress the Datanode can send IN_PROGRESS status messages to the Balancer; otherwise the Balancer times out and treats the block movement as failed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6722) Display readable last contact time for dead nodes on NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071567#comment-14071567 ] Vinayakumar B commented on HDFS-6722: - +1, patch looks good to me. I have verified it on my machine. Display readable last contact time for dead nodes on NN webUI - Key: HDFS-6722 URL: https://issues.apache.org/jira/browse/HDFS-6722 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-6722.patch For dead node info on the NN webUI, admins want to know when the nodes became dead, to troubleshoot missing blocks, etc. Currently the webUI displays the last contact as the number of seconds since the last contact. It would be useful to display the info in date format. -- This message was sent by Atlassian JIRA (v6.2#6252)
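The conversion the patch calls for is straightforward: subtract the seconds-since-last-contact counter from the current time and format the result as a date. A sketch under assumed method names (the real NN webUI code may differ):

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Sketch of the HDFS-6722 improvement: render a dead node's "last contact"
// as the absolute time it was last heard from, instead of a raw counter of
// seconds. Method and format choices here are illustrative.
public class LastContactFormat {
    static String lastContactDate(long nowMillis, long secondsSinceContact) {
        Date lastHeard = new Date(nowMillis - secondsSinceContact * 1000L);
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(lastHeard);
    }
}
```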
[jira] [Commented] (HDFS-5919) FileJournalManager doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071577#comment-14071577 ] Hadoop QA commented on HDFS-5919: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12628688/HDFS-5919.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7436//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7436//console This message is automatically generated. FileJournalManager doesn't purge empty and corrupt inprogress edits files - Key: HDFS-5919 URL: https://issues.apache.org/jira/browse/HDFS-5919 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-5919.patch, HDFS-5919.patch FileJournalManager doesn't purge empty and corrupt inprogress edit files. These stale files will be accumulated over time. These should be cleared along with the purging of other edit logs -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6722) Display readable last contact time for dead nodes on NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071588#comment-14071588 ] Hadoop QA commented on HDFS-6722: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657286/HDFS-6722.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.server.balancer.TestBalancer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7438//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7438//console This message is automatically generated. Display readable last contact time for dead nodes on NN webUI - Key: HDFS-6722 URL: https://issues.apache.org/jira/browse/HDFS-6722 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-6722.patch For dead node info on NN webUI, admins want to know when the nodes became dead, to troubleshoot missing block, etc. Currently the webUI displays the last contact as the unit of seconds since the last contact. 
It will be useful to display the info in Date format. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6731) Run hdfs zkfc-formatZK on a server in a non-namenode will cause a null pointer exception.
[ https://issues.apache.org/jira/browse/HDFS-6731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071604#comment-14071604 ] Hudson commented on HDFS-6731: -- FAILURE: Integrated in Hadoop-Yarn-trunk #621 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/621/]) HDFS-6731. Run 'hdfs zkfc -formatZK' on a server in a non-namenode will cause a null pointer exception. Contributed by Masatake Iwasaki (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1612715) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSZKFailoverController.java Run hdfs zkfc-formatZK on a server in a non-namenode will cause a null pointer exception. Key: HDFS-6731 URL: https://issues.apache.org/jira/browse/HDFS-6731 Project: Hadoop HDFS Issue Type: Bug Components: auto-failover, ha Affects Versions: 2.0.4-alpha, 2.4.0 Reporter: WenJin Ma Assignee: Masatake Iwasaki Fix For: 2.6.0 Attachments: HADOOP-9603-0.patch Original Estimate: 168h Remaining Estimate: 168h Running 'hdfs zkfc -formatZK' on a non-namenode server causes a null pointer exception.
{code}
[hadoop@test bin]$ ./hdfs zkfc -formatZK
Exception in thread "main" java.lang.NullPointerException
	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:187)
	at org.apache.hadoop.hdfs.tools.NNHAServiceTarget.<init>(NNHAServiceTarget.java:57)
	at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.create(DFSZKFailoverController.java:128)
	at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:172)
{code}
Looking at the code, I found that the org.apache.hadoop.hdfs.DFSUtil.getSuffixIDs() method does not handle this case.
{code}
static String[] getSuffixIDs(final Configuration conf, final String addressKey,
    String knownNsId, String knownNNId, final AddressMatcher matcher) {
  String nameserviceId = null;
  String namenodeId = null;
  int found = 0;
  // ..do something
  if (found > 1) { // Only one address must match the local address
    String msg = "Configuration has multiple addresses that match "
        + "local node's address. Please configure the system with "
        + DFS_NAMESERVICE_ID + " and " + DFS_HA_NAMENODE_ID_KEY;
    throw new HadoopIllegalArgumentException(msg);
  }
  // If the IP is not a local address, found will be less than 1.
  // An exception with a clear message should be thrown here rather than
  // causing a null pointer exception later.
  return new String[] { nameserviceId, namenodeId };
}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
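The fix the reporter suggests, sketched with hypothetical stand-ins for the real DFSUtil types: validate found on both sides and fail fast with a descriptive message when no configured address matches the local node, instead of returning nulls that trip Preconditions.checkNotNull downstream.

```java
// Illustrative sketch of the suggested guard; the config key names in the
// messages and the exception type are simplified stand-ins for the real
// DFSConfigKeys constants and HadoopIllegalArgumentException.
public class SuffixIdGuard {
    static String[] resolve(String nameserviceId, String namenodeId, int found) {
        if (found > 1) {
            // Existing check: more than one configured address matched.
            throw new IllegalArgumentException(
                "Configuration has multiple addresses that match the local"
                + " node's address. Please configure the nameservice id and"
                + " namenode id explicitly.");
        }
        if (found < 1) {
            // The suggested addition: a clear failure instead of a later NPE.
            throw new IllegalArgumentException(
                "No configured namenode address matches the local address;"
                + " is this host actually a NameNode?");
        }
        return new String[] { nameserviceId, namenodeId };
    }
}
```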
[jira] [Commented] (HDFS-6703) NFS: Files can be deleted from a read-only mount
[ https://issues.apache.org/jira/browse/HDFS-6703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071598#comment-14071598 ] Hudson commented on HDFS-6703: -- FAILURE: Integrated in Hadoop-Yarn-trunk #621 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/621/]) HDFS-6703. NFS: Files can be deleted from a read-only mount. Contributed by Srikanth Upputuri (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1612702) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestClientAccessPrivilege.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt NFS: Files can be deleted from a read-only mount Key: HDFS-6703 URL: https://issues.apache.org/jira/browse/HDFS-6703 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Abhiraj Butala Assignee: Srikanth Upputuri Fix For: 2.5.0 Attachments: HDFS-6703.patch As reported by bigdatagroup bigdatagr...@itecons.it on the hadoop-users mailing list:
{code}
We exported our distributed filesystem with the following configuration
(managed by Cloudera Manager over CDH 5.0.1):

<property>
  <name>dfs.nfs.exports.allowed.hosts</name>
  <value>192.168.0.153 ro</value>
</property>

As you can see, we expect the exported FS to be read-only, but in fact we are
able to delete files and folders stored on it (where the user has the correct
permissions) from the client machine that mounted the FS. Other writing
operations are correctly blocked.

Hadoop version in use: 2.3.0+cdh5.0.1+567
{code}
I was able to reproduce the issue on the latest hadoop trunk.
Though I could only delete files; deleting directories was correctly blocked:
{code}
abutala@abutala-vBox:/mnt/hdfs$ mount | grep 127
127.0.1.1:/ on /mnt/hdfs type nfs (rw,vers=3,proto=tcp,nolock,addr=127.0.1.1)
abutala@abutala-vBox:/mnt/hdfs$ ls -lh
total 512
-rw-r--r-- 1 abutala supergroup  0 Jul 17 18:51 abc.txt
drwxr-xr-x 2 abutala supergroup 64 Jul 17 18:31 temp
abutala@abutala-vBox:/mnt/hdfs$ rm abc.txt
abutala@abutala-vBox:/mnt/hdfs$ ls
temp
abutala@abutala-vBox:/mnt/hdfs$ rm -r temp
rm: cannot remove `temp': Permission denied
abutala@abutala-vBox:/mnt/hdfs$ ls
temp
abutala@abutala-vBox:/mnt/hdfs$
{code}
Contents of hdfs-site.xml:
{code}
<configuration>
  <property>
    <name>dfs.nfs3.dump.dir</name>
    <value>/tmp/.hdfs-nfs3</value>
  </property>
  <property>
    <name>dfs.nfs.exports.allowed.hosts</name>
    <value>localhost ro</value>
  </property>
</configuration>
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
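The essence of the fix is that a mount exported with the "ro" flag must reject every mutating NFS3 operation, including REMOVE and RMDIR, not only writes. A toy illustration of that access rule (the real RpcProgramNfs3 check is more involved):

```java
// Illustrative model of the HDFS-6703 access rule: parse the export spec's
// trailing access flag and deny all mutating operations on "ro" exports.
public class ExportAccess {
    static boolean isAllowed(String exportSpec, String op) {
        boolean readOnly = exportSpec.trim().endsWith(" ro");
        boolean mutating = op.equals("WRITE") || op.equals("REMOVE")
            || op.equals("RMDIR") || op.equals("CREATE") || op.equals("RENAME");
        return !(readOnly && mutating);
    }
}
```

Before the patch, REMOVE effectively slipped through this kind of check while RMDIR did not, which matches the reporter's observation that files could be deleted but directories could not.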
[jira] [Commented] (HDFS-6701) Make seed optional in NetworkTopology#sortByDistance
[ https://issues.apache.org/jira/browse/HDFS-6701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071602#comment-14071602 ] Hudson commented on HDFS-6701: -- FAILURE: Integrated in Hadoop-Yarn-trunk #621 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/621/]) HDFS-6701. Make seed optional in NetworkTopology#sortByDistance. Contributed by Ashwin Shankar. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1612625) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopologyWithNodeGroup.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestNetworkTopologyWithNodeGroup.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/net/TestNetworkTopology.java Make seed optional in NetworkTopology#sortByDistance Key: HDFS-6701 URL: https://issues.apache.org/jira/browse/HDFS-6701 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.5.0 Reporter: Ashwin Shankar Assignee: Ashwin Shankar Fix For: 2.6.0 Attachments: HDFS-6701-v1.txt, HDFS-6701-v3-branch2.txt, HDFS-6701-v3.txt, HDFS-6701-v4-branch2.txt, HDFS-6701-v4.txt Currently seed in NetworkTopology#sortByDistance is set to the blockid which causes the RNG to 
generate the same pseudo-random order for each block. If no node-local block location is present, this causes the same rack-local replica to be hit for a particular block. It would be good to make the seed optional, so that one could turn it off if they want a block's locations to be randomized. -- This message was sent by Atlassian JIRA (v6.2#6252)
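The deterministic-order behavior is easy to demonstrate: shuffling with a Random seeded by a fixed value (such as the block id) always yields the same permutation, while passing no seed lets the order vary between calls. This is a simplified stand-in for NetworkTopology#sortByDistance, not the actual implementation:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Simplified model of the seeding behavior described in HDFS-6701: a fixed
// seed (e.g. the block id) makes the "random" replica order deterministic
// per block; a null seed randomizes it on every call.
public class SeededOrder {
    static List<String> order(List<String> replicas, Long seed) {
        List<String> copy = new ArrayList<>(replicas);
        Random r = (seed != null) ? new Random(seed) : new Random();
        Collections.shuffle(copy, r);
        return copy;
    }
}
```

With the same seed, two calls always agree, which is exactly why every client hammers the same rack-local replica for a given block.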
[jira] [Commented] (HDFS-6712) Document HDFS Multihoming Settings
[ https://issues.apache.org/jira/browse/HDFS-6712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071603#comment-14071603 ] Hudson commented on HDFS-6712: -- FAILURE: Integrated in Hadoop-Yarn-trunk #621 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/621/]) HDFS-6712. Document HDFS Multihoming Settings. (Contributed by Arpit Agarwal) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1612695) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsMultihoming.apt.vm * /hadoop/common/trunk/hadoop-project/src/site/site.xml Document HDFS Multihoming Settings -- Key: HDFS-6712 URL: https://issues.apache.org/jira/browse/HDFS-6712 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.4.1 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6712.02.patch A few HDFS settings can be changed to enable better support in multi-homed environments. This task is to write a short guide to these settings. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6704) Fix the command to launch JournalNode in HDFS-HA document
[ https://issues.apache.org/jira/browse/HDFS-6704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071601#comment-14071601 ] Hudson commented on HDFS-6704: -- FAILURE: Integrated in Hadoop-Yarn-trunk #621 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/621/]) HDFS-6704. Fix the command to launch JournalNode in HDFS-HA document. Contributed by Akira AJISAKA. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1612613) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithQJM.apt.vm Fix the command to launch JournalNode in HDFS-HA document - Key: HDFS-6704 URL: https://issues.apache.org/jira/browse/HDFS-6704 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.4.0 Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Labels: newbie Fix For: 2.6.0 Attachments: HDFS-6704.2.patch, HDFS-6704.patch In HDFSHighAvailabilityWithQJM.html, {code} After all of the necessary configuration options have been set, you must start the JournalNode daemons on the set of machines where they will run. This can be done by running the command hdfs-daemon.sh journalnode and waiting for the daemon to start on each of the relevant machines. {code} hdfs-daemon.sh should be hadoop-daemon.sh since hdfs-daemon.sh does not exist. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071628#comment-14071628 ] Hadoop QA commented on HDFS-6735: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657297/HDFS-6735-v2.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7439//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7439//console This message is automatically generated. 
A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream - Key: HDFS-6735 URL: https://issues.apache.org/jira/browse/HDFS-6735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6735-v2.txt, HDFS-6735.txt In the current DFSInputStream implementation there are a couple of coarse-grained locks in the read/pread path, and this has become an HBase read-latency pain point. In HDFS-6698 I made a minor patch against the first encountered lock, around getFileLength; after further reading of the code and testing, it turns out there are still other locks we could improve. In this jira I'll make a patch against those other locks, plus a simple test case to show the issue and the improved result. This is important for HBase, since in the current HFile read path all read()/pread() requests for one HFile go through the same DFSInputStream. (A multi-stream solution is another story I plan to do, but it will probably take more time than I expected.) -- This message was sent by Atlassian JIRA (v6.2#6252)
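The lock-splitting idea can be sketched as follows: only the stateful read() needs to serialize on the stream's shared position, while a positional pread() carries its own offset and can skip that lock entirely. This is an illustrative model, not the actual DFSInputStream change:

```java
import java.util.concurrent.locks.ReentrantLock;

// Toy model of the HDFS-6735 idea: read() mutates the shared position and
// must hold the stream lock, but pread(offset) touches no shared state, so
// a slow read() no longer blocks concurrent preads.
public class SplitLockStream {
    private final byte[] data;
    private final ReentrantLock positionLock = new ReentrantLock();
    private int pos = 0;

    SplitLockStream(byte[] data) { this.data = data; }

    // Stateful read: advances the shared position under the lock.
    int read() {
        positionLock.lock();
        try {
            return (pos < data.length) ? (data[pos++] & 0xff) : -1;
        } finally {
            positionLock.unlock();
        }
    }

    // Positional read: uses the caller's offset and never acquires
    // positionLock, so it cannot be blocked by an in-flight read().
    int pread(int offset) {
        return (offset >= 0 && offset < data.length) ? (data[offset] & 0xff) : -1;
    }
}
```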
[jira] [Commented] (HDFS-5723) Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction
[ https://issues.apache.org/jira/browse/HDFS-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071630#comment-14071630 ] Hadoop QA commented on HDFS-5723: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657310/HDFS-5723.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7440//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7440//console This message is automatically generated. Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction Key: HDFS-5723 URL: https://issues.apache.org/jira/browse/HDFS-5723 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.2.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-5723.patch, HDFS-5723.patch, HDFS-5723.patch Scenario: 1. 3 node cluster with dfs.client.block.write.replace-datanode-on-failure.enable set to false. 2. 
One file is written with 3 replicas, blk_id_gs1. 3. One of the datanodes, DN1, goes down. 4. The file is opened with append, and some more data is added and synced (to only the 2 live nodes, DN2 and DN3) -- blk_id_gs2. 5. Now DN1 is restarted. 6. In its block report, DN1 reports the FINALIZED block blk_id_gs1, which should be marked corrupt; but since the NN has the appended block's state as UnderConstruction, it does not detect this replica as corrupt at this time and adds it to the valid block locations. As long as the namenode stays alive, this datanode will be considered a valid replica, and read/append will fail on that datanode. -- This message was sent by Atlassian JIRA (v6.2#6252)
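The rule this jira asks the namenode to enforce reduces to a generation-stamp comparison: a FINALIZED replica reported with an older generation stamp than the block's current one is stale. A minimal sketch (the real BlockManager logic handles many more replica and block states):

```java
// Minimal model of the HDFS-5723 check: in the scenario above, DN1 reports
// blk_id with generation stamp gs1 while the NN's current stamp is gs2, so
// the FINALIZED replica must be treated as corrupt even though the block is
// still under construction.
public class ReplicaCheck {
    static boolean isCorrupt(String reportedState, long reportedGs, long currentGs) {
        return "FINALIZED".equals(reportedState) && reportedGs < currentGs;
    }
}
```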
[jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071655#comment-14071655 ] Liang Xie commented on HDFS-6735: - The failed TestPipelinesFailover and TestNamenodeCapacityReport are not related to the current patch (I saw them in other recent reports). A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream - Key: HDFS-6735 URL: https://issues.apache.org/jira/browse/HDFS-6735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6735-v2.txt, HDFS-6735.txt In the current DFSInputStream implementation there are a couple of coarse-grained locks in the read/pread path, and this has become an HBase read-latency pain point. In HDFS-6698 I made a minor patch against the first encountered lock, around getFileLength; after further reading of the code and testing, it turns out there are still other locks we could improve. In this jira I'll make a patch against those other locks, plus a simple test case to show the issue and the improved result. This is important for HBase, since in the current HFile read path all read()/pread() requests for one HFile go through the same DFSInputStream. (A multi-stream solution is another story I plan to do, but it will probably take more time than I expected.) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6247) Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer
[ https://issues.apache.org/jira/browse/HDFS-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071699#comment-14071699 ] Hadoop QA commented on HDFS-6247: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657285/HDFS-6247.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7441//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7441//console This message is automatically generated. Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer Key: HDFS-6247 URL: https://issues.apache.org/jira/browse/HDFS-6247 Project: Hadoop HDFS Issue Type: Bug Components: balancer, datanode Affects Versions: 2.4.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch Currently there is no response sent from target Datanode to Balancer for the replaceBlock() calls. 
Since block movement for balancing is throttled, a complete block movement will take time, and this can result in a timeout at the Balancer, which will be waiting to read the status message. To avoid this, while the replaceBlock() call is in progress the Datanode can send IN_PROGRESS status messages to the Balancer; otherwise the Balancer times out and treats the block movement as failed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6712) Document HDFS Multihoming Settings
[ https://issues.apache.org/jira/browse/HDFS-6712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071734#comment-14071734 ] Hudson commented on HDFS-6712: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1813 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1813/]) HDFS-6712. Document HDFS Multihoming Settings. (Contributed by Arpit Agarwal) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1612695) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsMultihoming.apt.vm * /hadoop/common/trunk/hadoop-project/src/site/site.xml Document HDFS Multihoming Settings -- Key: HDFS-6712 URL: https://issues.apache.org/jira/browse/HDFS-6712 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.4.1 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6712.02.patch A few HDFS settings can be changed to enable better support in multi-homed environments. This task is to write a short guide to these settings. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6704) Fix the command to launch JournalNode in HDFS-HA document
[ https://issues.apache.org/jira/browse/HDFS-6704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071732#comment-14071732 ] Hudson commented on HDFS-6704: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1813 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1813/]) HDFS-6704. Fix the command to launch JournalNode in HDFS-HA document. Contributed by Akira AJISAKA. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1612613) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithQJM.apt.vm Fix the command to launch JournalNode in HDFS-HA document - Key: HDFS-6704 URL: https://issues.apache.org/jira/browse/HDFS-6704 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.4.0 Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Labels: newbie Fix For: 2.6.0 Attachments: HDFS-6704.2.patch, HDFS-6704.patch In HDFSHighAvailabilityWithQJM.html, {code} After all of the necessary configuration options have been set, you must start the JournalNode daemons on the set of machines where they will run. This can be done by running the command hdfs-daemon.sh journalnode and waiting for the daemon to start on each of the relevant machines. {code} hdfs-daemon.sh should be hadoop-daemon.sh since hdfs-daemon.sh does not exist. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6703) NFS: Files can be deleted from a read-only mount
[ https://issues.apache.org/jira/browse/HDFS-6703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071729#comment-14071729 ] Hudson commented on HDFS-6703: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1813 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1813/]) HDFS-6703. NFS: Files can be deleted from a read-only mount. Contributed by Srikanth Upputuri (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1612702) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestClientAccessPrivilege.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt NFS: Files can be deleted from a read-only mount Key: HDFS-6703 URL: https://issues.apache.org/jira/browse/HDFS-6703 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Abhiraj Butala Assignee: Srikanth Upputuri Fix For: 2.5.0 Attachments: HDFS-6703.patch As reported by bigdatagroup bigdatagr...@itecons.it on the hadoop-users mailing list: {code} We exported our distributed filesystem with the following configuration (managed by Cloudera Manager over CDH 5.0.1): <property> <name>dfs.nfs.exports.allowed.hosts</name> <value>192.168.0.153 ro</value> </property> As you can see, we expect the exported FS to be read-only, but in fact we are able to delete files and folders stored on it (where the user has the correct permissions) from the client machine that mounted the FS. Other write operations are correctly blocked. Hadoop version in use: 2.3.0+cdh5.0.1+567 {code} I was able to reproduce the issue on the latest hadoop trunk.
Though I could only delete files; deleting directories was correctly blocked: {code} abutala@abutala-vBox:/mnt/hdfs$ mount | grep 127 127.0.1.1:/ on /mnt/hdfs type nfs (rw,vers=3,proto=tcp,nolock,addr=127.0.1.1) abutala@abutala-vBox:/mnt/hdfs$ ls -lh total 512 -rw-r--r-- 1 abutala supergroup 0 Jul 17 18:51 abc.txt drwxr-xr-x 2 abutala supergroup 64 Jul 17 18:31 temp abutala@abutala-vBox:/mnt/hdfs$ rm abc.txt abutala@abutala-vBox:/mnt/hdfs$ ls temp abutala@abutala-vBox:/mnt/hdfs$ rm -r temp rm: cannot remove `temp': Permission denied abutala@abutala-vBox:/mnt/hdfs$ ls temp abutala@abutala-vBox:/mnt/hdfs$ {code} Contents of hdfs-site.xml: {code} <configuration> <property> <name>dfs.nfs3.dump.dir</name> <value>/tmp/.hdfs-nfs3</value> </property> <property> <name>dfs.nfs.exports.allowed.hosts</name> <value>localhost ro</value> </property> </configuration> {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
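The bug pattern reported here is that the REMOVE handler skipped the access-privilege check that the other mutating NFS operations already performed. A minimal sketch of the invariant (names are hypothetical, not the actual RpcProgramNfs3 API):

```java
// Illustrative sketch: every mutating NFS operation, including REMOVE
// and RMDIR, must be rejected when the export grants only read access.
// The reported bug was that REMOVE bypassed this check.
public class ReadOnlyMountCheck {

    enum AccessPrivilege { READ_ONLY, READ_WRITE }

    // Returns true if the operation may proceed on this mount.
    static boolean permits(AccessPrivilege priv, boolean mutates) {
        return !mutates || priv == AccessPrivilege.READ_WRITE;
    }

    public static void main(String[] args) {
        // A "ro" export must block REMOVE just like WRITE and RMDIR.
        System.out.println(permits(AccessPrivilege.READ_ONLY, true));  // false
        System.out.println(permits(AccessPrivilege.READ_ONLY, false)); // true
    }
}
```

In the real fix the check lives at the top of each NFS3 request handler; TestClientAccessPrivilege in the patch exercises exactly this invariant.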
[jira] [Commented] (HDFS-6731) Running 'hdfs zkfc -formatZK' on a non-namenode server causes a null pointer exception.
[ https://issues.apache.org/jira/browse/HDFS-6731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071735#comment-14071735 ] Hudson commented on HDFS-6731: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1813 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1813/]) HDFS-6731. Run 'hdfs zkfc -formatZK' on a server in a non-namenode will cause a null pointer exception. Contributed by Masatake Iwasaki (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1612715) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSZKFailoverController.java Running 'hdfs zkfc -formatZK' on a non-namenode server causes a null pointer exception. Key: HDFS-6731 URL: https://issues.apache.org/jira/browse/HDFS-6731 Project: Hadoop HDFS Issue Type: Bug Components: auto-failover, ha Affects Versions: 2.0.4-alpha, 2.4.0 Reporter: WenJin Ma Assignee: Masatake Iwasaki Fix For: 2.6.0 Attachments: HADOOP-9603-0.patch Original Estimate: 168h Remaining Estimate: 168h Running 'hdfs zkfc -formatZK' on a non-namenode server causes a null pointer exception. {code} [hadoop@test bin]$ ./hdfs zkfc -formatZK Exception in thread "main" java.lang.NullPointerException at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:187) at org.apache.hadoop.hdfs.tools.NNHAServiceTarget.<init>(NNHAServiceTarget.java:57) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.create(DFSZKFailoverController.java:128) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:172) {code} I looked at the code and found that the org.apache.hadoop.hdfs.DFSUtil.getSuffixIDs() method does not check for this case.
{code}
static String[] getSuffixIDs(final Configuration conf, final String addressKey,
    String knownNsId, String knownNNId, final AddressMatcher matcher) {
  String nameserviceId = null;
  String namenodeId = null;
  int found = 0;
  // ..do something
  if (found > 1) { // Only one address must match the local address
    String msg = "Configuration has multiple addresses that match "
        + "local node's address. Please configure the system with "
        + DFS_NAMESERVICE_ID + " and " + DFS_HA_NAMENODE_ID_KEY;
    throw new HadoopIllegalArgumentException(msg);
  }
  // If the IP is not a local address, found will be less than 1.
  // We should throw an exception with a clear message here rather than
  // cause a null pointer exception downstream.
  return new String[] { nameserviceId, namenodeId };
{code} -- This message was sent by Atlassian JIRA (v6.2#6252)
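The comment's suggestion can be sketched as a fail-fast check on `found`: reject both the "multiple matches" and the "no match" cases with descriptive messages instead of returning nulls that later surface as an NPE. The class and method names below are illustrative, not the actual DFSUtil code, and IllegalArgumentException stands in for Hadoop's HadoopIllegalArgumentException:

```java
import java.util.Arrays;

// Sketch of the clearer failure mode: validate the match count before
// returning, so a non-namenode host gets a readable error.
public class SuffixIdsCheck {

    static String[] checkFound(int found, String nameserviceId, String namenodeId) {
        if (found > 1) {
            throw new IllegalArgumentException(
                "Configuration has multiple addresses that match the local node's address.");
        }
        if (found < 1) {
            // Previously this fell through and returned {null, null}.
            throw new IllegalArgumentException(
                "No configured namenode address matches a local address. "
                + "Is this host really a NameNode?");
        }
        return new String[] { nameserviceId, namenodeId };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(checkFound(1, "ns1", "nn1")));
    }
}
```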
[jira] [Commented] (HDFS-6701) Make seed optional in NetworkTopology#sortByDistance
[ https://issues.apache.org/jira/browse/HDFS-6701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071733#comment-14071733 ] Hudson commented on HDFS-6701: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1813 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1813/]) HDFS-6701. Make seed optional in NetworkTopology#sortByDistance. Contributed by Ashwin Shankar. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1612625) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopologyWithNodeGroup.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestNetworkTopologyWithNodeGroup.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/net/TestNetworkTopology.java Make seed optional in NetworkTopology#sortByDistance Key: HDFS-6701 URL: https://issues.apache.org/jira/browse/HDFS-6701 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.5.0 Reporter: Ashwin Shankar Assignee: Ashwin Shankar Fix For: 2.6.0 Attachments: HDFS-6701-v1.txt, HDFS-6701-v3-branch2.txt, HDFS-6701-v3.txt, HDFS-6701-v4-branch2.txt, HDFS-6701-v4.txt Currently seed in NetworkTopology#sortByDistance is set to the blockid which causes the RNG to 
generate the same pseudo-random order for each block. If no node-local block location is present, this causes the same rack-local replica to be hit for a particular block. It would be good to make the seed optional, so that one could turn it off if they want the block locations of a block to be randomized. -- This message was sent by Atlassian JIRA (v6.2#6252)
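The seeding behavior described above can be demonstrated in miniature. This sketch is not the NetworkTopology code; it only shows why a fixed per-block seed pins the shuffle order while an unseeded RNG varies it (names are illustrative):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Sketch: shuffle replica locations with an optional seed. With a fixed
// seed (e.g. the block id) every call yields the same order, so the same
// rack-local replica is always hit; with no seed the order varies.
public class OptionalSeedShuffle {

    static List<String> order(List<String> nodes, Long seed) {
        List<String> copy = new ArrayList<>(nodes);
        Random r = (seed != null) ? new Random(seed) : new Random();
        Collections.shuffle(copy, r);
        return copy;
    }

    public static void main(String[] args) {
        List<String> replicas = List.of("rackA/dn1", "rackB/dn2", "rackC/dn3");
        // Fixed seed (the block id): deterministic, identical on every call.
        System.out.println(order(replicas, 42L).equals(order(replicas, 42L))); // true
    }
}
```

The actual change gates the seeded path behind a new key in hdfs-default.xml, as the commit's file list shows.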
[jira] [Commented] (HDFS-6737) DFSClient should use an IV generated based on the configured CipherSuite with the codecs used
[ https://issues.apache.org/jira/browse/HDFS-6737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071751#comment-14071751 ] Uma Maheswara Rao G commented on HDFS-6737: --- While changing the code I realized that we create the instance with feInfo.getCipherSuite() only, so the cipher suite we use expects the same IV length as feInfo.getIV(). But FSNamesystem#startFileInternal generates the IV along with the EDEK; do we need to generate the IV with the length of the passed suite's algorithm block size? If so, I will update the JIRA description and post a patch for it. Please correct me if I did not follow. DFSClient should use an IV generated based on the configured CipherSuite with the codecs used --- Key: HDFS-6737 URL: https://issues.apache.org/jira/browse/HDFS-6737 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G It seems we are using the encrypted data encryption key's IV directly, but the underlying codec's cipher suite may expect a different IV length. So, we should generate the IV from the codec's configured cipher suite. {code} final CryptoInputStream cryptoIn = new CryptoInputStream(dfsis, CryptoCodec.getInstance(conf, feInfo.getCipherSuite()), feInfo.getEncryptedDataEncryptionKey(), feInfo.getIV()); {code} So, instead of using feInfo.getIV(), we should generate it like: {code} byte[] iv = new byte[codec.getCipherSuite().getAlgorithmBlockSize()]; codec.generateSecureRandom(iv); {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
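The core of the proposed fix is sizing the IV to the suite's algorithm block size and filling it from a secure RNG. A self-contained sketch of that idea, using plain java.security.SecureRandom in place of Hadoop's CryptoCodec (the method name here is illustrative):

```java
import java.security.SecureRandom;

// Sketch: generate an IV whose length matches the codec's algorithm
// block size, instead of reusing the EDEK's IV, whose length may differ.
public class IvForSuite {

    // AES/CTR has a 16-byte block size; other suites may differ, which is
    // exactly why the IV must be sized per suite rather than copied.
    static byte[] generateIv(int algorithmBlockSize) {
        byte[] iv = new byte[algorithmBlockSize];
        new SecureRandom().nextBytes(iv);
        return iv;
    }

    public static void main(String[] args) {
        System.out.println(generateIv(16).length); // 16
    }
}
```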
[jira] [Commented] (HDFS-6422) getfattr in CLI doesn't throw exception or return non-0 return code when xattr doesn't exist
[ https://issues.apache.org/jira/browse/HDFS-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071763#comment-14071763 ] Charles Lamb commented on HDFS-6422: The TestOfflineEditsViewer test will continue to fail until we checkin editsStored (the test passes for me on my local machine). The other tests appear to be unrelated. getfattr in CLI doesn't throw exception or return non-0 return code when xattr doesn't exist Key: HDFS-6422 URL: https://issues.apache.org/jira/browse/HDFS-6422 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.5.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Blocker Attachments: HDFS-6422.005.patch, HDFS-6422.006.patch, HDFS-6422.007.patch, HDFS-6422.008.patch, HDFS-6422.009.patch, HDFS-6422.010.patch, HDFS-6422.1.patch, HDFS-6422.2.patch, HDFS-6422.3.patch, HDFS-6474.4.patch, editsStored If you do hdfs dfs -getfattr -n user.blah /foo and user.blah doesn't exist, the command prints # file: /foo and a 0 return code. It should print an exception and return a non-0 return code instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6422) getfattr in CLI doesn't throw exception or return non-0 return code when xattr doesn't exist
[ https://issues.apache.org/jira/browse/HDFS-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071774#comment-14071774 ] Uma Maheswara Rao G commented on HDFS-6422: --- Thanks a lot, Charles for addressing all the feedback. +1, latest patch looks good to me. I will commit the patch shortly. getfattr in CLI doesn't throw exception or return non-0 return code when xattr doesn't exist Key: HDFS-6422 URL: https://issues.apache.org/jira/browse/HDFS-6422 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.5.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Blocker Attachments: HDFS-6422.005.patch, HDFS-6422.006.patch, HDFS-6422.007.patch, HDFS-6422.008.patch, HDFS-6422.009.patch, HDFS-6422.010.patch, HDFS-6422.1.patch, HDFS-6422.2.patch, HDFS-6422.3.patch, HDFS-6474.4.patch, editsStored If you do hdfs dfs -getfattr -n user.blah /foo and user.blah doesn't exist, the command prints # file: /foo and a 0 return code. It should print an exception and return a non-0 return code instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6733) Creating encryption zone results in NPE when KeyProvider is null
[ https://issues.apache.org/jira/browse/HDFS-6733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6733: --- Attachment: HDFS-6733.002.patch Creating encryption zone results in NPE when KeyProvider is null Key: HDFS-6733 URL: https://issues.apache.org/jira/browse/HDFS-6733 Project: Hadoop HDFS Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Stephen Chu Assignee: Charles Lamb Attachments: HDFS-6733.001.patch, HDFS-6733.002.patch When users try to create an encryption zone on a system that is not configured with a KeyProvider, they will run into a NullPointerException. For example: [hdfs@schu-enc2 ~]$ hdfs crypto -createZone -keyName abc123 -path /user/hdfs 2014-07-22 23:18:23,273 WARN [main] crypto.CryptoCodec (CryptoCodec.java:getInstance(70)) - Crypto codec org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec is not available. RemoteException: java.lang.NullPointerException This error happens in FSNamesystem.createEncryptionZone(FSNamesystem.java:8456): {code} try { if (keyName == null || keyName.isEmpty()) { keyName = UUID.randomUUID().toString(); createNewKey(keyName, src); createdKey = true; } else { KeyVersion keyVersion = provider.getCurrentKey(keyName); if (keyVersion == null) { {code} provider can be null. An improvement would be to make the error message more specific/say that KeyProvider was not found. -- This message was sent by Atlassian JIRA (v6.2#6252)
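The shape of the guard this patch adds can be sketched as follows. This is a hypothetical simplification — the real check sits in FSNamesystem.createEncryptionZone and throws an IOException — but it shows the fail-fast idea: detect the missing KeyProvider before any provider call can NPE.

```java
// Sketch: check for a missing KeyProvider up front and produce a
// descriptive error, instead of letting provider.getCurrentKey(keyName)
// throw a bare NullPointerException back to the client.
public class CreateZoneGuard {

    static String checkProvider(Object provider) {
        if (provider == null) {
            return "Can't create an encryption zone: no KeyProvider is configured.";
        }
        return "ok";
    }

    public static void main(String[] args) {
        System.out.println(checkProvider(null));
        System.out.println(checkProvider(new Object())); // ok
    }
}
```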
[jira] [Commented] (HDFS-6733) Creating encryption zone results in NPE when KeyProvider is null
[ https://issues.apache.org/jira/browse/HDFS-6733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071799#comment-14071799 ] Charles Lamb commented on HDFS-6733: Thanks [~andrew.wang] for the review. I moved testCreateEZWithNoProvider into TestEncryptionZones and as you suggested, grabbed the conf from the minicluster, diddled it, restarted the NN, ... I'll commit this to fs-encryption shortly. Creating encryption zone results in NPE when KeyProvider is null Key: HDFS-6733 URL: https://issues.apache.org/jira/browse/HDFS-6733 Project: Hadoop HDFS Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Stephen Chu Assignee: Charles Lamb Attachments: HDFS-6733.001.patch, HDFS-6733.002.patch When users try to create an encryption zone on a system that is not configured with a KeyProvider, they will run into a NullPointerException. For example: [hdfs@schu-enc2 ~]$ hdfs crypto -createZone -keyName abc123 -path /user/hdfs 2014-07-22 23:18:23,273 WARN [main] crypto.CryptoCodec (CryptoCodec.java:getInstance(70)) - Crypto codec org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec is not available. RemoteException: java.lang.NullPointerException This error happens in FSNamesystem.createEncryptionZone(FSNamesystem.java:8456): {code} try { if (keyName == null || keyName.isEmpty()) { keyName = UUID.randomUUID().toString(); createNewKey(keyName, src); createdKey = true; } else { KeyVersion keyVersion = provider.getCurrentKey(keyName); if (keyVersion == null) { {code} provider can be null. An improvement would be to make the error message more specific/say that KeyProvider was not found. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6733) Creating encryption zone results in NPE when KeyProvider is null
[ https://issues.apache.org/jira/browse/HDFS-6733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb resolved HDFS-6733. Resolution: Fixed Fix Version/s: fs-encryption (HADOOP-10150 and HDFS-6134) Target Version/s: fs-encryption (HADOOP-10150 and HDFS-6134) Committed to fs-encryption. Creating encryption zone results in NPE when KeyProvider is null Key: HDFS-6733 URL: https://issues.apache.org/jira/browse/HDFS-6733 Project: Hadoop HDFS Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Stephen Chu Assignee: Charles Lamb Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) Attachments: HDFS-6733.001.patch, HDFS-6733.002.patch When users try to create an encryption zone on a system that is not configured with a KeyProvider, they will run into a NullPointerException. For example: [hdfs@schu-enc2 ~]$ hdfs crypto -createZone -keyName abc123 -path /user/hdfs 2014-07-22 23:18:23,273 WARN [main] crypto.CryptoCodec (CryptoCodec.java:getInstance(70)) - Crypto codec org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec is not available. RemoteException: java.lang.NullPointerException This error happens in FSNamesystem.createEncryptionZone(FSNamesystem.java:8456): {code} try { if (keyName == null || keyName.isEmpty()) { keyName = UUID.randomUUID().toString(); createNewKey(keyName, src); createdKey = true; } else { KeyVersion keyVersion = provider.getCurrentKey(keyName); if (keyVersion == null) { {code} provider can be null. An improvement would be to make the error message more specific/say that KeyProvider was not found. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6703) NFS: Files can be deleted from a read-only mount
[ https://issues.apache.org/jira/browse/HDFS-6703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071817#comment-14071817 ] Hudson commented on HDFS-6703: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1840 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1840/]) HDFS-6703. NFS: Files can be deleted from a read-only mount. Contributed by Srikanth Upputuri (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1612702) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestClientAccessPrivilege.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt NFS: Files can be deleted from a read-only mount Key: HDFS-6703 URL: https://issues.apache.org/jira/browse/HDFS-6703 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Abhiraj Butala Assignee: Srikanth Upputuri Fix For: 2.5.0 Attachments: HDFS-6703.patch As reported by bigdatagroup bigdatagr...@itecons.it on the hadoop-users mailing list: {code} We exported our distributed filesystem with the following configuration (managed by Cloudera Manager over CDH 5.0.1): <property> <name>dfs.nfs.exports.allowed.hosts</name> <value>192.168.0.153 ro</value> </property> As you can see, we expect the exported FS to be read-only, but in fact we are able to delete files and folders stored on it (where the user has the correct permissions) from the client machine that mounted the FS. Other write operations are correctly blocked. Hadoop version in use: 2.3.0+cdh5.0.1+567 {code} I was able to reproduce the issue on the latest hadoop trunk.
Though I could only delete files; deleting directories was correctly blocked: {code} abutala@abutala-vBox:/mnt/hdfs$ mount | grep 127 127.0.1.1:/ on /mnt/hdfs type nfs (rw,vers=3,proto=tcp,nolock,addr=127.0.1.1) abutala@abutala-vBox:/mnt/hdfs$ ls -lh total 512 -rw-r--r-- 1 abutala supergroup 0 Jul 17 18:51 abc.txt drwxr-xr-x 2 abutala supergroup 64 Jul 17 18:31 temp abutala@abutala-vBox:/mnt/hdfs$ rm abc.txt abutala@abutala-vBox:/mnt/hdfs$ ls temp abutala@abutala-vBox:/mnt/hdfs$ rm -r temp rm: cannot remove `temp': Permission denied abutala@abutala-vBox:/mnt/hdfs$ ls temp abutala@abutala-vBox:/mnt/hdfs$ {code} Contents of hdfs-site.xml: {code} <configuration> <property> <name>dfs.nfs3.dump.dir</name> <value>/tmp/.hdfs-nfs3</value> </property> <property> <name>dfs.nfs.exports.allowed.hosts</name> <value>localhost ro</value> </property> </configuration> {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6701) Make seed optional in NetworkTopology#sortByDistance
[ https://issues.apache.org/jira/browse/HDFS-6701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071821#comment-14071821 ] Hudson commented on HDFS-6701: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1840 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1840/]) HDFS-6701. Make seed optional in NetworkTopology#sortByDistance. Contributed by Ashwin Shankar. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1612625) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopologyWithNodeGroup.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestNetworkTopologyWithNodeGroup.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/net/TestNetworkTopology.java Make seed optional in NetworkTopology#sortByDistance Key: HDFS-6701 URL: https://issues.apache.org/jira/browse/HDFS-6701 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.5.0 Reporter: Ashwin Shankar Assignee: Ashwin Shankar Fix For: 2.6.0 Attachments: HDFS-6701-v1.txt, HDFS-6701-v3-branch2.txt, HDFS-6701-v3.txt, HDFS-6701-v4-branch2.txt, HDFS-6701-v4.txt Currently seed in NetworkTopology#sortByDistance is set to the blockid which causes the 
RNG to generate the same pseudo-random order for each block. If no node-local block location is present, this causes the same rack-local replica to be hit for a particular block. It would be good to make the seed optional, so that one could turn it off if they want the block locations of a block to be randomized. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6731) Running 'hdfs zkfc -formatZK' on a non-namenode server causes a null pointer exception.
[ https://issues.apache.org/jira/browse/HDFS-6731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071823#comment-14071823 ] Hudson commented on HDFS-6731: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1840 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1840/]) HDFS-6731. Run 'hdfs zkfc -formatZK' on a server in a non-namenode will cause a null pointer exception. Contributed by Masatake Iwasaki (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1612715) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSZKFailoverController.java Running 'hdfs zkfc -formatZK' on a non-namenode server causes a null pointer exception. Key: HDFS-6731 URL: https://issues.apache.org/jira/browse/HDFS-6731 Project: Hadoop HDFS Issue Type: Bug Components: auto-failover, ha Affects Versions: 2.0.4-alpha, 2.4.0 Reporter: WenJin Ma Assignee: Masatake Iwasaki Fix For: 2.6.0 Attachments: HADOOP-9603-0.patch Original Estimate: 168h Remaining Estimate: 168h Running 'hdfs zkfc -formatZK' on a non-namenode server causes a null pointer exception. {code} [hadoop@test bin]$ ./hdfs zkfc -formatZK Exception in thread "main" java.lang.NullPointerException at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:187) at org.apache.hadoop.hdfs.tools.NNHAServiceTarget.<init>(NNHAServiceTarget.java:57) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.create(DFSZKFailoverController.java:128) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:172) {code} I looked at the code and found that the org.apache.hadoop.hdfs.DFSUtil.getSuffixIDs() method does not check for this case.
{code}
static String[] getSuffixIDs(final Configuration conf, final String addressKey,
    String knownNsId, String knownNNId, final AddressMatcher matcher) {
  String nameserviceId = null;
  String namenodeId = null;
  int found = 0;
  // ..do something
  if (found > 1) { // Only one address must match the local address
    String msg = "Configuration has multiple addresses that match "
        + "local node's address. Please configure the system with "
        + DFS_NAMESERVICE_ID + " and " + DFS_HA_NAMENODE_ID_KEY;
    throw new HadoopIllegalArgumentException(msg);
  }
  // If the IP is not a local address, found will be less than 1.
  // We should throw an exception with a clear message here rather than
  // cause a null pointer exception downstream.
  return new String[] { nameserviceId, namenodeId };
{code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6704) Fix the command to launch JournalNode in HDFS-HA document
[ https://issues.apache.org/jira/browse/HDFS-6704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071820#comment-14071820 ] Hudson commented on HDFS-6704: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1840 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1840/]) HDFS-6704. Fix the command to launch JournalNode in HDFS-HA document. Contributed by Akira AJISAKA. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1612613) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithQJM.apt.vm Fix the command to launch JournalNode in HDFS-HA document - Key: HDFS-6704 URL: https://issues.apache.org/jira/browse/HDFS-6704 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.4.0 Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Labels: newbie Fix For: 2.6.0 Attachments: HDFS-6704.2.patch, HDFS-6704.patch In HDFSHighAvailabilityWithQJM.html, {code} After all of the necessary configuration options have been set, you must start the JournalNode daemons on the set of machines where they will run. This can be done by running the command hdfs-daemon.sh journalnode and waiting for the daemon to start on each of the relevant machines. {code} hdfs-daemon.sh should be hadoop-daemon.sh since hdfs-daemon.sh does not exist. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6712) Document HDFS Multihoming Settings
[ https://issues.apache.org/jira/browse/HDFS-6712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071822#comment-14071822 ] Hudson commented on HDFS-6712: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1840 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1840/]) HDFS-6712. Document HDFS Multihoming Settings. (Contributed by Arpit Agarwal) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1612695) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsMultihoming.apt.vm * /hadoop/common/trunk/hadoop-project/src/site/site.xml Document HDFS Multihoming Settings -- Key: HDFS-6712 URL: https://issues.apache.org/jira/browse/HDFS-6712 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.4.1 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6712.02.patch A few HDFS settings can be changed to enable better support in multi-homed environments. This task is to write a short guide to these settings. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6738) Remove unnecessary getEncryptionZoneForPath call in EZManager#createEncryptionZone
[ https://issues.apache.org/jira/browse/HDFS-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6738: --- Attachment: HDFS-6738.001.patch Thanks [~umamaheswararao]. The attached trivial patch addresses the issue. Remove unnecessary getEncryptionZoneForPath call in EZManager#createEncryptionZone -- Key: HDFS-6738 URL: https://issues.apache.org/jira/browse/HDFS-6738 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Priority: Minor Attachments: HDFS-6738.001.patch We can remove the following getEncryptionZoneForPath call below. {code} // done this way to handle edit log loading dir.unprotectedSetXAttrs(src, xattrs, EnumSet.of(XAttrSetFlag.CREATE)); ezi = getEncryptionZoneForPath(srcIIP); return ezXAttr; {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6738) Remove unnecessary getEncryptionZoneForPath call in EZManager#createEncryptionZone
[ https://issues.apache.org/jira/browse/HDFS-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6738: --- Target Version/s: fs-encryption (HADOOP-10150 and HDFS-6134) Status: Patch Available (was: Open) Compilation issue only so no tests included in this patch. Remove unnecessary getEncryptionZoneForPath call in EZManager#createEncryptionZone -- Key: HDFS-6738 URL: https://issues.apache.org/jira/browse/HDFS-6738 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Priority: Minor Attachments: HDFS-6738.001.patch We can remove the following getEncryptionZoneForPath call below. {code} // done this way to handle edit log loading dir.unprotectedSetXAttrs(src, xattrs, EnumSet.of(XAttrSetFlag.CREATE)); ezi = getEncryptionZoneForPath(srcIIP); return ezXAttr; {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5723) Append failed FINALIZED replica should not be accepted as valid when that block is under construction
[ https://issues.apache.org/jira/browse/HDFS-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071844#comment-14071844 ] Uma Maheswara Rao G commented on HDFS-5723: --- The question Jing asked above should work fine, as Vinay answered. When a DN reports with an older genstamp, the NN adds the replica to the corruptReplicas map, and it asks the DN to remove the block with the older genstamp only when it processes corrupt replicas. The DN cannot remove the block itself, since it has already bumped the genstamp. On the next report from the DN with the newer genstamp, the NN removes the entry from the corrupt map if it exists.
{code}
} else {
  // if the same block is added again and the replica was corrupt
  // previously because of a wrong gen stamp, remove it from the
  // corrupt block list.
  corruptReplicas.removeFromCorruptReplicasMap(block, node,
      Reason.GENSTAMP_MISMATCH);
  curReplicaDelta = 0;
  blockLog.warn("BLOCK* addStoredBlock: "
      + "Redundant addStoredBlock request received for " + storedBlock
      + " on " + node + " size " + storedBlock.getNumBytes());
}
{code}
Append failed FINALIZED replica should not be accepted as valid when that block is under construction Key: HDFS-5723 URL: https://issues.apache.org/jira/browse/HDFS-5723 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.2.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-5723.patch, HDFS-5723.patch, HDFS-5723.patch Scenario: 1. 3-node cluster with dfs.client.block.write.replace-datanode-on-failure.enable set to false. 2. One file is written with 3 replicas: blk_id_gs1. 3. One of the datanodes, DN1, goes down. 4. The file is opened with append, and some more data is added to the file and synced (to only the 2 live nodes, DN2 and DN3): blk_id_gs2. 5. Now DN1 is restarted. 6. In its block report, DN1 reports the FINALIZED block blk_id_gs1. This should be marked corrupt, but since the NN has the appended block's state as UnderConstruction, it does not detect this block as corrupt at that time and adds it to the valid block locations.
As long as the namenode is alive, this datanode will also be considered a valid replica, and read/append will fail on that datanode. -- This message was sent by Atlassian JIRA (v6.2#6252)
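The report-handling sequence described in the comment above can be reduced to a small sketch (hedged: the names here are illustrative and simplified; the real logic lives in the NameNode's block manager and its corrupt-replicas map): a report carrying a stale genstamp records the node as a corrupt replica, and a later report carrying the current genstamp clears that entry.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified sketch of the genstamp-mismatch handling described above.
// All names are hypothetical; this is not the actual NameNode code.
public class GenstampSketch {
    static final long CURRENT_GS = 2L;               // genstamp after the append
    static final Set<String> corruptReplicas = new HashSet<>();

    static void processReport(String node, long reportedGs) {
        if (reportedGs < CURRENT_GS) {
            // stale genstamp: track the node as corrupt (GENSTAMP_MISMATCH)
            corruptReplicas.add(node);
        } else {
            // redundant report with the current genstamp:
            // remove any earlier corrupt entry for this node
            corruptReplicas.remove(node);
        }
    }

    public static void main(String[] args) {
        processReport("DN1", 1L);   // DN1 restarts, reports blk_id_gs1
        System.out.println(corruptReplicas.contains("DN1")); // true
        processReport("DN1", 2L);   // later report with the bumped genstamp
        System.out.println(corruptReplicas.contains("DN1")); // false
    }
}
```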
[jira] [Commented] (HDFS-6738) Remove unnecessary getEncryptionZoneForPath call in EZManager#createEncryptionZone
[ https://issues.apache.org/jira/browse/HDFS-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071843#comment-14071843 ] Hadoop QA commented on HDFS-6738: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657363/HDFS-6738.001.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7442//console This message is automatically generated. Remove unnecessary getEncryptionZoneForPath call in EZManager#createEncryptionZone -- Key: HDFS-6738 URL: https://issues.apache.org/jira/browse/HDFS-6738 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Priority: Minor Attachments: HDFS-6738.001.patch We can remove the following getEncryptionZoneForPath call below. {code} // done this way to handle edit log loading dir.unprotectedSetXAttrs(src, xattrs, EnumSet.of(XAttrSetFlag.CREATE)); ezi = getEncryptionZoneForPath(srcIIP); return ezXAttr; {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5723) Append failed FINALIZED replica should not be accepted as valid when that block is under construction
[ https://issues.apache.org/jira/browse/HDFS-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071846#comment-14071846 ] Uma Maheswara Rao G commented on HDFS-5723: --- +1, the latest patch looks good to me. Could you please comment on the failure above? Append failed FINALIZED replica should not be accepted as valid when that block is under construction Key: HDFS-5723 URL: https://issues.apache.org/jira/browse/HDFS-5723 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.2.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-5723.patch, HDFS-5723.patch, HDFS-5723.patch Scenario: 1. 3-node cluster with dfs.client.block.write.replace-datanode-on-failure.enable set to false. 2. One file is written with 3 replicas: blk_id_gs1. 3. One of the datanodes, DN1, goes down. 4. The file is opened with append, and some more data is added to the file and synced (to only the 2 live nodes, DN2 and DN3): blk_id_gs2. 5. Now DN1 is restarted. 6. In its block report, DN1 reports the FINALIZED block blk_id_gs1. This should be marked corrupt, but since the NN has the appended block's state as UnderConstruction, it does not detect this block as corrupt at that time and adds it to the valid block locations. As long as the namenode is alive, this datanode will also be considered a valid replica, and read/append will fail on that datanode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6738) Remove unnecessary getEncryptionZoneForPath call in EZManager#createEncryptionZone
[ https://issues.apache.org/jira/browse/HDFS-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071848#comment-14071848 ] Uma Maheswara Rao G commented on HDFS-6738: --- +1 Remove unnecessary getEncryptionZoneForPath call in EZManager#createEncryptionZone -- Key: HDFS-6738 URL: https://issues.apache.org/jira/browse/HDFS-6738 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Priority: Minor Attachments: HDFS-6738.001.patch We can remove the following getEncryptionZoneForPath call below. {code} // done this way to handle edit log loading dir.unprotectedSetXAttrs(src, xattrs, EnumSet.of(XAttrSetFlag.CREATE)); ezi = getEncryptionZoneForPath(srcIIP); return ezXAttr; {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6738) Remove unnecessary getEncryptionZoneForPath call in EZManager#createEncryptionZone
[ https://issues.apache.org/jira/browse/HDFS-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6738: --- Resolution: Fixed Fix Version/s: fs-encryption (HADOOP-10150 and HDFS-6134) Status: Resolved (was: Patch Available) Thanks for the review [~umamaheswararao]. I've committed this to fs-encryption. Remove unnecessary getEncryptionZoneForPath call in EZManager#createEncryptionZone -- Key: HDFS-6738 URL: https://issues.apache.org/jira/browse/HDFS-6738 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Priority: Minor Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) Attachments: HDFS-6738.001.patch We can remove the following getEncryptionZoneForPath call below. {code} // done this way to handle edit log loading dir.unprotectedSetXAttrs(src, xattrs, EnumSet.of(XAttrSetFlag.CREATE)); ezi = getEncryptionZoneForPath(srcIIP); return ezXAttr; {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6737) DFSClient should use an IV generated based on the configured CipherSuite with the codecs used
[ https://issues.apache.org/jira/browse/HDFS-6737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-6737: -- Attachment: HDFS-6737.patch Attaching a patch as per my comment above to make it clear. DFSClient should use an IV generated based on the configured CipherSuite with the codecs used --- Key: HDFS-6737 URL: https://issues.apache.org/jira/browse/HDFS-6737 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Attachments: HDFS-6737.patch It seems we are using the encrypted data encryption key's IV as-is, but the underlying codec's cipher suite may expect a different IV length. So we should generate the IV from the configured codec's cipher suite.
{code}
final CryptoInputStream cryptoIn = new CryptoInputStream(dfsis,
    CryptoCodec.getInstance(conf, feInfo.getCipherSuite()),
    feInfo.getEncryptedDataEncryptionKey(), feInfo.getIV());
{code}
So, instead of using feInfo.getIV(), we should generate it like
{code}
byte[] iv = new byte[codec.getCipherSuite().getAlgorithmBlockSize()];
codec.generateSecureRandom(iv);
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
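The direction proposed in that message can be sketched in isolation (hedged: the class name and the fixed constant below are illustrative; AES_BLOCK_SIZE stands in for codec.getCipherSuite().getAlgorithmBlockSize(), and SecureRandom stands in for codec.generateSecureRandom):

```java
import java.security.SecureRandom;

// Illustrative sketch of generating an IV sized to the cipher suite's
// algorithm block size, rather than reusing a possibly mismatched stored IV.
// AES_BLOCK_SIZE stands in for codec.getCipherSuite().getAlgorithmBlockSize().
public class IvSketch {
    static final int AES_BLOCK_SIZE = 16; // AES/CTR expects a 16-byte IV

    static byte[] generateIv(int blockSize) {
        byte[] iv = new byte[blockSize];
        // stands in for codec.generateSecureRandom(iv)
        new SecureRandom().nextBytes(iv);
        return iv;
    }

    public static void main(String[] args) {
        System.out.println(generateIv(AES_BLOCK_SIZE).length); // prints 16
    }
}
```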
[jira] [Updated] (HDFS-6509) create a /.reserved/raw filesystem namespace
[ https://issues.apache.org/jira/browse/HDFS-6509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6509: --- Attachment: (was: HDFS-6509.001.patch) create a /.reserved/raw filesystem namespace Key: HDFS-6509 URL: https://issues.apache.org/jira/browse/HDFS-6509 Project: Hadoop HDFS Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6509distcpandDataatRestEncryption-2.pdf, HDFS-6509distcpandDataatRestEncryption.pdf This is part of the work for making distcp work with Data at Rest Encryption. Per the attached document, create a /.reserved/raw HDFS filesystem namespace that allows access to the encrypted bytes of a file. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5723) Append failed FINALIZED replica should not be accepted as valid when that block is under construction
[ https://issues.apache.org/jira/browse/HDFS-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071910#comment-14071910 ] Vinayakumar B commented on HDFS-5723: - Thanks [~umamaheswararao] for the explanation above and for reviewing the patch. I am confident the test failure is not related to this patch; it has been observed in many of the QA builds today and needs investigation. I will verify it and file a ticket if any issue is found. Append failed FINALIZED replica should not be accepted as valid when that block is under construction Key: HDFS-5723 URL: https://issues.apache.org/jira/browse/HDFS-5723 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.2.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-5723.patch, HDFS-5723.patch, HDFS-5723.patch Scenario: 1. 3-node cluster with dfs.client.block.write.replace-datanode-on-failure.enable set to false. 2. One file is written with 3 replicas: blk_id_gs1. 3. One of the datanodes, DN1, goes down. 4. The file is opened with append, and some more data is added to the file and synced (to only the 2 live nodes, DN2 and DN3): blk_id_gs2. 5. Now DN1 is restarted. 6. In its block report, DN1 reports the FINALIZED block blk_id_gs1. This should be marked corrupt, but since the NN has the appended block's state as UnderConstruction, it does not detect this block as corrupt at that time and adds it to the valid block locations. As long as the namenode is alive, this datanode will also be considered a valid replica, and read/append will fail on that datanode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6709) Implement off-heap data structures for NameNode and other HDFS memory optimization
[ https://issues.apache.org/jira/browse/HDFS-6709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071964#comment-14071964 ] Daryn Sharp commented on HDFS-6709: --- If {{Unsafe}} is being removed then I don't think we should create a dependency on it. Sadly, while investigating off-heap performance last fall, I found this article that claims off-heap reads via a {{DirectByteBuffer}} have *horrible* performance: http://www.javacodegeeks.com/2013/08/which-memory-is-faster-heap-or-bytebuffer-or-direct.html bq. With a hash table and a linked list, we could probably start off-heaping things such as the triplets array in the BlockInfo object. How do you envision off-heaping triplets in conjunction with those collections? Linked list entries cost 48 bytes on a 64-bit JVM; a hash table entry costs 52 bytes. I know your goal is reduced GC while ours is reduced memory usage, so it will be unacceptable if an off-heap implementation consumes even more memory - which, incidentally, will itself require GC, may cancel any off-heap benefit, and/or cause a performance degradation. Implement off-heap data structures for NameNode and other HDFS memory optimization -- Key: HDFS-6709 URL: https://issues.apache.org/jira/browse/HDFS-6709 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6709.001.patch We should investigate implementing off-heap data structures for the NameNode and other HDFS memory optimizations. These data structures could reduce latency by avoiding the long GC times that occur with large Java heaps. We could also avoid per-object memory overheads and control memory layout a little better. This would also allow us to use the JVM's compressed-oops optimization even with really large namespaces, if we could get the Java heap below 32 GB for those cases, providing another performance and memory efficiency boost.
-- This message was sent by Atlassian JIRA (v6.2#6252)
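The kind of layout being weighed in the exchange above - fixed-width records packed into one direct buffer instead of per-entry objects - can be sketched as follows (hedged: this is a generic illustration under assumed names, not the actual BlockInfo triplets implementation):

```java
import java.nio.ByteBuffer;

// Generic illustration (not the actual HDFS code): fixed-width long
// "triplets" packed into a single direct ByteBuffer, avoiding the
// ~48-52 byte per-entry overhead of linked-list/hash-table nodes
// mentioned in the discussion above.
public class OffHeapTriplets {
    static final int LONGS_PER_TRIPLET = 3;
    private final ByteBuffer buf;

    OffHeapTriplets(int capacity) {
        // off-heap allocation; reclaimed when the buffer object is collected
        buf = ByteBuffer.allocateDirect(capacity * LONGS_PER_TRIPLET * Long.BYTES);
    }

    void set(int i, long a, long b, long c) {
        int base = i * LONGS_PER_TRIPLET * Long.BYTES;
        buf.putLong(base, a);
        buf.putLong(base + Long.BYTES, b);
        buf.putLong(base + 2 * Long.BYTES, c);
    }

    long get(int i, int field) {
        return buf.getLong((i * LONGS_PER_TRIPLET + field) * Long.BYTES);
    }

    public static void main(String[] args) {
        OffHeapTriplets t = new OffHeapTriplets(4);
        t.set(2, 10L, 20L, 30L);
        System.out.println(t.get(2, 1)); // prints 20
    }
}
```

Whether such a layout actually wins depends on the read-path cost of direct buffers that Daryn raises; the sketch only shows the per-entry memory argument, not the performance trade-off.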
[jira] [Updated] (HDFS-6509) create a /.reserved/raw filesystem namespace
[ https://issues.apache.org/jira/browse/HDFS-6509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6509: --- Attachment: HDFS-6509.001.patch create a /.reserved/raw filesystem namespace Key: HDFS-6509 URL: https://issues.apache.org/jira/browse/HDFS-6509 Project: Hadoop HDFS Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6509.001.patch, HDFS-6509distcpandDataatRestEncryption-2.pdf, HDFS-6509distcpandDataatRestEncryption.pdf This is part of the work for making distcp work with Data at Rest Encryption. Per the attached document, create a /.reserved/raw HDFS filesystem namespace that allows access to the encrypted bytes of a file. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6509) create a /.reserved/raw filesystem namespace
[ https://issues.apache.org/jira/browse/HDFS-6509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6509: --- Attachment: HDFS-6509distcpandDataatRestEncryption-3.pdf The design doc has been updated to reflect choice of raw.* extended attribute namespace and the restriction that only the admin can create/access files in /.reserved/raw. create a /.reserved/raw filesystem namespace Key: HDFS-6509 URL: https://issues.apache.org/jira/browse/HDFS-6509 Project: Hadoop HDFS Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6509.001.patch, HDFS-6509distcpandDataatRestEncryption-2.pdf, HDFS-6509distcpandDataatRestEncryption-3.pdf, HDFS-6509distcpandDataatRestEncryption.pdf This is part of the work for making distcp work with Data at Rest Encryption. Per the attached document, create a /.reserved/raw HDFS filesystem namespace that allows access to the encrypted bytes of a file. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6686) Archival Storage: Use fallback storage types
[ https://issues.apache.org/jira/browse/HDFS-6686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze resolved HDFS-6686. --- Resolution: Fixed Fix Version/s: Archival Storage (HDFS-6584) Hadoop Flags: Reviewed I have committed this. Archival Storage: Use fallback storage types Key: HDFS-6686 URL: https://issues.apache.org/jira/browse/HDFS-6686 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Fix For: Archival Storage (HDFS-6584) Attachments: h6686_20140721.patch, h6686_20140721c.patch, h6686_20140723.patch HDFS-6671 changes replication monitor to use block storage policy for replication. It should also use the fallback storage types when a particular type of storage is full. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-448) An IOException thrown in the BlockReceiver file causes some tests to hang
[ https://issues.apache.org/jira/browse/HDFS-448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HDFS-448. --- Resolution: Fixed Release Note: Stale. An IOException thrown in the BlockReceiver file causes some tests to hang - Key: HDFS-448 URL: https://issues.apache.org/jira/browse/HDFS-448 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 0.20.1 Reporter: Konstantin Boudnik Attachments: TestFileCreation.log, jstack.log, log.exception Using the FI framework (HADOOP-6003), I'm injecting some faults into the BlockReceiver class. When DiskOutOfSpaceException is thrown at BlockReceiver.java:449, it causes the very first test case of org.apache.hadoop.hdfs.TestFileCreation to hang indefinitely. I am still not clear whether this is a test bug (hmm...) or yet another issue in HDFS, so I'm filing this bug against the test component for now. In the attachments: the test run log and the jstack log of the running VM under test. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6509) create a /.reserved/raw filesystem namespace
[ https://issues.apache.org/jira/browse/HDFS-6509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6509: --- Attachment: (was: HDFS-6509.001.patch) create a /.reserved/raw filesystem namespace Key: HDFS-6509 URL: https://issues.apache.org/jira/browse/HDFS-6509 Project: Hadoop HDFS Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6509distcpandDataatRestEncryption-2.pdf, HDFS-6509distcpandDataatRestEncryption-3.pdf, HDFS-6509distcpandDataatRestEncryption.pdf This is part of the work for making distcp work with Data at Rest Encryption. Per the attached document, create a /.reserved/raw HDFS filesystem namespace that allows access to the encrypted bytes of a file. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6665) Add tests for XAttrs in combination with viewfs
[ https://issues.apache.org/jira/browse/HDFS-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14072049#comment-14072049 ] Stephen Chu commented on HDFS-6665: --- I've submitted a patch for HADOOP-10887, similar to what we did for ACLs in HADOOP-10845. Once that's resolved, we can add HDFS tests for XAttrs + ViewFileSystem and ViewFs. Add tests for XAttrs in combination with viewfs --- Key: HDFS-6665 URL: https://issues.apache.org/jira/browse/HDFS-6665 Project: Hadoop HDFS Issue Type: Test Components: hdfs-client Affects Versions: 2.5.0 Reporter: Stephen Chu Assignee: Stephen Chu This is similar to HDFS-5624 (Add tests for ACLs in combination with viewfs) We should verify that XAttr operations work properly with viewfs, and that XAttr commands are routed to the correct namenode in a federated deployment. Also, we should make sure that the behavior of XAttr commands on internal dirs is consistent with other commands. For example, setPermission will throw the readonly AccessControlException for paths above the root mount entry. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6719) org.apache.hadoop.fs.TestDFVariations.testMount failed intermittently because specified path doesn't exist
[ https://issues.apache.org/jira/browse/HDFS-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14072061#comment-14072061 ] Arpit Agarwal commented on HDFS-6719: - Nitpick, test_root need not be static. Also if you post another patch you could consider just using the full test name for the directory instead of testdfv. +1 with that fixed. Separately we can check other tests for misuse of test.build.data. Quick grep gave the following: {code} FSTestWrapper.java FileContextMainOperationsBaseTest.java FileContextTestHelper.java FileContextURIBase.java FileSystemTestHelper.java MiniDFSCluster.java TestBlocksWithNotEnoughRacks.java TestChecksumFileSystem.java TestCopyPreserveFlag.java TestCreateEditsLog.java TestDFSUpgradeFromImage.java TestDecommissioningStatus.java TestEnhancedByteBufferAccess.java TestFSImageWithSnapshot.java TestFileUtil.java TestFsShellReturnCode.java TestHadoopArchives.java TestHarFileSystemBasics.java TestHardLink.java TestHdfsTextCommand.java TestHostsFiles.java TestJHLA.java TestListFiles.java TestLocalFileSystem.java TestNameNodeRecovery.java TestNativeIO.java TestPathData.java TestPread.java TestRenameWithSnapshots.java TestSeekBug.java TestSlive.java TestSnapshot.java TestStartup.java TestTextCommand.java {code} org.apache.hadoop.fs.TestDFVariations.testMount failed intermittently because specified path doesn't exist -- Key: HDFS-6719 URL: https://issues.apache.org/jira/browse/HDFS-6719 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6719.001.dbg.patch, HDFS-6719.002.dbg.patch, HDFS-6719.003.dbg.patch, HDFS-6719.004.dbg.patch, HDFS-6719.005.patch Failure message: {code} Error Message Specified path /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/hadoop-common-project/hadoop-common/target/test/datadoes not exist Stacktrace java.io.FileNotFoundException: Specified path 
/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/hadoop-common-project/hadoop-common/target/test/datadoes not exist at org.apache.hadoop.fs.DF.getMount(DF.java:109) at org.apache.hadoop.fs.TestDFVariations.testMount(TestDFVariations.java:54) Standard Output java.io.IOException: Fewer lines of output than expected: Filesystem 1K-blocks Used Available Use% Mounted on java.io.IOException: Unexpected empty line java.io.IOException: Could not parse line:19222656 10597036 7649060 59% / {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-450) eclipse-files target does not create HDFS_Ant_Builder
[ https://issues.apache.org/jira/browse/HDFS-450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HDFS-450. --- Resolution: Fixed We stepped on all of our ants. Closing. eclipse-files target does not create HDFS_Ant_Builder - Key: HDFS-450 URL: https://issues.apache.org/jira/browse/HDFS-450 Project: Hadoop HDFS Issue Type: Bug Components: build Affects Versions: 0.21.0 Reporter: Konstantin Shvachko This is the result of project splitting. The eclipse-files build target used to create Hadoop_Ant_Builder - an ant based builder for eclipse. The target still works fine for hadoop/common, but not for HDFS. I think for hdfs the builder should be called HDFS_Ant_Builder. I did not check map-reduce. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-452) Fs -count throws Access Control Exception when restricted access directories are present in the user's directory
[ https://issues.apache.org/jira/browse/HDFS-452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-452: -- Labels: newbie (was: ) Fs -count throws Access Control Exception when restricted access directories are present in the user's directory --- Key: HDFS-452 URL: https://issues.apache.org/jira/browse/HDFS-452 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.1 Reporter: Ravi Phulari Labels: newbie If any directories with access restricted from the user are present in the user's directory, then an fs -count operation on that directory throws an AccessControlException. e.g.
{code}
[rphulari@some-host ~]$ hadoop fs -ls /user/ | grep rphulari
drwxr--r--   - rphulari users   0 2009-06-09 22:05 /user/rphulari
[rphulari@some-host ~]$ hadoop fs -ls
Found 3 items
drwx--       - hdfs     users   0 2009-04-17 01:11 /user/rphulari/temp
drwxr--r--   - rphulari users   0 2009-05-06 22:02 /user/rphulari/temp2
-rw-r--r--   3 rphulari users   0 2009-05-06 22:11 /user/rphulari/test
[rphulari@some-host ~]$ hadoop fs -count /user/rphulari
count: org.apache.hadoop.security.AccessControlException: Permission denied: user=rphulari, access=READ_EXECUTE, inode=temp:hdfs:users:rwx--
{code}
The ideal output would be quota information for the user-owned dirs/files and an error notification for the files/dirs not owned by the user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-452) Fs -count throws Access Control Exception when restricted access directories are present in the user's directory
[ https://issues.apache.org/jira/browse/HDFS-452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072095#comment-14072095 ] Allen Wittenauer commented on HDFS-452: --- From trunk as of 8a795812e79033cb5b91389cc0e7e110deacbe0e
{code}
hdfs dfs -count /
count: Permission denied: user=root, access=READ_EXECUTE, inode=/user:aw:hdfs:drwx--
hdfs dfs -du /
du: Permission denied: user=root, access=READ_EXECUTE, inode=/copy:aw:hdfs:drwxr-x---
du: Permission denied: user=root, access=READ_EXECUTE, inode=/system:aw:hdfs:drwxr-x---
du: Permission denied: user=root, access=READ_EXECUTE, inode=/user:aw:hdfs:drwx--
0 /benchmarks
{code}
Fs -count throws Access Control Exception when restricted access directories are present in the user's directory --- Key: HDFS-452 URL: https://issues.apache.org/jira/browse/HDFS-452 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.1 Reporter: Ravi Phulari Labels: newbie If any directories with access restricted from the user are present in the user's directory, then an fs -count operation on that directory throws an AccessControlException. e.g.
{code}
[rphulari@some-host ~]$ hadoop fs -ls /user/ | grep rphulari
drwxr--r--   - rphulari users   0 2009-06-09 22:05 /user/rphulari
[rphulari@some-host ~]$ hadoop fs -ls
Found 3 items
drwx--       - hdfs     users   0 2009-04-17 01:11 /user/rphulari/temp
drwxr--r--   - rphulari users   0 2009-05-06 22:02 /user/rphulari/temp2
-rw-r--r--   3 rphulari users   0 2009-05-06 22:11 /user/rphulari/test
[rphulari@some-host ~]$ hadoop fs -count /user/rphulari
count: org.apache.hadoop.security.AccessControlException: Permission denied: user=rphulari, access=READ_EXECUTE, inode=temp:hdfs:users:rwx--
{code}
The ideal output would be quota information for the user-owned dirs/files and an error notification for the files/dirs not owned by the user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-460) Expose NN and DN hooks to service plugins
[ https://issues.apache.org/jira/browse/HDFS-460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HDFS-460. --- Resolution: Won't Fix Closing this as won't fix and leaving it at that. Expose NN and DN hooks to service plugins - Key: HDFS-460 URL: https://issues.apache.org/jira/browse/HDFS-460 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Todd Lipcon Assignee: Todd Lipcon This is the other half of the old HADOOP-5640 (Allow ServicePlugins to hook callbacks into key service events). It adds hooks to the NN and DN to expose certain events to plugins. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6711) FSNamesystem#getAclStatus does not write to the audit log.
[ https://issues.apache.org/jira/browse/HDFS-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14072109#comment-14072109 ] Chris Nauroth commented on HDFS-6711: - I'm linking this to HDFS-5730, which seeks to make the audit logging policy consistent across all APIs. The decisions in that issue will influence our choice for how to fix this bug. FSNamesystem#getAclStatus does not write to the audit log. -- Key: HDFS-6711 URL: https://issues.apache.org/jira/browse/HDFS-6711 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0, 2.4.0 Reporter: Chris Nauroth Priority: Minor Consider writing an event to the audit log for the {{getAclStatus}} method. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5730) Inconsistent Audit logging for HDFS APIs
[ https://issues.apache.org/jira/browse/HDFS-5730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072115#comment-14072115 ] Hadoop QA commented on HDFS-5730: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12623677/HDFS-5730.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7443//console This message is automatically generated. Inconsistent Audit logging for HDFS APIs Key: HDFS-5730 URL: https://issues.apache.org/jira/browse/HDFS-5730 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0, 2.2.0 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Attachments: HDFS-5730.patch, HDFS-5730.patch When looking at the audit logs in HDFS, I am seeing some inconsistencies between what was logged historically and what has been added recently. For more details please check the comments. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6719) org.apache.hadoop.fs.TestDFVariations.testMount failed intermittently because specified path doesn't exist
[ https://issues.apache.org/jira/browse/HDFS-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072120#comment-14072120 ] Yongjun Zhang commented on HDFS-6719: - Hi [~arpitagarwal], thanks a lot for the review! I uploaded a new patch (006) that changes the dir name to the full test name. I can't change test_root to non-static, because it's referenced in the static inner class XXDF. Filed HADOOP-10889 for your good suggestion to fix misuse of test.build.data. org.apache.hadoop.fs.TestDFVariations.testMount failed intermittently because specified path doesn't exist -- Key: HDFS-6719 URL: https://issues.apache.org/jira/browse/HDFS-6719 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6719.001.dbg.patch, HDFS-6719.002.dbg.patch, HDFS-6719.003.dbg.patch, HDFS-6719.004.dbg.patch, HDFS-6719.005.patch, HDFS-6719.006.patch Failure message:
{code}
Error Message
Specified path /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/hadoop-common-project/hadoop-common/target/test/datadoes not exist
Stacktrace
java.io.FileNotFoundException: Specified path /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/hadoop-common-project/hadoop-common/target/test/datadoes not exist
 at org.apache.hadoop.fs.DF.getMount(DF.java:109)
 at org.apache.hadoop.fs.TestDFVariations.testMount(TestDFVariations.java:54)
Standard Output
java.io.IOException: Fewer lines of output than expected: Filesystem 1K-blocks Used Available Use% Mounted on
java.io.IOException: Unexpected empty line
java.io.IOException: Could not parse line:19222656 10597036 7649060 59% /
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
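The static constraint mentioned in that comment comes from Java's scoping rules: a static nested class has no enclosing instance, so any outer field it reads directly must itself be static. A minimal reduction (hedged: all names below are hypothetical; XXDF here only mimics the test's static inner stub):

```java
// Hypothetical reduction of why test_root must stay static: the static
// nested class below can only reference static members of the outer class.
public class TestDirHolder {
    // shared test directory, analogous to test_root in TestDFVariations
    static final String TEST_ROOT =
        System.getProperty("test.build.data", "target/test/data");

    static class XXDF { // mimics the test's static inner DF stub
        String getDirPath() {
            return TEST_ROOT; // legal only because TEST_ROOT is static
        }
    }

    public static void main(String[] args) {
        System.out.println(new XXDF().getDirPath());
    }
}
```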
[jira] [Commented] (HDFS-6719) org.apache.hadoop.fs.TestDFVariations.testMount failed intermittently because specified path doesn't exist
[ https://issues.apache.org/jira/browse/HDFS-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14072129#comment-14072129 ] Arpit Agarwal commented on HDFS-6719: - Sorry I missed that, +1 for the patch. Will commit it shortly. Thanks Yongjun! org.apache.hadoop.fs.TestDFVariations.testMount failed intermittently because specified path doesn't exist -- Key: HDFS-6719 URL: https://issues.apache.org/jira/browse/HDFS-6719 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6719.001.dbg.patch, HDFS-6719.002.dbg.patch, HDFS-6719.003.dbg.patch, HDFS-6719.004.dbg.patch, HDFS-6719.005.patch, HDFS-6719.006.patch Failure message: {code} Error Message Specified path /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/hadoop-common-project/hadoop-common/target/test/datadoes not exist Stacktrace java.io.FileNotFoundException: Specified path /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/hadoop-common-project/hadoop-common/target/test/datadoes not exist at org.apache.hadoop.fs.DF.getMount(DF.java:109) at org.apache.hadoop.fs.TestDFVariations.testMount(TestDFVariations.java:54) Standard Output java.io.IOException: Fewer lines of output than expected: Filesystem 1K-blocks Used Available Use% Mounted on java.io.IOException: Unexpected empty line java.io.IOException: Could not parse line:19222656 10597036 7649060 59% / {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6570) add api that enables checking if a user has certain permissions on a file
[ https://issues.apache.org/jira/browse/HDFS-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072126#comment-14072126 ] Chris Nauroth commented on HDFS-6570: - Jitendra, thanks for incorporating the feedback. I think this is almost ready. I see just one more thing to fix, and I have recommendations on a few more test cases to add. I expect the patch is already correct for all of these suggested test cases, so adding them would just be helpful for preventing regressions in the future. # {{GetOpParam}}: It looks like the convention on WebHDFS operation names is to put all the words together, not separated by underscore. Let's change {{CHECK_ACCESS}} to {{CHECKACCESS}}. This is actually how you named the operation in the docs already. # {{TestPermissionSymlinks}}: Let's add a test asserting that a call to check access for a symlink checks the permissions of its target. (Symlinks always have 777, so it wouldn't be correct to check the symlink inode directly.) # {{TestSafeMode#testOperationsWhileInSafeMode}}: Let's make a small change here to add a call to check access while in safe mode. This is a read-only operation, so we expect it to work during safe mode. # {{TestAclWithSnapshot}}: If there is a snapshot, and the original inode's permissions change, then checking access on the snapshot inode must still enforce the old permissions, and checking access on the current version of the inode must reflect the changes. I think the current patch does this correctly, but let's test to make sure. Snapshot tests like this need a lot of setup, so I recommend we just add a few quick access check calls to the 4 existing {{testOriginalAclEnforced*}} tests in this suite. That way, we can get a free ride on the setup code that's already done here. :-) # BTW, I agree with what you did for audit logging in this version of the patch. HDFS-5730 has more discussion on making audit logging consistent across all APIs. bq. -1 core tests. 
The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: These look like spurious test failures. They passed for me locally. add api that enables checking if a user has certain permissions on a file - Key: HDFS-6570 URL: https://issues.apache.org/jira/browse/HDFS-6570 Project: Hadoop HDFS Issue Type: Bug Reporter: Thejas M Nair Assignee: Jitendra Nath Pandey Attachments: HDFS-6570-prototype.1.patch, HDFS-6570.2.patch, HDFS-6570.3.patch For some of the authorization modes in Hive, the servers in Hive check if a given user has permissions on a certain file or directory. For example, the storage based authorization mode allows hive table metadata to be modified only when the user has access to the corresponding table directory on hdfs. There are likely to be such use cases outside of Hive as well. HDFS does not provide an api for such checks. As a result, the logic to check if a user has permissions on a directory gets replicated in Hive. This results in duplicate logic and introduces the possibility of inconsistencies in the interpretation of the permission model. This becomes a bigger problem with the complexity of ACL logic. HDFS should provide an api that provides functionality that is similar to the access function in unistd.h - http://linux.die.net/man/2/access . -- This message was sent by Atlassian JIRA (v6.2#6252)
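The goal of HDFS-6570, per the description, is an HDFS analogue of POSIX access(2). As a rough illustration of the calling convention (a plain-Java sketch against the local filesystem, not the Hadoop patch itself; the name checkAccess and its boolean flags are hypothetical), the probe either returns normally or throws, rather than returning a status code:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class AccessCheck {
    // Returns normally if the requested modes are granted, else throws,
    // mirroring access(2) semantics instead of replicating permission logic
    // in the caller. Names and signature are illustrative only.
    static void checkAccess(Path p, boolean read, boolean write) throws IOException {
        if (!Files.exists(p)) {
            throw new IOException("Specified path " + p + " does not exist");
        }
        if (read && !Files.isReadable(p)) {
            throw new IOException("read access denied: " + p);
        }
        if (write && !Files.isWritable(p)) {
            throw new IOException("write access denied: " + p);
        }
    }
}
```

A caller such as Hive's storage-based authorization would invoke the check before mutating table metadata and treat the exception as a denial, instead of re-implementing the permission model client-side.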
[jira] [Updated] (HDFS-6719) TestDFVariations.testMount fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6719: Summary: TestDFVariations.testMount fails intermittently (was: org.apache.hadoop.fs.TestDFVariations.testMount failed intermittently because specified path doesn't exist) TestDFVariations.testMount fails intermittently --- Key: HDFS-6719 URL: https://issues.apache.org/jira/browse/HDFS-6719 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6719.001.dbg.patch, HDFS-6719.002.dbg.patch, HDFS-6719.003.dbg.patch, HDFS-6719.004.dbg.patch, HDFS-6719.005.patch, HDFS-6719.006.patch Failure message: {code} Error Message Specified path /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/hadoop-common-project/hadoop-common/target/test/datadoes not exist Stacktrace java.io.FileNotFoundException: Specified path /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/hadoop-common-project/hadoop-common/target/test/datadoes not exist at org.apache.hadoop.fs.DF.getMount(DF.java:109) at org.apache.hadoop.fs.TestDFVariations.testMount(TestDFVariations.java:54) Standard Output java.io.IOException: Fewer lines of output than expected: Filesystem 1K-blocks Used Available Use% Mounted on java.io.IOException: Unexpected empty line java.io.IOException: Could not parse line:19222656 10597036 7649060 59% / {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6422) getfattr in CLI doesn't throw exception or return non-0 return code when xattr doesn't exist
[ https://issues.apache.org/jira/browse/HDFS-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-6422: -- Resolution: Fixed Fix Version/s: 2.5.0 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I have just committed this to trunk and branch-2. Thanks a lot Charles for the patch. getfattr in CLI doesn't throw exception or return non-0 return code when xattr doesn't exist Key: HDFS-6422 URL: https://issues.apache.org/jira/browse/HDFS-6422 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.5.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Blocker Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6422.005.patch, HDFS-6422.006.patch, HDFS-6422.007.patch, HDFS-6422.008.patch, HDFS-6422.009.patch, HDFS-6422.010.patch, HDFS-6422.1.patch, HDFS-6422.2.patch, HDFS-6422.3.patch, HDFS-6474.4.patch, editsStored If you do hdfs dfs -getfattr -n user.blah /foo and user.blah doesn't exist, the command prints # file: /foo and a 0 return code. It should print an exception and return a non-0 return code instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6422) getfattr in CLI doesn't throw exception or return non-0 return code when xattr doesn't exist
[ https://issues.apache.org/jira/browse/HDFS-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072176#comment-14072176 ] Hudson commented on HDFS-6422: -- FAILURE: Integrated in Hadoop-trunk-Commit #5953 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5953/]) HDFS-6422. getfattr in CLI doesn't throw exception or return non-0 return code when xattr doesn't exist. (Charles Lamb via umamahesh) (umamahesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1612922) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/XAttrPermissionFilter.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/XAttrNameParam.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShell.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSXAttrBaseTest.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeRetryCache.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestRetryCacheWithHA.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/resources/TestParam.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored.xml getfattr in CLI doesn't throw exception or return non-0 return code when xattr doesn't exist Key: HDFS-6422 URL: https://issues.apache.org/jira/browse/HDFS-6422 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.5.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Blocker Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6422.005.patch, HDFS-6422.006.patch, HDFS-6422.007.patch, HDFS-6422.008.patch, HDFS-6422.009.patch, HDFS-6422.010.patch, HDFS-6422.1.patch, HDFS-6422.2.patch, HDFS-6422.3.patch, HDFS-6474.4.patch, editsStored If you do hdfs dfs -getfattr -n user.blah /foo and user.blah doesn't exist, the command prints # file: /foo and a 0 return code. It should print an exception and return a non-0 return code instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-488) Implement moveToLocal HDFS command
[ https://issues.apache.org/jira/browse/HDFS-488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-488: -- Labels: newbie (was: ) Implement moveToLocal HDFS command --- Key: HDFS-488 URL: https://issues.apache.org/jira/browse/HDFS-488 Project: Hadoop HDFS Issue Type: Bug Reporter: Ravi Phulari Assignee: Ravi Phulari Labels: newbie Surprisingly executing HDFS FsShell command -moveToLocal outputs - Option '-moveToLocal' is not implemented yet. {code} statepick-lm:Hadoop rphulari$ bin/hadoop fs -moveToLocal bt t Option '-moveToLocal' is not implemented yet. {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-488) Implement moveToLocal HDFS command
[ https://issues.apache.org/jira/browse/HDFS-488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072191#comment-14072191 ] Allen Wittenauer commented on HDFS-488: --- {code} $ bin/hadoop fs -moveToLocal moveToLocal: Option '-moveToLocal' is not implemented yet. {code} This is still broken in 3.x trunk. Implement moveToLocal HDFS command --- Key: HDFS-488 URL: https://issues.apache.org/jira/browse/HDFS-488 Project: Hadoop HDFS Issue Type: Bug Reporter: Ravi Phulari Assignee: Ravi Phulari Labels: newbie Surprisingly executing HDFS FsShell command -moveToLocal outputs - Option '-moveToLocal' is not implemented yet. {code} statepick-lm:Hadoop rphulari$ bin/hadoop fs -moveToLocal bt t Option '-moveToLocal' is not implemented yet. {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6709) Implement off-heap data structures for NameNode and other HDFS memory optimization
[ https://issues.apache.org/jira/browse/HDFS-6709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072197#comment-14072197 ] Colin Patrick McCabe commented on HDFS-6709: If Unsafe is removed, then we'll work around it the same way we work around lack of symlink or hardlink support, missing error information from mkdir, etc. As you can see in this patch, we don't need Unsafe, we just use it because it's faster. I would assume that if Unsafe is removed, there will be work on improving DirectByteBuffer and JNI performance or putting in place other alternate APIs that allow Java to function effectively on the server. Otherwise, the future of the platform doesn't look good. Even Haskell has an Unsafe package. bq. How do you envision off-heaping triplets in conjunction with those collections? Linked list entries cost 48 bytes on a 64-bit jvm. A hash table entry costs 52 bytes. I know your goal is reduced GC while ours is reduced memory usage, so it'll be unacceptable if an off-heap implementation consumes even more memory - which incidentally will require GC and may cancel any off-heap benefit, and/or cause a performance degradation. With off-heap objects, the sizes can be whatever we want. I think a basic linked list entry would be 16 bytes (two 8-byte prev and next pointers), plus the size of the payload. A hash table entry has no real minimum size, since again, it's just a memory region that contains whatever we want. We will be able to do a lot better than the JVM because of a few things: * the jvm must store runtime type information (RTTI) for each object, and we won't * the 64-bit jvm usually aligns to 8 bytes, but we don't have to * we don't have to implement a lock bit, or any of that * we can use value types, and current JVMs can't (although future ones will be able to) * the JVM doesn't know that you will create 1 million of an object; it just creates a generic object layout that must balance access speed and object size. 
Since we know, we can be more clever. Implement off-heap data structures for NameNode and other HDFS memory optimization -- Key: HDFS-6709 URL: https://issues.apache.org/jira/browse/HDFS-6709 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6709.001.patch We should investigate implementing off-heap data structures for NameNode and other HDFS memory optimization. These data structures could reduce latency by avoiding the long GC times that occur with large Java heaps. We could also avoid per-object memory overheads and control memory layout a little bit better. This also would allow us to use the JVM's compressed oops optimization even with really large namespaces, if we could get the Java heap below 32 GB for those cases. This would provide another performance and memory efficiency boost. -- This message was sent by Atlassian JIRA (v6.2#6252)
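To make the entry-size arithmetic above concrete, here is a minimal sketch (illustrative only, not the HDFS-6709 patch, and using a plain direct ByteBuffer rather than Unsafe) of a doubly linked list of longs stored off-heap: each entry is exactly 24 bytes, two 8-byte prev/next offsets plus the 8-byte payload, with no object header, RTTI, or lock bit.

```java
import java.nio.ByteBuffer;

// Toy off-heap doubly linked list of longs. Each entry occupies 24 bytes
// (prev offset, next offset, payload) in a direct buffer, versus roughly
// 48 bytes for a java.util.LinkedList node on a 64-bit JVM.
class OffHeapList {
    private static final int PREV = 0, NEXT = 8, VALUE = 16, ENTRY_SIZE = 24;
    private static final long NIL = -1;

    private final ByteBuffer arena;  // off-heap memory region
    private long head = NIL, tail = NIL;
    private int nextFree = 0;        // bump allocator; no free() in this sketch

    OffHeapList(int maxEntries) {
        arena = ByteBuffer.allocateDirect(maxEntries * ENTRY_SIZE);
    }

    void add(long value) {           // append at tail
        int e = nextFree;            // caller must stay within maxEntries
        nextFree += ENTRY_SIZE;
        arena.putLong(e + PREV, tail);
        arena.putLong(e + NEXT, NIL);
        arena.putLong(e + VALUE, value);
        if (tail != NIL) {
            arena.putLong((int) tail + NEXT, e);
        } else {
            head = e;
        }
        tail = e;
    }

    long sum() {                     // forward traversal via next offsets
        long s = 0;
        for (long e = head; e != NIL; e = arena.getLong((int) e + NEXT)) {
            s += arena.getLong((int) e + VALUE);
        }
        return s;
    }
}
```

The allocator strategy and value-type layout are where the real engineering lives; the sketch only shows why per-entry overhead can be driven down to the raw pointer and payload bytes.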
[jira] [Updated] (HDFS-488) Implement moveToLocal HDFS command
[ https://issues.apache.org/jira/browse/HDFS-488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-488: -- Attachment: Screen Shot 2014-07-23 at 12.28.23 PM 1.png ... and the documentation for this is just sad. Implement moveToLocal HDFS command --- Key: HDFS-488 URL: https://issues.apache.org/jira/browse/HDFS-488 Project: Hadoop HDFS Issue Type: Bug Reporter: Ravi Phulari Assignee: Ravi Phulari Labels: newbie Attachments: Screen Shot 2014-07-23 at 12.28.23 PM 1.png Surprisingly executing HDFS FsShell command -moveToLocal outputs - Option '-moveToLocal' is not implemented yet. {code} statepick-lm:Hadoop rphulari$ bin/hadoop fs -moveToLocal bt t Option '-moveToLocal' is not implemented yet. {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6114) Block Scan log rolling will never happen if blocks written continuously leading to huge size of dncp_block_verification.log.curr
[ https://issues.apache.org/jira/browse/HDFS-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072218#comment-14072218 ] Colin Patrick McCabe commented on HDFS-6114: +1. Thanks, Vinayakumar. Block Scan log rolling will never happen if blocks written continuously leading to huge size of dncp_block_verification.log.curr Key: HDFS-6114 URL: https://issues.apache.org/jira/browse/HDFS-6114 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.3.0, 2.4.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Critical Attachments: HDFS-6114.patch, HDFS-6114.patch, HDFS-6114.patch, HDFS-6114.patch 1. {{BlockPoolSliceScanner#scan()}} will not return until all the blocks are scanned. 2. If blocks (several MBs in size) are written to the datanode continuously, then one iteration of {{BlockPoolSliceScanner#scan()}} will be continuously scanning the blocks. 3. These blocks will be deleted after some time (enough to get the blocks scanned). 4. As block scanning is throttled, verification of all blocks will take a long time. 5. Rolling will never happen, so even though the total number of blocks in the datanode doesn't increase, entries (which include stale entries of deleted blocks) in *dncp_block_verification.log.curr* continuously increase, leading to huge size. In one of our envs, it grew to more than 1TB where the total number of blocks was only ~45k. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6698) try to optimize DFSInputStream.getFileLength()
[ https://issues.apache.org/jira/browse/HDFS-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072229#comment-14072229 ] Colin Patrick McCabe commented on HDFS-6698: Changing the locking model for the DFSInputStream seems like a big project. Can we have a design doc for this and for HDFS-6735? try to optimize DFSInputStream.getFileLength() -- Key: HDFS-6698 URL: https://issues.apache.org/jira/browse/HDFS-6698 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6698.txt, HDFS-6698.txt HBase prefers to invoke read() serving scan requests, and invoke pread() serving get requests, because pread() holds almost no locks. Let's imagine there's a read() running; because the definition is: {code} public synchronized int read {code} no other read() request can run concurrently. This is known, but pread() also cannot run, because: {code} public int read(long position, byte[] buffer, int offset, int length) throws IOException { // sanity checks dfsClient.checkOpen(); if (closed) { throw new IOException("Stream closed"); } failures = 0; long filelen = getFileLength(); {code} the getFileLength() also needs the lock. So we need to figure out a lock-free impl for getFileLength() before the HBase multi-stream feature is done. [~saint@gmail.com] -- This message was sent by Atlassian JIRA (v6.2#6252)
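One direction for a lock-free getFileLength(), sketched here as an idea rather than as the attached patch, is to publish the computed length through a volatile field: writers update it under the stream lock, while pread() reads it without contending for that lock.

```java
// Sketch: a volatile snapshot of file length lets positional reads skip
// the stream lock; state mutation stays synchronized. Field and method
// names are illustrative, not taken from DFSInputStream.
class StreamLength {
    private volatile long fileLength;   // read lock-free by pread()
    private long bytesInLastBlock;      // mutated only while holding 'this'

    synchronized void update(long completedBytes, long inFlightBytes) {
        bytesInLastBlock = inFlightBytes;
        fileLength = completedBytes + inFlightBytes;  // single atomic publish
    }

    long getFileLength() {              // no synchronization needed
        return fileLength;
    }
}
```

The trade-off is that a lock-free reader may observe a length that is one update stale, which is acceptable for a length that only grows while the last block is under construction.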
[jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072228#comment-14072228 ] Colin Patrick McCabe commented on HDFS-6735: Changing the locking model for the DFSInputStream seems like a big project. Can we have a design doc for this and for HDFS-6698? I'm also not sure if we document the thread-safety guarantees offered by the DFSInputStream anywhere. Most things seem to be protected by locks, but we should discuss what the guarantees are and put them as comments in the code explicitly. We should figure out what the thread-safety guarantees are (and which operations block which other operations). For example, a non-positional read probably always has to block another non-positional read, but the situation with other ops is less clear. A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream - Key: HDFS-6735 URL: https://issues.apache.org/jira/browse/HDFS-6735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6735-v2.txt, HDFS-6735.txt In the current DFSInputStream impl, there are a couple of coarse-grained locks in the read/pread path, and it has become an HBase read latency pain point. In HDFS-6698, I made a minor patch against the first encountered lock, around getFileLength; indeed, after reading the code and testing, it shows there are still other locks we could improve. In this jira, I'll make a patch against the other locks, and a simple test case to show the issue and the improved result. This is important for HBase, since in the current HFile read path, we issue all read()/pread() requests on the same DFSInputStream for one HFile. (A multi-stream solution is another story I plan to do, but it will probably take more time than I expected) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-494) TestHDFSCLI tests FsShell, which is in common, causing a cross-project dependency
[ https://issues.apache.org/jira/browse/HDFS-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-494: -- Labels: newbie (was: ) TestHDFSCLI tests FsShell, which is in common, causing a cross-project dependency - Key: HDFS-494 URL: https://issues.apache.org/jira/browse/HDFS-494 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Jakob Homan Labels: newbie Currently FsShell (in Common) is mainly exercised via TestHDFSCLI, which is in HDFS. This means that changes to FsShell have to be tested in a different project, as happened with HADOOP-6139 and HDFS-489. This creates a problem: once a patch to FsShell is done, there can be a time when TestHDFSCLI is broken since it's not caught up, which is what happened with these patches. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-494) TestHDFSCLI tests FsShell, which is in common, causing a cross-project dependency
[ https://issues.apache.org/jira/browse/HDFS-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-494: -- Component/s: test TestHDFSCLI tests FsShell, which is in common, causing a cross-project dependency - Key: HDFS-494 URL: https://issues.apache.org/jira/browse/HDFS-494 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Jakob Homan Labels: newbie Currently FsShell (in Common) is mainly exercised via TestHDFSCLI, which is in HDFS. This means that changes to FsShell have to be tested in a different project, as happened with HADOOP-6139 and HDFS-489. This creates a problem: once a patch to FsShell is done, there can be a time when TestHDFSCLI is broken since it's not caught up, which is what happened with these patches. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6657) Remove link to 'Legacy UI' in trunk's Namenode UI
[ https://issues.apache.org/jira/browse/HDFS-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072261#comment-14072261 ] Haohui Mai commented on HDFS-6657: -- {code}
<body>
<script type="text/javascript">
//<![CDATA[
window.location.href='dfshealth.html';
//]]>
</script>
<h1>Hadoop Administration</h1>
<ul>
<li><a href="dfshealth.jsp">DFS Health/Status</a></li>
</ul>
</body>
{code} Maybe we can get rid of the code above as well? Remove link to 'Legacy UI' in trunk's Namenode UI - Key: HDFS-6657 URL: https://issues.apache.org/jira/browse/HDFS-6657 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Minor Attachments: HDFS-6657.patch Link to 'Legacy UI' provided on namenode's UI. Since in trunk, all jsp pages are removed, these links will not work. can be removed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-515) test-patch doesn't work on git checkouts
[ https://issues.apache.org/jira/browse/HDFS-515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HDFS-515. --- Resolution: Fixed Likely stale. test-patch doesn't work on git checkouts Key: HDFS-515 URL: https://issues.apache.org/jira/browse/HDFS-515 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Jakob Homan Currently test-patch doesn't work on the source trees checked out via git. This is because svn will remotely fetch the test-patch.sh from the common project, but git doesn't do this. Not sure if it's possible to have git do the same thing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6739) Add getDatanodeStorageReport to ClientProtocol
Tsz Wo Nicholas Sze created HDFS-6739: - Summary: Add getDatanodeStorageReport to ClientProtocol Key: HDFS-6739 URL: https://issues.apache.org/jira/browse/HDFS-6739 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze ClientProtocol has a getDatanodeReport(..) method for retrieving a datanode report from the namenode. However, there is no way to get a datanode storage report. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6739) Add getDatanodeStorageReport to ClientProtocol
[ https://issues.apache.org/jira/browse/HDFS-6739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6739: -- Attachment: h6739_20140724.patch h6739_20140724.patch: adds getDatanodeStorageReport(..) Add getDatanodeStorageReport to ClientProtocol -- Key: HDFS-6739 URL: https://issues.apache.org/jira/browse/HDFS-6739 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h6739_20140724.patch ClientProtocol has a getDatanodeReport(..) method for retrieving a datanode report from the namenode. However, there is no way to get a datanode storage report. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6739) Add getDatanodeStorageReport to ClientProtocol
[ https://issues.apache.org/jira/browse/HDFS-6739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6739: -- Status: Patch Available (was: Open) Add getDatanodeStorageReport to ClientProtocol -- Key: HDFS-6739 URL: https://issues.apache.org/jira/browse/HDFS-6739 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h6739_20140724.patch ClientProtocol has a getDatanodeReport(..) method for retrieving a datanode report from the namenode. However, there is no way to get a datanode storage report. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6698) try to optimize DFSInputStream.getFileLength()
[ https://issues.apache.org/jira/browse/HDFS-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072305#comment-14072305 ] stack commented on HDFS-6698: - bq. Changing the locking model for the DFSInputStream seems like a big project. [~cmccabe] The patch attached here seems to respect the existing locking model and the conditions that prefix length calculations. Or are you thinking that we close this issue and do all synchronization changes in HDFS-6735, all in one go? Thanks. try to optimize DFSInputStream.getFileLength() -- Key: HDFS-6698 URL: https://issues.apache.org/jira/browse/HDFS-6698 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6698.txt, HDFS-6698.txt HBase prefers to invoke read() serving scan requests, and invoke pread() serving get requests, because pread() holds almost no locks. Let's imagine there's a read() running; because the definition is: {code} public synchronized int read {code} no other read() request can run concurrently. This is known, but pread() also cannot run, because: {code} public int read(long position, byte[] buffer, int offset, int length) throws IOException { // sanity checks dfsClient.checkOpen(); if (closed) { throw new IOException("Stream closed"); } failures = 0; long filelen = getFileLength(); {code} the getFileLength() also needs the lock. So we need to figure out a lock-free impl for getFileLength() before the HBase multi-stream feature is done. [~saint@gmail.com] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-4257) The ReplaceDatanodeOnFailure policies could have a forgiving option
[ https://issues.apache.org/jira/browse/HDFS-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072325#comment-14072325 ] Colin Patrick McCabe commented on HDFS-4257: Nicholas, thank you for looking at this. I can tell there have been a lot of JIRAs about this problem (HDFS-3091, HDFS-3179, HDFS-5131, and HDFS-4600 are all somewhat related). The basic problem that seems to happen a lot is: 1. Client loses network connectivity 2. The client tries to write. But because it can't see anyone else in the network, it can only write to 1 replica at most. 3. The pipeline recovery code throws a hard error because it can't get 3 replicas. 4. Client gets a write error and tries to close the file. That just gives another error. The client goes into a bad state. Sometimes the client continues trying to close the file and continues getting an exception (although this behavior was changed recently). Due to HDFS-4504, the file never gets cleaned up on the NameNode if the client is long-lived. HBase and Flume are both long-lived clients that have the problem with HDFS-4504. HBase avoids this particular problem by not using the HDFS pipeline recovery code, but simply doing their own thing by checking the current number of replicas. So they never get to step #3 because the pipeline recovery is turned off. For Flume, though, this is a major problem. The approach in this patch seems to be that instead of throwing a hard error in step #3, the DFSClient should simply accept only having 1 replica. This will certainly fix the problem for Flume. But imagine the following scenario: 1. Client loses network connectivity 2. The client tries to write. But because it can't see anyone else in the network, it can only write to 1 replica at most. 3. The pipeline recovery code accepts only using 1 local replica 4. The client gets network connectivity back 5. A long time passes 6. The hard disks on the client node go down. 
In this scenario, we lose the data after step #6. The problem is that while the latest replica is under construction, we won't try to replicate it to other nodes, even though the network is back. If we had a background thread that tried to repair the pipeline in step #5, we could avoid this problem. Another possibility is that instead of throwing an error or continuing in step #3, we could simply wait for a configurable period (after logging a message). The ReplaceDatanodeOnFailure policies could have a forgiving option --- Key: HDFS-4257 URL: https://issues.apache.org/jira/browse/HDFS-4257 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client Affects Versions: 2.0.2-alpha Reporter: Harsh J Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h4257_20140325.patch, h4257_20140325b.patch, h4257_20140326.patch Similar question has previously come over HDFS-3091 and friends, but the essential problem is: Why can't I write to my cluster of 3 nodes, when I just have 1 node available at a point in time.. The policies cover the 4 options, with {{Default}} being default: {{Disable}} - Disables the whole replacement concept by throwing out an error (at the server) or acts as {{Never}} at the client. {{Never}} - Never replaces a DN upon pipeline failures (not too desirable in many cases). {{Default}} - Replace based on a few conditions, but whose minimum never touches 1. We always fail if only one DN remains and none others can be added. {{Always}} - Replace no matter what. Fail if can't replace. Would it not make sense to have an option similar to Always/Default, where despite _trying_, if it isn't possible to have 1 DN in the pipeline, do not fail. I think that is what the former write behavior was, and what fit with the minimum replication factor allowed value. 
Why is it grossly wrong to pass a write from a client for a block with just 1 remaining replica in the pipeline (the minimum of 1 grows with the replication factor demanded from the write), when replication is taken care of immediately afterwards? How often have we seen missing blocks arise out of allowing this + facing a big rack(s) failure or so? -- This message was sent by Atlassian JIRA (v6.2#6252)
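The four policies described above are selected through client-side configuration; a sketch of the relevant hdfs-site.xml entries follows (key names as used by the existing ReplaceDatanodeOnFailure feature; verify them against your Hadoop release):

```xml
<!-- Client-side datanode replacement policy for pipeline recovery -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <!-- NEVER | DEFAULT | ALWAYS -->
  <value>DEFAULT</value>
</property>
```

The forgiving option requested here would presumably surface as an additional best-effort style setting alongside these keys.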
[jira] [Updated] (HDFS-6114) Block Scan log rolling will never happen if blocks written continuously leading to huge size of dncp_block_verification.log.curr
[ https://issues.apache.org/jira/browse/HDFS-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6114: --- Resolution: Fixed Fix Version/s: 2.6.0 Target Version/s: 2.6.0 Status: Resolved (was: Patch Available) Block Scan log rolling will never happen if blocks written continuously leading to huge size of dncp_block_verification.log.curr Key: HDFS-6114 URL: https://issues.apache.org/jira/browse/HDFS-6114 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.3.0, 2.4.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Critical Fix For: 2.6.0 Attachments: HDFS-6114.patch, HDFS-6114.patch, HDFS-6114.patch, HDFS-6114.patch
1. {{BlockPoolSliceScanner#scan()}} will not return until all the blocks are scanned.
2. If blocks (several MBs in size) are written to the datanode continuously, then one iteration of {{BlockPoolSliceScanner#scan()}} will be continuously scanning the blocks.
3. These blocks will be deleted after some time (long enough for them to get scanned).
4. As block scanning is throttled, verification of all blocks takes a long time.
5. Rolling will never happen, so even though the total number of blocks on the datanode doesn't increase, entries (including stale entries for deleted blocks) in *dncp_block_verification.log.curr* continuously accumulate, leading to a huge file. In one of our environments, it grew to more than 1TB while the total number of blocks was only ~45k. -- This message was sent by Atlassian JIRA (v6.2#6252)
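The failure mode in steps 1-5 can be sketched abstractly. This is a hypothetical illustration (class and method names invented, not the actual HDFS-6114 patch): if the roll-period check runs only after {{scan()}} returns, and {{scan()}} never returns, the log never rotates; checking the period once per scanned block, as the fix does inside {{BlockPoolSliceScanner}}, rolls on schedule even under continuous writes.

```java
import java.util.concurrent.TimeUnit;

public class RollSketch {
    // Assumed roll period for illustration only.
    static final long ROLL_PERIOD_MS = TimeUnit.HOURS.toMillis(10);

    // Simulate `hours` of continuous scanning with one roll check per
    // "block" (one check per simulated minute); returns how many times
    // the verification log would have rotated.
    static int simulate(int hours) {
        long lastRollMillis = 0;
        int rolls = 0;
        for (long t = 0; t <= TimeUnit.HOURS.toMillis(hours);
                t += TimeUnit.MINUTES.toMillis(1)) {
            // The essential change: evaluate the roll condition inside the
            // scan loop, not only after the (possibly endless) scan ends.
            if (t - lastRollMillis >= ROLL_PERIOD_MS) {
                rolls++;              // stand-in for rotating the .curr log
                lastRollMillis = t;
            }
        }
        return rolls;
    }

    public static void main(String[] args) {
        // 30 simulated hours of nonstop scanning still yield periodic rolls,
        // because the check runs per block rather than per scan() call.
        System.out.println(simulate(30)); // prints 3
    }
}
```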
[jira] [Commented] (HDFS-6114) Block Scan log rolling will never happen if blocks written continuously leading to huge size of dncp_block_verification.log.curr
[ https://issues.apache.org/jira/browse/HDFS-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072339#comment-14072339 ] Hudson commented on HDFS-6114: -- FAILURE: Integrated in Hadoop-trunk-Commit #5954 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5954/]) HDFS-6114. Block Scan log rolling will never happen if blocks written continuously leading to huge size of dncp_block_verification.log.curr (vinayakumarb via cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1612943) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java Block Scan log rolling will never happen if blocks written continuously leading to huge size of dncp_block_verification.log.curr Key: HDFS-6114 URL: https://issues.apache.org/jira/browse/HDFS-6114 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.3.0, 2.4.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Critical Fix For: 2.6.0 Attachments: HDFS-6114.patch, HDFS-6114.patch, HDFS-6114.patch, HDFS-6114.patch
1. {{BlockPoolSliceScanner#scan()}} will not return until all the blocks are scanned.
2. If blocks (several MBs in size) are written to the datanode continuously, then one iteration of {{BlockPoolSliceScanner#scan()}} will be continuously scanning the blocks.
3. These blocks will be deleted after some time (long enough for them to get scanned).
4. As block scanning is throttled, verification of all blocks takes a long time.
5. Rolling will never happen, so even though the total number of blocks on the datanode doesn't increase, entries (including stale entries for deleted blocks) in *dncp_block_verification.log.curr* continuously accumulate, leading to a huge file. In one of our environments, it grew to more than 1TB while the total number of blocks was only ~45k. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-4257) The ReplaceDatanodeOnFailure policies could have a forgiving option
[ https://issues.apache.org/jira/browse/HDFS-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072342#comment-14072342 ] Hadoop QA commented on HDFS-4257: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12637004/h4257_20140326.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7446//console This message is automatically generated. The ReplaceDatanodeOnFailure policies could have a forgiving option --- Key: HDFS-4257 URL: https://issues.apache.org/jira/browse/HDFS-4257 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client Affects Versions: 2.0.2-alpha Reporter: Harsh J Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h4257_20140325.patch, h4257_20140325b.patch, h4257_20140326.patch A similar question has previously come up over HDFS-3091 and friends, but the essential problem is: why can't I write to my cluster of 3 nodes when I have just 1 node available at a point in time? The policies cover 4 options, with {{Default}} being the default: {{Disable}} - Disables the whole replacement concept by throwing an error (at the server), or acts as {{Never}} at the client. {{Never}} - Never replaces a DN upon pipeline failures (not too desirable in many cases). {{Default}} - Replaces based on a few conditions, but the minimum never touches 1. We always fail if only one DN remains and no others can be added. {{Always}} - Replaces no matter what; fails if it can't replace. Would it not make sense to have an option similar to Always/Default where, despite _trying_ to replace, we do not fail if it isn't possible to keep even 1 DN in the pipeline? I think that is what the former write behavior was, and what fit with the minimum allowed replication factor. 
Why is it grossly wrong to pass a write from a client for a block with just 1 remaining replica in the pipeline (the minimum of 1 grows with the replication factor demanded by the write), when replication is taken care of immediately afterwards? How often have we seen missing blocks arise out of allowing this combined with a big rack failure or similar? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6455) NFS: Exception should be added in NFS log for invalid separator in allowed.hosts
[ https://issues.apache.org/jira/browse/HDFS-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072347#comment-14072347 ] Brandon Li commented on HDFS-6455: -- Sounds ok to me. +1. NFS: Exception should be added in NFS log for invalid separator in allowed.hosts Key: HDFS-6455 URL: https://issues.apache.org/jira/browse/HDFS-6455 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Yesha Vora Assignee: Abhiraj Butala Attachments: HDFS-6455.002.patch, HDFS-6455.patch The error for an invalid separator in the dfs.nfs.exports.allowed.hosts property should be logged in the NFS log file instead of the nfs.out file. Steps to reproduce: 1. Pass an invalid separator in dfs.nfs.exports.allowed.hosts {noformat} <property><name>dfs.nfs.exports.allowed.hosts</name><value>host1 ro:host2 rw</value></property> {noformat} 2. Restart the NFS server. The NFS server fails to start and prints the exception to the console. {noformat} [hrt_qa@host1 hwqe]$ ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null host1 sudo su - -c \/usr/lib/hadoop/sbin/hadoop-daemon.sh start nfs3\ hdfs starting nfs3, logging to /tmp/log/hadoop/hdfs/hadoop-hdfs-nfs3-horst1.out DEPRECATED: Use of this script to execute hdfs command is deprecated. Instead use the hdfs command for it. Exception in thread "main" java.lang.IllegalArgumentException: Incorrectly formatted line 'host1 ro:host2 rw' at org.apache.hadoop.nfs.NfsExports.getMatch(NfsExports.java:356) at org.apache.hadoop.nfs.NfsExports.init(NfsExports.java:151) at org.apache.hadoop.nfs.NfsExports.getInstance(NfsExports.java:54) at org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.init(RpcProgramNfs3.java:176) at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.init(Nfs3.java:43) at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:59) {noformat} The NFS log does not print any error message; it directly shuts down. 
{noformat} STARTUP_MSG: java = 1.6.0_31 / 2014-05-27 18:47:13,972 INFO nfs3.Nfs3Base (SignalLogger.java:register(91)) - registered UNIX signal handlers for [TERM, HUP, INT] 2014-05-27 18:47:14,169 INFO nfs3.IdUserGroup (IdUserGroup.java:updateMapInternal(159)) - Updated user map size:259 2014-05-27 18:47:14,179 INFO nfs3.IdUserGroup (IdUserGroup.java:updateMapInternal(159)) - Updated group map size:73 2014-05-27 18:47:14,192 INFO nfs3.Nfs3Base (StringUtils.java:run(640)) - SHUTDOWN_MSG: / SHUTDOWN_MSG: Shutting down Nfs3 at {noformat} The nfs.out file has the exception. {noformat} DEPRECATED: Use of this script to execute hdfs command is deprecated. Instead use the hdfs command for it. Exception in thread "main" java.lang.IllegalArgumentException: Incorrectly formatted line 'host1 ro:host2 rw' at org.apache.hadoop.nfs.NfsExports.getMatch(NfsExports.java:356) at org.apache.hadoop.nfs.NfsExports.init(NfsExports.java:151) at org.apache.hadoop.nfs.NfsExports.getInstance(NfsExports.java:54) at org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.init(RpcProgramNfs3.java:176) at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.init(Nfs3.java:43) at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:59) ulimit -a for user hdfs core file size (blocks, -c) 409600 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 188893 max locked memory (kbytes, -l) unlimited max memory size (kbytes, -m) unlimited open files (-n) 32768 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 10240 cpu time (seconds, -t) unlimited max user processes (-u) 65536 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
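The root cause of the "Incorrectly formatted line" error is the entry separator: export entries in dfs.nfs.exports.allowed.hosts are separated by ';', while the value above joined them with ':'. The sketch below is a hypothetical, simplified validator (class name, and the exact grammar of "host plus optional ro/rw privilege", are assumptions; the real parsing lives in {{NfsExports}}) showing why the ';'-separated form passes and the ':'-joined form is rejected:

```java
public class ExportsCheck {
    // One export entry: a host spec, optionally followed by "ro" or "rw".
    static boolean isValidEntry(String entry) {
        String[] parts = entry.trim().split("\\s+");
        if (parts.length > 2 || parts[0].isEmpty()) {
            return false;             // empty entry, or too many tokens
        }
        if (parts[0].contains(":")) {
            return false;             // ':' is not a valid entry separator
        }
        return parts.length == 1
                || parts[1].equals("ro") || parts[1].equals("rw");
    }

    // Entries in the property value are separated by ';'.
    static boolean isValid(String value) {
        for (String entry : value.split(";")) {
            if (!isValidEntry(entry)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid("host1 ro;host2 rw")); // ';' separated: true
        System.out.println(isValid("host1 ro:host2 rw")); // ':' joined: false
    }
}
```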