[jira] [Commented] (HDFS-6908) incorrect snapshot directory diff generated by snapshot deletion
[ https://issues.apache.org/jira/browse/HDFS-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106516#comment-14106516 ] Jing Zhao commented on HDFS-6908:

Yeah, I think that is necessary when deleting a snapshot. But when deleting a dir/file from the current fsdir, I guess it should be ok to place {{cleanSubtreeRecursively}} at the end.

incorrect snapshot directory diff generated by snapshot deletion
Key: HDFS-6908
URL: https://issues.apache.org/jira/browse/HDFS-6908
Project: Hadoop HDFS
Issue Type: Bug
Components: snapshots
Reporter: Juan Yu
Assignee: Juan Yu
Priority: Critical
Attachments: HDFS-6908.001.patch

In the following scenario, deleting a snapshot can generate an incorrect snapshot directory diff and a corrupted fsimage; if you restart the NN after that, you will get a NullPointerException:
1. create a directory and create a file under it
2. take a snapshot
3. create another file under that directory
4. take a second snapshot
5. delete both files and the directory
6. delete the second snapshot
An incorrect directory diff will be generated.
Restart NN will throw NPE {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.addToDeletedList(FSImageFormatPBSnapshot.java:246) at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadDeletedList(FSImageFormatPBSnapshot.java:265) at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadDirectoryDiffList(FSImageFormatPBSnapshot.java:328) at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadSnapshotDiffSection(FSImageFormatPBSnapshot.java:192) at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:254) at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:168) at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:208) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:906) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:892) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:715) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:653) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:276) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:882) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:629) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:498) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:554) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6886) Use single editlog record for creating file + overwrite.
[ https://issues.apache.org/jira/browse/HDFS-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106518#comment-14106518 ] Vinayakumar B commented on HDFS-6886:

I have one comment.
{code}
if (overwrite) {
  // To remove a file, we need to check 'w' permission of parent
  checkParentAccess(pc, src, FsAction.WRITE);
{code}
Here checking the permission explicitly for the delete is not required, as it would already be checked here:
{code}
if (isPermissionEnabled) {
  if (overwrite && myFile != null) {
    checkPathAccess(pc, src, FsAction.WRITE);
  } else {
    checkAncestorAccess(pc, src, FsAction.WRITE);
  }
}
{code}

Use single editlog record for creating file + overwrite.
Key: HDFS-6886
URL: https://issues.apache.org/jira/browse/HDFS-6886
Project: Hadoop HDFS
Issue Type: Improvement
Components: namenode
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Critical
Attachments: HDFS-6886.001.patch, HDFS-6886.002.patch, editsStored

As discussed in HDFS-6871, per [~jingzhao] and [~cmccabe]'s suggestion, we could do a further improvement to use one editlog record for creating file + overwrite in this JIRA. We could record the overwrite flag in the editlog for creating a file.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6908) incorrect snapshot directory diff generated by snapshot deletion
[ https://issues.apache.org/jira/browse/HDFS-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Juan Yu updated HDFS-6908:

Status: Patch Available (was: Open)

incorrect snapshot directory diff generated by snapshot deletion
Key: HDFS-6908
Attachments: HDFS-6908.001.patch, HDFS-6908.002.patch
[jira] [Updated] (HDFS-6908) incorrect snapshot directory diff generated by snapshot deletion
[ https://issues.apache.org/jira/browse/HDFS-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Juan Yu updated HDFS-6908:

Attachment: HDFS-6908.002.patch

incorrect snapshot directory diff generated by snapshot deletion
Key: HDFS-6908
Attachments: HDFS-6908.001.patch, HDFS-6908.002.patch
[jira] [Commented] (HDFS-6886) Use single editlog record for creating file + overwrite.
[ https://issues.apache.org/jira/browse/HDFS-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106540#comment-14106540 ] Yi Liu commented on HDFS-6886:

Hi, thanks [~vinayrpet] for the review (the latest patch has a few changes, as follows). To create or remove a file, we should check the 'w' permission of its (ancestor) parent; creating a file with overwrite implies that the old file will be removed if it exists and then a new one created. Creating with overwrite also requires the 'w' permission on the path itself, so we need to do both checks. The permission check then follows the same logic as the original code and the HDFS permissions guide (like POSIX mode).
{code}
if (isPermissionEnabled) {
  // To remove a file, we need to check 'w' permission of parent
  checkParentAccess(pc, src, FsAction.WRITE);
}
{code}

Use single editlog record for creating file + overwrite.
Key: HDFS-6886
URL: https://issues.apache.org/jira/browse/HDFS-6886
Project: Hadoop HDFS
Issue Type: Improvement
Components: namenode
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Critical
Attachments: HDFS-6886.001.patch, HDFS-6886.002.patch, editsStored

As discussed in HDFS-6871, per [~jingzhao] and [~cmccabe]'s suggestion, we could do a further improvement to use one editlog record for creating file + overwrite in this JIRA. We could record the overwrite flag in the editlog for creating a file.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6867) For DFSOutputStream, do pipeline recovery for a single block in the background
[ https://issues.apache.org/jira/browse/HDFS-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-6867: Attachment: HDFS-6867-design-20140821.pdf New design document to incorporate the progress on HDFS-3689. The general approach is separated from the choice of using variable sized blocks or not; so it can be assessed independent of the final outcome of HDFS-3689. For DFSOutputStream, do pipeline recovery for a single block in the background -- Key: HDFS-6867 URL: https://issues.apache.org/jira/browse/HDFS-6867 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Attachments: HDFS-6867-design-20140820.pdf, HDFS-6867-design-20140821.pdf For DFSOutputStream, we should be able to do pipeline recovery in the background, while the user is continuing to write to the file. This is especially useful for long-lived clients that write to an HDFS file slowly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6886) Use single editlog record for creating file + overwrite.
[ https://issues.apache.org/jira/browse/HDFS-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106566#comment-14106566 ] Vinayakumar B commented on HDFS-6886:

Got it. :) I think changing the permission check in startFileInternal(..) as follows would be applicable for all cases.
{code}
if (isPermissionEnabled) {
  if (overwrite && myFile != null) {
    checkPathAccess(pc, src, FsAction.WRITE);
  }
  checkAncestorAccess(pc, src, FsAction.WRITE);
}
{code}
checkAncestorAccess(..) will be needed in both cases, overwrite and new file creation. And if the file already exists, it will check the WRITE permission of the parent, so one more check for the deletion is not required.

Use single editlog record for creating file + overwrite.
Key: HDFS-6886
URL: https://issues.apache.org/jira/browse/HDFS-6886
Project: Hadoop HDFS
Issue Type: Improvement
Components: namenode
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Critical
Attachments: HDFS-6886.001.patch, HDFS-6886.002.patch, editsStored

As discussed in HDFS-6871, per [~jingzhao] and [~cmccabe]'s suggestion, we could do a further improvement to use one editlog record for creating file + overwrite in this JIRA. We could record the overwrite flag in the editlog for creating a file.

-- This message was sent by Atlassian JIRA (v6.2#6252)
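The check ordering agreed on in this thread can be sketched as a small, self-contained model. This is a hypothetical illustration, not the actual FSNamesystem code: the class, method, and the check-name strings below are invented stand-ins for the real FSPermissionChecker calls.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of which permission checks apply when creating a file,
// with and without the overwrite flag.
public class CreatePermissionChecks {
    static List<String> checksFor(boolean overwrite, boolean fileExists) {
        List<String> checks = new ArrayList<>();
        // Overwriting an existing file implies deleting it first, so
        // WRITE access on the file's path itself must be verified.
        if (overwrite && fileExists) {
            checks.add("checkPathAccess(src, WRITE)");
        }
        // Needed in every case: creating (or replacing) an entry requires
        // WRITE access on the nearest existing ancestor directory, which
        // for an existing file is its parent.
        checks.add("checkAncestorAccess(src, WRITE)");
        return checks;
    }

    public static void main(String[] args) {
        System.out.println(checksFor(true, true));   // both checks fire
        System.out.println(checksFor(false, false)); // ancestor check only
    }
}
```

Under this model a separate checkParentAccess for the implicit delete is redundant, which matches the conclusion above.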
[jira] [Commented] (HDFS-6581) Write to single replica in memory
[ https://issues.apache.org/jira/browse/HDFS-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106575#comment-14106575 ] Arpit Agarwal commented on HDFS-6581:

bq. So I would say, tmpfs is always worse for us. Swapping is just not something we ever want, and memory limits are something we enforce ourselves, so tmpfs's features don't help us.

We're in agreement on the relative merits of ramfs vs tmpfs, except that I am assuming performance-sensitive deployments will run with swap disabled, negating the disadvantages of tmpfs. However this is a decision that can be left to the administrator and does not impact the feature design. [~andrew.wang], responses to your questions below.

{quote} Related to Colin's point about configuring separate pools of memory on the DN, I'd really like to see integration with the cache pools from HDFS-4949. Memory is ideally shareable between HDFS and YARN, and cache pools were designed with that in mind. Simple storage quotas do not fit as well. Quotas are also a very rigid policy and can result in under-utilization. Cache pools are more flexible, and can be extended to support fair share and more complex policies. Avoiding underutilization seems especially important for a limited resource like memory. {quote}

For now, existing diskspace quota checks will apply on block allocation. We cannot skip this check since the blocks are expected to be written to disk in quick order. I agree uniting the RAM disk size and {{dfs.datanode.max.locked.memory}} configurations is desirable. Since tmpfs grows dynamically, perhaps one approach is for the DN to limit \[RAM disk + locked memory\] usage to the config value. The recommendation to administrators could be that they set the RAM disk size to the same value as {{dfs.datanode.max.locked.memory}}. This also allows preferential eviction from either cache or tmpfs as desired to keep the total locked memory usage within the limit.
I'll need to think through this, but I will file a sub-task meanwhile.

bq. Do you have any benchmarks? For the read side, we found checksum overhead to be substantial, essentially the cost of a copy. If we use tmpfs, it can swap, so we're forced to calculate checksums at both write and read time. My guess is also that a normal 1-replication write will be fairly fast because of the OS buffer cache, so it'd be nice to quantify the potential improvement.

tmpfs has become somewhat of a diversion. Let's assume the administrator configures either ramfs or tmpfs with swap disabled (our implementation doesn't care), so we don't have extra checksum generation beyond what we do today. I would _really_ like to remove even the existing checksum calculation from the write path for replicas that are being written to memory, and have the DN compute checksums when it 'lazy persists' to disk. I spent way more time looking into this than I wanted to, and it is hard to do cleanly with the way the write pipeline is set up today - I can explain the details if you are curious. I am wary of significant changes to the write pipeline here, but this is the first optimization I want to address after the initial implementation.

bq. There's a mention of LAZY_PERSIST having a config option to unlink corrupt TMP files. It seems better for this to be per-file rather than NN-wide, since different clients might want different behavior.

That's a good idea, perhaps via an additional flag per-file. Can we leave the system-wide option for the initial implementation and change it going forward?

bq. 5.2.2 lists a con of mmaped files as not having control over page writeback. Is this actually true when using mlock? Also not sure why memory pressure is worse with mmaped files compared to tmpfs. mmap might make eviction+SCR nicer too, since you can just drop the mlocks if you want to evict, and the client has a hope of falling back gracefully.
Memory pressure is worse with mmaped files because we cannot control the timing of when the pages will be freed. We can evict pages from memory via unmap faster than the memory manager can write them to disk. tmpfs has better characteristics, once we run into the configured limit we can just stop allocating more blocks in memory. A related optimization I'd really like to have is to use unbuffered IO when writing to block files on disk so we don't churn buffer cache. {quote} Caveat, I'm not sure what the HSM APIs will look like, or how this will be integrated, so some of these might be out of scope. Will we support changing a file from DISK storage type to TMP storage type? I would say no, since cache directives seem better for read caching when something is already on disk. Will we support writing a file on both TMP and another storage type? Similar to the above, it also doesn't feel that useful. {quote} We are not setting the storage type on a file. HSM API work (HDFS-5682) has been getting
[jira] [Updated] (HDFS-6799) The invalidate method in SimulatedFSDataset.java failed to remove (invalidate) blocks from the file system.
[ https://issues.apache.org/jira/browse/HDFS-6799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoy Antony updated HDFS-6799: --- Assignee: (was: Benoy Antony) The invalidate method in SimulatedFSDataset.java failed to remove (invalidate) blocks from the file system. --- Key: HDFS-6799 URL: https://issues.apache.org/jira/browse/HDFS-6799 Project: Hadoop HDFS Issue Type: Bug Components: datanode, test Affects Versions: 2.4.1 Reporter: Megasthenis Asteris Priority: Minor Attachments: HDFS-6799.patch The invalidate(String bpid, Block[] invalidBlks) method in SimulatedFSDataset.java should remove all invalidBlks from the simulated file system. It currently fails to do that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6799) The invalidate method in SimulatedFSDataset.java failed to remove (invalidate) blocks from the file system.
[ https://issues.apache.org/jira/browse/HDFS-6799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoy Antony updated HDFS-6799:

Assignee: Benoy Antony

The invalidate method in SimulatedFSDataset.java failed to remove (invalidate) blocks from the file system.
Key: HDFS-6799
Attachments: HDFS-6799.patch
[jira] [Commented] (HDFS-6581) Write to single replica in memory
[ https://issues.apache.org/jira/browse/HDFS-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106593#comment-14106593 ] Arpit Agarwal commented on HDFS-6581:

bq. Memory pressure is worse with mmaped files because we cannot control the timing of when the pages will be freed. We can evict pages from memory via unmap faster than the memory manager can write them to disk. tmpfs has better characteristics, once we run into the configured limit we can just stop allocating more blocks in memory. A related optimization I'd really like to have is to use unbuffered IO when writing to block files on disk so we don't churn buffer cache.

Also, our initial proposal on HDFS-5851 was to use mmapped files. However, using a RAM disk allows fairly good control over memory usage on the write path with the least effort.

Write to single replica in memory
Key: HDFS-6581
URL: https://issues.apache.org/jira/browse/HDFS-6581
Project: Hadoop HDFS
Issue Type: Bug
Components: datanode
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
Attachments: HDFSWriteableReplicasInMemory.pdf

Per discussion with the community on HDFS-5851, we will implement writing to a single replica in DN memory via DataTransferProtocol. This avoids some of the issues with short-circuit writes, which we can revisit at a later time.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions
[ https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106607#comment-14106607 ] Hadoop QA commented on HDFS-6826:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12663594/HDFS-6826v7.1.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:red}-1 release audit{color}. The applied patch generated 3 release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.server.namenode.TestEditLogRace
org.apache.hadoop.hdfs.server.namenode.TestValidateConfigurationSettings
org.apache.hadoop.hdfs.server.namenode.TestStartup
org.apache.hadoop.hdfs.server.namenode.TestNameNodeRecovery
org.apache.hadoop.hdfs.TestHDFSServerPorts
org.apache.hadoop.hdfs.server.namenode.TestSaveNamespace
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
org.apache.hadoop.hdfs.server.namenode.TestFsLimits
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7716//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/7716//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7716//console

This message is automatically generated.
Plugin interface to enable delegation of HDFS authorization assertions -- Key: HDFS-6826 URL: https://issues.apache.org/jira/browse/HDFS-6826 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.4.1 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFS-6826-idea.patch, HDFS-6826-idea2.patch, HDFS-6826v3.patch, HDFS-6826v4.patch, HDFS-6826v5.patch, HDFS-6826v6.patch, HDFS-6826v7.1.patch, HDFS-6826v7.patch, HDFSPluggableAuthorizationProposal-v2.pdf, HDFSPluggableAuthorizationProposal.pdf When Hbase data, HiveMetaStore data or Search data is accessed via services (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce permissions on corresponding entities (databases, tables, views, columns, search collections, documents). It is desirable, when the data is accessed directly by users accessing the underlying data files (i.e. from a MapReduce job), that the permission of the data files map to the permissions of the corresponding data entity (i.e. table, column family or search collection). To enable this we need to have the necessary hooks in place in the NameNode to delegate authorization to an external system that can map HDFS files/directories to data entities and resolve their permissions based on the data entities permissions. I’ll be posting a design proposal in the next few days. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6886) Use single editlog record for creating file + overwrite.
[ https://issues.apache.org/jira/browse/HDFS-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106627#comment-14106627 ] Yi Liu commented on HDFS-6886:

Right, agreed that if the file exists, {{checkAncestorAccess}} checks the parent. I will update it in the next patch version, together with updates for the other reviewers' comments.

Use single editlog record for creating file + overwrite.
Key: HDFS-6886
URL: https://issues.apache.org/jira/browse/HDFS-6886
Project: Hadoop HDFS
Issue Type: Improvement
Components: namenode
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Critical
Attachments: HDFS-6886.001.patch, HDFS-6886.002.patch, editsStored

As discussed in HDFS-6871, per [~jingzhao] and [~cmccabe]'s suggestion, we could do a further improvement to use one editlog record for creating file + overwrite in this JIRA. We could record the overwrite flag in the editlog for creating a file.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6914) Resolve huge memory consumption Issue with OIV processing PB-based fsimages
Hao Chen created HDFS-6914:

Summary: Resolve huge memory consumption Issue with OIV processing PB-based fsimages
Key: HDFS-6914
URL: https://issues.apache.org/jira/browse/HDFS-6914
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.4.1
Reporter: Hao Chen
Fix For: 2.5.1

For better management and support of many large Hadoop clusters in production, we internally need to automatically export the fsimage to delimited text files in LSR style and then analyse them with Hive or Pig, or build system metrics for real-time analysis. However, due to the internal layout changes introduced by the protobuf-based fsimage, the OIV processing program consumes an excessive amount of memory. For example, exporting an fsimage 8GB in size takes about 85GB of memory, which is not reasonable and badly impacts the performance of other services on the same server. To resolve the above problem, I am submitting this patch, which reduces the memory consumption of OIV LSR processing by 50%.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6914) Resolve huge memory consumption Issue with OIV processing PB-based fsimages
[ https://issues.apache.org/jira/browse/HDFS-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Chen updated HDFS-6914:

Attachment: HDFS-6914.patch

Resolve huge memory consumption Issue with OIV processing PB-based fsimages
Key: HDFS-6914
Attachments: HDFS-6914.patch
[jira] [Updated] (HDFS-6914) Resolve huge memory consumption Issue with OIV processing PB-based fsimages
[ https://issues.apache.org/jira/browse/HDFS-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Chen updated HDFS-6914:

Status: Patch Available (was: Open)

Due to the internal layout changes introduced by the protobuf-based fsimage, the OIV processing program consumes an excessive amount of memory; for example, exporting an fsimage 8GB in size takes about 85GB of memory, which is not reasonable and badly impacts the performance of other services on the same server. This patch reduces the memory consumption of OIV LSR processing by about half, and the solution is very simple: instead of holding the whole deserialized INode object in memory, the patch holds only the `name` attribute of each INode as a node of the in-heap tree structure.

Resolve huge memory consumption Issue with OIV processing PB-based fsimages
Key: HDFS-6914
Attachments: HDFS-6914.patch
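The name-only tree idea described above can be sketched in a few lines. This is a hypothetical, simplified model, not the actual OIV code: the class name, fields, and the `lsr` method below are invented for illustration, and full INode attributes are assumed to be re-read from the fsimage when each output row is emitted rather than held in heap.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Sketch of a directory-tree node that keeps only the inode's name bytes
// in memory, instead of the fully deserialized INode object.
public class NameOnlyNode {
    final byte[] name;  // only the name is retained on the heap
    final List<NameOnlyNode> children = new ArrayList<>();

    NameOnlyNode(String name) {
        this.name = name.getBytes(StandardCharsets.UTF_8);
    }

    // Emit LSR-style paths via a preorder walk; other attributes would be
    // fetched from the image file at emit time, not stored in this tree.
    void lsr(String parentPath, List<String> out) {
        String path = parentPath + "/" + new String(name, StandardCharsets.UTF_8);
        out.add(path);
        for (NameOnlyNode child : children) {
            child.lsr(path, out);
        }
    }

    public static void main(String[] args) {
        NameOnlyNode dir = new NameOnlyNode("dir");
        dir.children.add(new NameOnlyNode("file1"));
        List<String> out = new ArrayList<>();
        dir.lsr("", out);
        System.out.println(out); // [/dir, /dir/file1]
    }
}
```

Since a name is typically tens of bytes while a full INode carries permissions, times, and block lists, retaining only names is consistent with the roughly 50% memory reduction claimed in the patch.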
[jira] [Updated] (HDFS-5952) Create a tool to run data analysis on the PB format fsimage
[ https://issues.apache.org/jira/browse/HDFS-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Chen updated HDFS-5952:

Attachment: HDFS-5952.patch

Implements an OIV Delimited-processor replacement for Hadoop versions since 2.4.1 that have not been upgraded to 2.5.

Create a tool to run data analysis on the PB format fsimage
Key: HDFS-5952
URL: https://issues.apache.org/jira/browse/HDFS-5952
Project: Hadoop HDFS
Issue Type: Improvement
Components: tools
Affects Versions: 3.0.0
Reporter: Akira AJISAKA
Attachments: HDFS-5952.patch

The Delimited processor in OfflineImageViewer is no longer supported after HDFS-5698 was merged. The motivation for the Delimited processor is to run data analysis on the fsimage; therefore, there may be more value in creating a tool for Hive or Pig that reads the PB-format fsimage directly.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6890) NFS readdirplus doesn't return dotdot attributes
[ https://issues.apache.org/jira/browse/HDFS-6890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14106713#comment-14106713 ] Hudson commented on HDFS-6890: -- FAILURE: Integrated in Hadoop-Yarn-trunk #654 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/654/]) HDFS-6890. NFS readdirplus doesn't return dotdot attributes. Contributed by Brandon Li (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1619500) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt NFS readdirplus doesn't return dotdot attributes Key: HDFS-6890 URL: https://issues.apache.org/jira/browse/HDFS-6890 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Brandon Li Fix For: 2.6.0 Attachments: HDFS-6890.001.patch In RpcProgramNfs3#readdirplus(): {noformat} entries[1] = new READDIRPLUS3Response.EntryPlus3(dotdotFileId, .., dotdotFileId, postOpDirAttr, new FileHandle(dotdotFileId)); {noformat} It should return the directory's parent attribute instead of postOpDirAttr. -- This message was sent by Atlassian JIRA (v6.2#6252)
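The fix amounts to attaching the *parent* directory's attributes to the ".." entry instead of postOpDirAttr (the attributes of the directory being listed). A minimal, self-contained model of that lookup — toy classes, not the actual RpcProgramNfs3 code:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the readdirplus ".." fix: the ".." entry should carry the
// attributes of the parent directory, not those of the directory being
// listed. All names here are illustrative.
public class DotdotEntry {
    static class Attr {
        final long fileId;
        Attr(long id) { fileId = id; }
    }

    private final Map<Long, Long> parentOf = new HashMap<>();
    private final Map<Long, Attr> attrOf = new HashMap<>();

    void addDir(long id, long parentId) {
        parentOf.put(id, parentId);
        attrOf.put(id, new Attr(id));
    }

    // Attributes to attach to the ".." entry when listing directory dirId:
    // look up the parent and return its attributes (the root, modeled as its
    // own parent, returns itself).
    Attr dotdotAttr(long dirId) {
        long parentId = parentOf.getOrDefault(dirId, dirId);
        return attrOf.get(parentId);
    }

    public static void main(String[] args) {
        DotdotEntry fs = new DotdotEntry();
        fs.addDir(1, 1);   // root is its own parent
        fs.addDir(2, 1);   // a subdirectory of root
        System.out.println(fs.dotdotAttr(2).fileId); // 1, the parent's id
    }
}
```

With the bug, an NFS client stat-ing ".." would see the listed directory's own metadata (size, times, ownership) rather than its parent's.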
[jira] [Commented] (HDFS-6890) NFS readdirplus doesn't return dotdot attributes
[ https://issues.apache.org/jira/browse/HDFS-6890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14106856#comment-14106856 ] Hudson commented on HDFS-6890: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1845 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1845/]) HDFS-6890. NFS readdirplus doesn't return dotdot attributes. Contributed by Brandon Li (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1619500) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt NFS readdirplus doesn't return dotdot attributes Key: HDFS-6890 URL: https://issues.apache.org/jira/browse/HDFS-6890 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Brandon Li Fix For: 2.6.0 Attachments: HDFS-6890.001.patch In RpcProgramNfs3#readdirplus(): {noformat} entries[1] = new READDIRPLUS3Response.EntryPlus3(dotdotFileId, .., dotdotFileId, postOpDirAttr, new FileHandle(dotdotFileId)); {noformat} It should return the directory's parent attribute instead of postOpDirAttr. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-2975) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart).
[ https://issues.apache.org/jira/browse/HDFS-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-2975: - Attachment: HDFS-2975.001.patch We need to remove the blocks after logSync. Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart). --- Key: HDFS-2975 URL: https://issues.apache.org/jira/browse/HDFS-2975 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 0.24.0 Reporter: Uma Maheswara Rao G Assignee: Yi Liu Attachments: HDFS-2975.001.patch When we rename a file with the overwrite flag set to true, the destination file's blocks are deleted. After deleting the blocks, whenever the NN releases the FSNamesystem lock, it can hand invalidation work to the corresponding DNs to delete the blocks. In parallel, it syncs the rename-related edits to the edit log file. If the NN crashes at this step, before it syncs the edits, it can get stuck in safe mode on restart. This is because the blocks were already deleted from the DNs as part of the invalidations, but the dst file still exists since the rename edits were not persisted to the log file, and no DN will report those blocks now. This is similar to HDFS-2815 -- This message was sent by Atlassian JIRA (v6.2#6252)
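The ordering Yi Liu's comment calls for — durably sync the rename edit before handing the destination file's blocks to the invalidation queue — can be sketched with a toy journal. If the NN crashes before logSync, no DN has been told to delete the blocks, so a restarted NN can still account for them. Names here are illustrative, not the actual FSNamesystem code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the fix ordering for rename-with-overwrite: collect the old
// destination blocks, log and sync the rename edit, and only then schedule
// the blocks for deletion on DataNodes.
public class RenameOrdering {
    final List<String> journal = new ArrayList<>();       // toy edit log
    final List<String> invalidations = new ArrayList<>(); // work sent to DNs

    void renameWithOverwrite(String src, String dst, List<String> dstBlocks) {
        // 1. Log the rename edit.
        journal.add("OP_RENAME " + src + " -> " + dst);
        // 2. Sync the edit so it survives an NN crash.
        logSync();
        // 3. Only now hand the old destination blocks to the invalidation
        //    queue; a crash before this point loses no metadata and deletes
        //    no replicas.
        invalidations.addAll(dstBlocks);
    }

    void logSync() { journal.add("SYNC"); }

    public static void main(String[] args) {
        RenameOrdering nn = new RenameOrdering();
        nn.renameWithOverwrite("/a", "/b", Arrays.asList("blk_1", "blk_2"));
        System.out.println(nn.journal);
        System.out.println(nn.invalidations);
    }
}
```

The buggy ordering is the reverse: invalidations can reach DNs while the rename edit is still only in memory, producing exactly the missing-blocks safe-mode condition the description reports.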
[jira] [Updated] (HDFS-2975) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart).
[ https://issues.apache.org/jira/browse/HDFS-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-2975: - Status: Patch Available (was: Open) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart). --- Key: HDFS-2975 URL: https://issues.apache.org/jira/browse/HDFS-2975 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 0.24.0 Reporter: Uma Maheswara Rao G Assignee: Yi Liu Attachments: HDFS-2975.001.patch When we rename a file with the overwrite flag set to true, the destination file's blocks are deleted. After deleting the blocks, whenever the NN releases the FSNamesystem lock, it can hand invalidation work to the corresponding DNs to delete the blocks. In parallel, it syncs the rename-related edits to the edit log file. If the NN crashes at this step, before it syncs the edits, it can get stuck in safe mode on restart. This is because the blocks were already deleted from the DNs as part of the invalidations, but the dst file still exists since the rename edits were not persisted to the log file, and no DN will report those blocks now. This is similar to HDFS-2815 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-2975) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart).
[ https://issues.apache.org/jira/browse/HDFS-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-2975: - Target Version/s: 2.6.0 Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart). --- Key: HDFS-2975 URL: https://issues.apache.org/jira/browse/HDFS-2975 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Uma Maheswara Rao G Assignee: Yi Liu Attachments: HDFS-2975.001.patch When we rename a file with the overwrite flag set to true, the destination file's blocks are deleted. After deleting the blocks, whenever the NN releases the FSNamesystem lock, it can hand invalidation work to the corresponding DNs to delete the blocks. In parallel, it syncs the rename-related edits to the edit log file. If the NN crashes at this step, before it syncs the edits, it can get stuck in safe mode on restart. This is because the blocks were already deleted from the DNs as part of the invalidations, but the dst file still exists since the rename edits were not persisted to the log file, and no DN will report those blocks now. This is similar to HDFS-2815 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-2975) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart).
[ https://issues.apache.org/jira/browse/HDFS-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-2975: - Affects Version/s: (was: 0.24.0) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart). --- Key: HDFS-2975 URL: https://issues.apache.org/jira/browse/HDFS-2975 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Uma Maheswara Rao G Assignee: Yi Liu Attachments: HDFS-2975.001.patch When we rename a file with the overwrite flag set to true, the destination file's blocks are deleted. After deleting the blocks, whenever the NN releases the FSNamesystem lock, it can hand invalidation work to the corresponding DNs to delete the blocks. In parallel, it syncs the rename-related edits to the edit log file. If the NN crashes at this step, before it syncs the edits, it can get stuck in safe mode on restart. This is because the blocks were already deleted from the DNs as part of the invalidations, but the dst file still exists since the rename edits were not persisted to the log file, and no DN will report those blocks now. This is similar to HDFS-2815 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6899) Allow changing MiniDFSCluster volumes per DN and capacity per volume
[ https://issues.apache.org/jira/browse/HDFS-6899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14106921#comment-14106921 ] Hadoop QA commented on HDFS-6899: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12663589/HDFS-6899.04.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7717//console This message is automatically generated. Allow changing MiniDFSCluster volumes per DN and capacity per volume Key: HDFS-6899 URL: https://issues.apache.org/jira/browse/HDFS-6899 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, test Affects Versions: 2.5.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6899.01.patch, HDFS-6899.02.patch, HDFS-6899.03.patch, HDFS-6899.04.patch MiniDFSCluster hardcodes the number of directories per volume to two. Propose removing the hard-coded restriction. It would be useful to limit the capacity of individual storage directories for testing purposes. There is already a way to do so for SimulatedFSDataset, we can add one when using real volumes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6887) beeter performance
[ https://issues.apache.org/jira/browse/HDFS-6887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated HDFS-6887: -- Priority: Trivial (was: Critical) beeter performance -- Key: HDFS-6887 URL: https://issues.apache.org/jira/browse/HDFS-6887 Project: Hadoop HDFS Issue Type: Wish Components: qjm Affects Versions: 0.23.10 Reporter: ilovehadoop Priority: Trivial Fix For: 0.23.2 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6887) beeter performance
[ https://issues.apache.org/jira/browse/HDFS-6887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He resolved HDFS-6887. --- Resolution: Invalid Hadoop Flags: (was: Incompatible change,Reviewed) beeter performance -- Key: HDFS-6887 URL: https://issues.apache.org/jira/browse/HDFS-6887 Project: Hadoop HDFS Issue Type: Wish Components: qjm Affects Versions: 0.23.10 Reporter: ilovehadoop Priority: Critical Fix For: 0.23.2 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6890) NFS readdirplus doesn't return dotdot attributes
[ https://issues.apache.org/jira/browse/HDFS-6890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14106928#comment-14106928 ] Hudson commented on HDFS-6890: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1871 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1871/]) HDFS-6890. NFS readdirplus doesn't return dotdot attributes. Contributed by Brandon Li (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1619500) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt NFS readdirplus doesn't return dotdot attributes Key: HDFS-6890 URL: https://issues.apache.org/jira/browse/HDFS-6890 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Brandon Li Fix For: 2.6.0 Attachments: HDFS-6890.001.patch In RpcProgramNfs3#readdirplus(): {noformat} entries[1] = new READDIRPLUS3Response.EntryPlus3(dotdotFileId, .., dotdotFileId, postOpDirAttr, new FileHandle(dotdotFileId)); {noformat} It should return the directory's parent attribute instead of postOpDirAttr. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6899) Allow changing MiniDFSCluster volumes per DN and capacity per volume
[ https://issues.apache.org/jira/browse/HDFS-6899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6899: Attachment: HDFS-6899.05.patch Allow changing MiniDFSCluster volumes per DN and capacity per volume Key: HDFS-6899 URL: https://issues.apache.org/jira/browse/HDFS-6899 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, test Affects Versions: 2.5.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6899.01.patch, HDFS-6899.02.patch, HDFS-6899.03.patch, HDFS-6899.04.patch, HDFS-6899.05.patch MiniDFSCluster hardcodes the number of directories per volume to two. Propose removing the hard-coded restriction. It would be useful to limit the capacity of individual storage directories for testing purposes. There is already a way to do so for SimulatedFSDataset, we can add one when using real volumes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions
[ https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HDFS-6826: - Attachment: HDFS-6826v7.2.patch new patch cleaning up after testcase. Plugin interface to enable delegation of HDFS authorization assertions -- Key: HDFS-6826 URL: https://issues.apache.org/jira/browse/HDFS-6826 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.4.1 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFS-6826-idea.patch, HDFS-6826-idea2.patch, HDFS-6826v3.patch, HDFS-6826v4.patch, HDFS-6826v5.patch, HDFS-6826v6.patch, HDFS-6826v7.1.patch, HDFS-6826v7.2.patch, HDFS-6826v7.patch, HDFSPluggableAuthorizationProposal-v2.pdf, HDFSPluggableAuthorizationProposal.pdf When Hbase data, HiveMetaStore data or Search data is accessed via services (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce permissions on corresponding entities (databases, tables, views, columns, search collections, documents). It is desirable, when the data is accessed directly by users accessing the underlying data files (i.e. from a MapReduce job), that the permission of the data files map to the permissions of the corresponding data entity (i.e. table, column family or search collection). To enable this we need to have the necessary hooks in place in the NameNode to delegate authorization to an external system that can map HDFS files/directories to data entities and resolve their permissions based on the data entities permissions. I’ll be posting a design proposal in the next few days. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6915) hadoop fs -text of zero-length file causes EOFException
Eric Payne created HDFS-6915: Summary: hadoop fs -text of zero-length file causes EOFException Key: HDFS-6915 URL: https://issues.apache.org/jira/browse/HDFS-6915 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.5.0 Reporter: Eric Payne Assignee: Eric Payne List: $ $HADOOP_PREFIX/bin/hadoop fs -ls /user/ericp/foo -rw--- 3 ericp hdfs 0 2014-08-22 16:37 /user/ericp/foo Cat: $ $HADOOP_PREFIX/bin/hadoop fs -cat /user/ericp/foo Text: $ $HADOOP_PREFIX/bin/hadoop fs -text /user/ericp/foo text: java.io.EOFException at java.io.DataInputStream.readShort(DataInputStream.java:315) at org.apache.hadoop.fs.shell.Display$Text.getInputStream(Display.java:130) at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:98) at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:306) at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278) at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260) at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244) at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190) at org.apache.hadoop.fs.shell.Command.run(Command.java:154) at org.apache.hadoop.fs.FsShell.run(FsShell.java:287) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.fs.FsShell.main(FsShell.java:340) -- This message was sent by Atlassian JIRA (v6.2#6252)
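The stack trace points at Display$Text.getInputStream calling DataInputStream.readShort() to sniff the file's magic bytes, which throws EOFException when the file is empty. A hedged sketch of the defensive check — read the leading bytes only if they exist, and treat a zero-length file as plain text — using only the standard library (the helper and its return values are hypothetical, not the actual Display code):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class MagicSniff {
    // Hypothetical helper mirroring the shape of the magic-byte sniffing in
    // Display.Text.getInputStream: the original calls readShort(), which
    // throws EOFException on a zero-length file. Reading byte-by-byte lets
    // us fall back gracefully when fewer than two bytes exist.
    static String detect(byte[] contents) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(contents));
        int b1 = in.read();
        int b2 = in.read();
        if (b1 == -1 || b2 == -1) {
            return "plain"; // zero- or one-byte file: nothing to decode
        }
        int magic = (b1 << 8) | b2;
        if (magic == 0x1f8b) return "gzip";                // gzip magic bytes
        if (b1 == 'S' && b2 == 'E') return "sequencefile"; // rough SequenceFile check
        return "plain";
    }

    public static void main(String[] args) throws IOException {
        System.out.println(detect(new byte[0]));                   // plain, no EOFException
        System.out.println(detect(new byte[]{0x1f, (byte) 0x8b})); // gzip
    }
}
```

This matches the observed asymmetry in the report: `-cat` streams the bytes directly and is fine with an empty file, while `-text` must peek at the header first, and it is that peek that fails.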
[jira] [Commented] (HDFS-6914) Resolve huge memory consumption Issue with OIV processing PB-based fsimages
[ https://issues.apache.org/jira/browse/HDFS-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107095#comment-14107095 ] Kihwal Lee commented on HDFS-6914: -- We do this with the fsimage_oiv/oiv_legacy that was added in HDFS-6293. The memory requirement is constant and small regardless of the size of the fsimage. Is there any reason not to use it? Resolve huge memory consumption Issue with OIV processing PB-based fsimages --- Key: HDFS-6914 URL: https://issues.apache.org/jira/browse/HDFS-6914 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.1 Reporter: Hao Chen Labels: hdfs Fix For: 2.5.1 Attachments: HDFS-6914.patch To better manage and support many large Hadoop clusters in production, we internally need to automatically export the fsimage to delimited text files in LSR style and then analyse them with Hive or Pig, or build system metrics for real-time analysis. However, due to the internal layout changes introduced by the protobuf-based fsimage, the OIV processing program consumes an excessive amount of memory. For example, exporting an 8GB fsimage took about 85GB of memory, which is unreasonable and badly impacted the performance of other services on the same server. To resolve this problem, I have submitted a patch that reduces the memory consumption of OIV LSR processing by 50%. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-3875) Issue handling checksum errors in write pipeline
[ https://issues.apache.org/jira/browse/HDFS-3875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107107#comment-14107107 ] Yongjun Zhang commented on HDFS-3875: - Hi [~kihwal], thanks for your earlier work on this issue. We are seeing a similar problem even though we have this patch. One question about the patch: assume we have a pipeline of three DNs, DN1, DN2, and DN3. DN3 detects a checksum error and reports back to DN2. DN2 decides to truncate its replica to the acknowledged size by calling {{static private void truncateBlock(File blockFile, File metaFile,}} which reads the data from the local replica file, calculates the checksum for the length being truncated to, and writes the checksum back to the meta file. My question is: when writing the checksum back to the meta file, this method does not check it against an already computed checksum to see if it matches. However, DN3 does check its computed checksum against the checksum sent from upstream of the pipeline when reporting the checksum mismatch. If DN2 gets something wrong in the truncateBlock method (say, the existing data is corrupted for some reason), then DN2 has an incorrect checksum and is not aware of it. Later, when we try to recover the pipeline and use DN2's replica as the source, the new DN that receives data from DN2 will always find a checksum error. This is my speculation so far. Do you think this is a possibility? Thanks a lot. 
Issue handling checksum errors in write pipeline Key: HDFS-3875 URL: https://issues.apache.org/jira/browse/HDFS-3875 Project: Hadoop HDFS Issue Type: Bug Components: datanode, hdfs-client Affects Versions: 2.0.2-alpha Reporter: Todd Lipcon Assignee: Kihwal Lee Priority: Critical Fix For: 3.0.0, 2.1.0-beta, 0.23.8 Attachments: hdfs-3875-wip.patch, hdfs-3875.branch-0.23.no.test.patch.txt, hdfs-3875.branch-0.23.patch.txt, hdfs-3875.branch-0.23.patch.txt, hdfs-3875.branch-0.23.with.test.patch.txt, hdfs-3875.branch-2.patch.txt, hdfs-3875.patch.txt, hdfs-3875.patch.txt, hdfs-3875.patch.txt, hdfs-3875.trunk.no.test.patch.txt, hdfs-3875.trunk.no.test.patch.txt, hdfs-3875.trunk.patch.txt, hdfs-3875.trunk.patch.txt, hdfs-3875.trunk.with.test.patch.txt, hdfs-3875.trunk.with.test.patch.txt We saw this issue with one block in a large test cluster. The client is storing the data with replication level 2, and we saw the following: - the second node in the pipeline detects a checksum error on the data it received from the first node. We don't know if the client sent a bad checksum, or if it got corrupted between node 1 and node 2 in the pipeline. - this caused the second node to get kicked out of the pipeline, since it threw an exception. The pipeline started up again with only one replica (the first node in the pipeline) - this replica was later determined to be corrupt by the block scanner, and unrecoverable since it is the only replica -- This message was sent by Atlassian JIRA (v6.2#6252)
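The verification Yongjun speculates is missing could look like this: before truncateBlock recomputes the last chunk's checksum from on-disk data, validate the existing data against its stored checksum, so a silently corrupted local replica is detected rather than re-blessed with a fresh checksum. This is a simplified single-chunk sketch with illustrative names, not the actual DataNode code.

```java
import java.util.zip.CRC32;

// Sketch of adding a verification step to replica truncation: refuse to
// recompute a checksum from local data that no longer matches the checksum
// previously stored for it.
public class TruncateCheck {
    static long crc(byte[] data, int len) {
        CRC32 c = new CRC32();
        c.update(data, 0, len);
        return c.getValue();
    }

    // Returns the new checksum for the truncated chunk, or throws if the
    // existing on-disk data no longer matches its stored checksum.
    static long truncateChunk(byte[] chunk, long storedCrc, int newLen) {
        if (crc(chunk, chunk.length) != storedCrc) {
            throw new IllegalStateException("local replica data is corrupt; "
                + "refusing to recompute its checksum");
        }
        return crc(chunk, newLen);
    }

    public static void main(String[] args) {
        byte[] chunk = "hello world".getBytes();
        long stored = crc(chunk, chunk.length);
        // Truncating to 5 bytes yields the checksum of "hello".
        System.out.println(truncateChunk(chunk, stored, 5) == crc("hello".getBytes(), 5));
    }
}
```

Without such a check, truncation launders corruption: the bad bytes get a matching checksum, and every downstream DN in a later pipeline recovery sees a mismatch it cannot explain — which is exactly the failure mode the comment describes.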
[jira] [Updated] (HDFS-6729) Support maintenance mode for DN
[ https://issues.apache.org/jira/browse/HDFS-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6729: Attachment: HDFS-6729.001.patch Added a test for the scenario where the NameNode wakes up to check heartbeats in the background before the DataNode's maintenance mode expires. Support maintenance mode for DN --- Key: HDFS-6729 URL: https://issues.apache.org/jira/browse/HDFS-6729 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6729.000.patch, HDFS-6729.001.patch Some maintenance work on a DataNode (e.g., upgrading RAM or adding disks) takes only a short amount of time (e.g., 10 minutes). In these cases, users do not want missing blocks to be reported for this DN, because the DN will be back online shortly without data loss. Thus, we need a maintenance mode for a DN so that maintenance work can be carried out without having to decommission the DN or have it marked as dead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6800) Determine how Datanode layout changes should interact with rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107133#comment-14107133 ] James Thomas commented on HDFS-6800: [~cmccabe], thanks for checking this out. Why couldn't we simply do the delete of the trash dir before the doRollback() call? If the trash removal succeeds and the doRollback() fails, it's not a problem, because we don't need the trash at all in the case where a previous directory is created. After this happens, we may decide to 1) finalize the upgrade instead of rolling it back, in which case we'll simply delete the previous directory and move forward or 2) make a second attempt at the rollback, in which case we'll attempt to delete the nonexistent trash directory (not a problem) and then restore the previous directory in doRollback(). So we're fine in either case. [~arpitagarwal], sorry to keep bothering you, but I was hoping to get the work committed by next week (when my summer internship ends), so it'd be great if you could take a look or direct me to someone else who might be qualified. Determine how Datanode layout changes should interact with rolling upgrade -- Key: HDFS-6800 URL: https://issues.apache.org/jira/browse/HDFS-6800 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: James Thomas Attachments: HDFS-6800.2.patch, HDFS-6800.3.patch, HDFS-6800.4.patch, HDFS-6800.patch We need to handle attempts to rolling-upgrade the DataNode to a new storage directory layout. One approach is to disallow such upgrades. If we choose this approach, we should make sure that the system administrator gets a helpful error message and a clean failure when trying to use rolling upgrade to a version that doesn't support it. Based on the compatibility guarantees described in HDFS-5535, this would mean that *any* future DataNode layout changes would require a major version upgrade. 
Another approach would be to support rolling upgrade from an old DN storage layout to a new layout. This approach requires us to change our documentation to explain to users that they should supply the {{\-rollback}} command on the command-line when re-starting the DataNodes during rolling rollback. Currently the documentation just says to restart the DataNode normally. Another issue here is that the DataNode's usage message describes rollback options that no longer exist. The help text says that the DN supports {{\-rollingupgrade rollback}}, but this option was removed by HDFS-6005. -- This message was sent by Atlassian JIRA (v6.2#6252)
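James Thomas's argument above — delete the trash directory before doRollback(), because a failed rollback can simply be retried — amounts to making the sequence idempotent: the retry re-deletes a now-nonexistent trash directory (a harmless no-op) and then restores the previous directory. A toy model of that reasoning (all state and method names are illustrative, not the actual DataNode storage code):

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: deleting trash before doRollback() is safe to repeat. If the
// first attempt fails partway, a retry performs a no-op trash delete and
// then the real rollback.
public class RollbackOrdering {
    boolean trashExists = true;
    boolean previousExists = true;
    boolean rolledBack = false;
    final List<String> log = new ArrayList<>();

    void deleteTrash() {
        if (trashExists) { trashExists = false; log.add("deleted trash"); }
        else { log.add("trash already gone (no-op)"); }
    }

    void doRollback() {
        if (!previousExists) throw new IllegalStateException("nothing to roll back");
        previousExists = false;
        rolledBack = true;
        log.add("restored previous");
    }

    void rollback() { deleteTrash(); doRollback(); }

    public static void main(String[] args) {
        RollbackOrdering dn = new RollbackOrdering();
        dn.deleteTrash();  // first attempt gets this far, then doRollback()
                           // fails for some unrelated reason; retry cleanly:
        dn.rollback();
        System.out.println(dn.log);
    }
}
```

The trash is only needed when no previous directory exists, so losing it on a path where previous is restored (or the upgrade is finalized) costs nothing — which is the core of the comment's argument.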
[jira] [Commented] (HDFS-3689) Add support for variable length block
[ https://issues.apache.org/jira/browse/HDFS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107148#comment-14107148 ] Doug Cutting commented on HDFS-3689: bq. Right now these file formats need to accept torn records or add padding. ... or set the block size for the file to something large and start a new file whenever output approaches that, keeping each file in a single (big) block, guaranteeing that no records cross block boundaries. Add support for variable length block - Key: HDFS-3689 URL: https://issues.apache.org/jira/browse/HDFS-3689 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, hdfs-client, namenode Affects Versions: 3.0.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: HDFS-3689.000.patch, HDFS-3689.001.patch Currently HDFS supports fixed length blocks. Supporting variable length block will allow new use cases and features to be built on top of HDFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-4486) Add log category for long-running DFSClient notices
[ https://issues.apache.org/jira/browse/HDFS-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-4486: Attachment: HDFS-4486-20140822.patch The patch does pass my local unit tests. Not sure why it gives a NoClassDefFoundError on the new class I created. Add log category for long-running DFSClient notices --- Key: HDFS-4486 URL: https://issues.apache.org/jira/browse/HDFS-4486 Project: Hadoop HDFS Issue Type: Improvement Reporter: Todd Lipcon Assignee: Zhe Zhang Priority: Minor Attachments: HDFS-4486-20140820.patch, HDFS-4486-20140821.patch, HDFS-4486-20140822.patch, hdfs-4486-20140821-2.patch There are a number of features in the DFS client which are transparent but can make a fairly big difference for performance -- two in particular are short circuit reads and native checksumming. Because we don't want log spew for clients like hadoop fs -cat, we currently log only at DEBUG level when these features are disabled. This makes it difficult to troubleshoot/verify for long-running perf-sensitive clients like HBase. One simple solution is to add a new log category - eg o.a.h.h.DFSClient.PerformanceAdvisory - which long-running clients could enable at DEBUG level without getting the full debug spew. -- This message was sent by Atlassian JIRA (v6.2#6252)
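The proposed PerformanceAdvisory category can be illustrated with plain java.util.logging: a dedicated logger name that a long-running client enables independently of general client debug logging. The expanded category name below is an assumption based on the "o.a.h.h.DFSClient.PerformanceAdvisory" abbreviation in the issue, and the actual patch targets Hadoop's logging setup rather than java.util.logging, so this is only a sketch of the idea.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch: advisories go to their own logger category, so an operator can
// enable just this category at a fine level without full DFSClient debug spew.
public class PerfAdvisoryDemo {
    static final Logger PERF =
        Logger.getLogger("org.apache.hadoop.hdfs.DFSClient.PerformanceAdvisory");

    public static void main(String[] args) {
        ConsoleHandler h = new ConsoleHandler();
        h.setLevel(Level.FINE);
        PERF.addHandler(h);
        PERF.setLevel(Level.FINE);       // enabled for this category only
        PERF.setUseParentHandlers(false); // don't spill into the root logger
        PERF.fine("short-circuit reads disabled: falling back to remote reads");
    }
}
```

An HBase region server, say, would flip this one category to DEBUG in its logging config and immediately see whether short-circuit reads or native checksumming were silently disabled, while `hadoop fs -cat` users see nothing extra.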
[jira] [Commented] (HDFS-6914) Resolve huge memory consumption Issue with OIV processing PB-based fsimages
[ https://issues.apache.org/jira/browse/HDFS-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107175#comment-14107175 ] Haohui Mai commented on HDFS-6914: -- Can you please keep your changes minimal and make the coding style consistent? Resolve huge memory consumption Issue with OIV processing PB-based fsimages --- Key: HDFS-6914 URL: https://issues.apache.org/jira/browse/HDFS-6914 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.1 Reporter: Hao Chen Labels: hdfs Fix For: 2.5.1 Attachments: HDFS-6914.patch To better manage and support many large Hadoop clusters in production, we internally need to automatically export the fsimage to delimited text files in LSR style and then analyse them with Hive or Pig, or build system metrics for real-time analysis. However, due to the internal layout changes introduced by the protobuf-based fsimage, the OIV processing program consumes an excessive amount of memory. For example, exporting an 8GB fsimage took about 85GB of memory, which is unreasonable and badly impacted the performance of other services on the same server. To resolve this problem, I have submitted a patch that reduces the memory consumption of OIV LSR processing by 50%. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-3689) Add support for variable length block
[ https://issues.apache.org/jira/browse/HDFS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107176#comment-14107176 ] Owen O'Malley commented on HDFS-3689: - Since this is a discussion of what to put into trunk, incompatible changes aren't a blocker. Furthermore, most clients would never see the difference. Variable length blocks would dramatically improve the ability of HDFS to support better file formats like ORC. On the other hand, I've had very bad experiences with sparse files on Unix. It is all too easy for a user to copy a sparse file and not understand that the copy is 10x larger than the original. That would be *bad* and I do not think that HDFS should support it at all. Add support for variable length block - Key: HDFS-3689 URL: https://issues.apache.org/jira/browse/HDFS-3689 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, hdfs-client, namenode Affects Versions: 3.0.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: HDFS-3689.000.patch, HDFS-3689.001.patch Currently HDFS supports fixed length blocks. Supporting variable length block will allow new use cases and features to be built on top of HDFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-3689) Add support for variable length block
[ https://issues.apache.org/jira/browse/HDFS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107194#comment-14107194 ] Owen O'Malley commented on HDFS-3689: - One follow up is that fixing MapReduce to use the actual block boundaries rather than dividing up the file in fixed size splits would not be difficult and would make the generated file splits for ORC and other block compressed files much much better. Furthermore, note that we could remove the need for lzo and zlib index files for text files by having TextOutputFormat cut the block at a line boundary and flush the compression codec. Thus TextInputFormat could divide the file at block boundaries and have them align at both a compression chunk boundary and a line break. That would be *great*. Add support for variable length block - Key: HDFS-3689 URL: https://issues.apache.org/jira/browse/HDFS-3689 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, hdfs-client, namenode Affects Versions: 3.0.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: HDFS-3689.000.patch, HDFS-3689.001.patch Currently HDFS supports fixed length blocks. Supporting variable length block will allow new use cases and features to be built on top of HDFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6899) Allow changing MiniDFSCluster volumes per DN and capacity per volume
[ https://issues.apache.org/jira/browse/HDFS-6899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107215#comment-14107215 ] Hadoop QA commented on HDFS-6899: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12663674/HDFS-6899.05.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.balancer.TestBalancer org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage org.apache.hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure org.apache.hadoop.hdfs.TestSafeMode org.apache.hadoop.hdfs.server.datanode.TestDiskError org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting org.apache.hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks org.apache.hadoop.hdfs.server.blockmanagement.TestNodeCount org.apache.hadoop.hdfs.TestDatanodeBlockScanner org.apache.hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration org.apache.hadoop.hdfs.TestDecommission org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes org.apache.hadoop.hdfs.TestFileCorruption org.apache.hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer org.apache.hadoop.hdfs.server.namenode.TestProcessCorruptBlocks org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup org.apache.hadoop.hdfs.server.datanode.TestTransferRbw org.apache.hadoop.hdfs.TestFileAppend4 org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7718//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7718//console This message is automatically generated. 
Allow changing MiniDFSCluster volumes per DN and capacity per volume Key: HDFS-6899 URL: https://issues.apache.org/jira/browse/HDFS-6899 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, test Affects Versions: 2.5.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6899.01.patch, HDFS-6899.02.patch, HDFS-6899.03.patch, HDFS-6899.04.patch, HDFS-6899.05.patch MiniDFSCluster hardcodes the number of directories per volume to two. Propose removing the hard-coded restriction. It would be useful to limit the capacity of individual storage directories for testing purposes. There is already a way to do so for SimulatedFSDataset, we can add one when using real volumes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions
[ https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107214#comment-14107214 ] Hadoop QA commented on HDFS-6826: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12663679/HDFS-6826v7.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestEditLogRace org.apache.hadoop.hdfs.server.namenode.TestStartup org.apache.hadoop.hdfs.server.namenode.TestNameNodeRecovery org.apache.hadoop.hdfs.TestHDFSServerPorts org.apache.hadoop.security.TestRefreshUserMappings org.apache.hadoop.hdfs.server.namenode.TestSaveNamespace org.apache.hadoop.hdfs.server.namenode.TestFsLimits {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7719//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7719//console This message is automatically generated. 
Plugin interface to enable delegation of HDFS authorization assertions -- Key: HDFS-6826 URL: https://issues.apache.org/jira/browse/HDFS-6826 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.4.1 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFS-6826-idea.patch, HDFS-6826-idea2.patch, HDFS-6826v3.patch, HDFS-6826v4.patch, HDFS-6826v5.patch, HDFS-6826v6.patch, HDFS-6826v7.1.patch, HDFS-6826v7.2.patch, HDFS-6826v7.patch, HDFSPluggableAuthorizationProposal-v2.pdf, HDFSPluggableAuthorizationProposal.pdf When Hbase data, HiveMetaStore data or Search data is accessed via services (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce permissions on corresponding entities (databases, tables, views, columns, search collections, documents). It is desirable, when the data is accessed directly by users accessing the underlying data files (i.e. from a MapReduce job), that the permission of the data files map to the permissions of the corresponding data entity (i.e. table, column family or search collection). To enable this we need to have the necessary hooks in place in the NameNode to delegate authorization to an external system that can map HDFS files/directories to data entities and resolve their permissions based on the data entities permissions. I’ll be posting a design proposal in the next few days. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6829) DFSAdmin refreshSuperUserGroupsConfiguration failed in security cluster
[ https://issues.apache.org/jira/browse/HDFS-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6829: Resolution: Fixed Fix Version/s: 2.6.0 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I verified the test failures don't repro locally. Committed to trunk and branch-2. Thanks for the contribution [~zhaoyunjiong] and thanks [~jnp] for reviewing. DFSAdmin refreshSuperUserGroupsConfiguration failed in security cluster --- Key: HDFS-6829 URL: https://issues.apache.org/jira/browse/HDFS-6829 Project: Hadoop HDFS Issue Type: Bug Components: tools Affects Versions: 2.4.1 Reporter: zhaoyunjiong Assignee: zhaoyunjiong Priority: Minor Fix For: 3.0.0, 2.6.0 Attachments: HDFS-6829.patch When we run command hadoop dfsadmin -refreshSuperUserGroupsConfiguration, it failed and report below message: 14/08/05 21:32:06 WARN security.MultiRealmUserAuthentication: The serverPrincipal = doesn't confirm to the standards refreshSuperUserGroupsConfiguration: null After check the code, I found the bug was triggered by below reasons: 1. We didn't set CommonConfigurationKeys.HADOOP_SECURITY_SERVICE_USER_NAME_KEY, which needed by RefreshUserMappingsProtocol. And in DFSAdmin, if no CommonConfigurationKeys.HADOOP_SECURITY_SERVICE_USER_NAME_KEY set, it will try to use DFSConfigKeys.DFS_NAMENODE_KERBEROS_PRINCIPAL_KEY: conf.set(CommonConfigurationKeys.HADOOP_SECURITY_SERVICE_USER_NAME_KEY, conf.get(DFSConfigKeys.DFS_NAMENODE_KERBEROS_PRINCIPAL_KEY, )); 2. But we set DFSConfigKeys.DFS_NAMENODE_KERBEROS_PRINCIPAL_KEY in hdfs-site.xml 3. DFSAdmin didn't load hdfs-site.xml -- This message was sent by Atlassian JIRA (v6.2#6252)
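The three-step failure described above can be modeled in a few lines: the fallback from the service-user key to the NameNode principal key only helps if the resource that defines the principal (hdfs-site.xml) was actually loaded. A toy sketch, with plain maps standing in for Hadoop's Configuration and the key names copied from the report:

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Toy model of the HDFS-6829 failure mode. DFSAdmin falls back from the
 * service-user key to dfs.namenode.kerberos.principal, but that key is set
 * in hdfs-site.xml, which DFSAdmin never loaded, so the fallback resolves
 * to the empty default. The Configuration model here is deliberately
 * simplified; only the key names mirror the report.
 */
public class FallbackKeyDemo {
    static final String NN_PRINCIPAL_KEY = "dfs.namenode.kerberos.principal";

    /** Mirrors: conf.set(SERVICE_USER_KEY, conf.get(NN_PRINCIPAL_KEY, "")). */
    static String resolveServicePrincipal(Map<String, String> loadedConf) {
        return loadedConf.getOrDefault(NN_PRINCIPAL_KEY, "");
    }

    public static void main(String[] args) {
        // The buggy path: hdfs-site.xml was never added as a resource.
        Map<String, String> withoutHdfsSite = new HashMap<>();
        System.out.println("without hdfs-site: '" + resolveServicePrincipal(withoutHdfsSite) + "'");

        // The fix: load hdfs-site.xml so the principal key is visible.
        Map<String, String> withHdfsSite = new HashMap<>();
        withHdfsSite.put(NN_PRINCIPAL_KEY, "nn/_HOST@EXAMPLE.COM");
        System.out.println("with hdfs-site: '" + resolveServicePrincipal(withHdfsSite) + "'");
    }
}
```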
[jira] [Commented] (HDFS-6829) DFSAdmin refreshSuperUserGroupsConfiguration failed in security cluster
[ https://issues.apache.org/jira/browse/HDFS-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107243#comment-14107243 ] Hudson commented on HDFS-6829: -- FAILURE: Integrated in Hadoop-trunk-Commit #6099 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6099/]) HDFS-6829. DFSAdmin refreshSuperUserGroupsConfiguration failed in security cluster. (Contributed by zhaoyunjiong) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1619882) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java DFSAdmin refreshSuperUserGroupsConfiguration failed in security cluster --- Key: HDFS-6829 URL: https://issues.apache.org/jira/browse/HDFS-6829 Project: Hadoop HDFS Issue Type: Bug Components: tools Affects Versions: 2.4.1 Reporter: zhaoyunjiong Assignee: zhaoyunjiong Priority: Minor Fix For: 3.0.0, 2.6.0 Attachments: HDFS-6829.patch When we run command hadoop dfsadmin -refreshSuperUserGroupsConfiguration, it failed and report below message: 14/08/05 21:32:06 WARN security.MultiRealmUserAuthentication: The serverPrincipal = doesn't confirm to the standards refreshSuperUserGroupsConfiguration: null After check the code, I found the bug was triggered by below reasons: 1. We didn't set CommonConfigurationKeys.HADOOP_SECURITY_SERVICE_USER_NAME_KEY, which needed by RefreshUserMappingsProtocol. And in DFSAdmin, if no CommonConfigurationKeys.HADOOP_SECURITY_SERVICE_USER_NAME_KEY set, it will try to use DFSConfigKeys.DFS_NAMENODE_KERBEROS_PRINCIPAL_KEY: conf.set(CommonConfigurationKeys.HADOOP_SECURITY_SERVICE_USER_NAME_KEY, conf.get(DFSConfigKeys.DFS_NAMENODE_KERBEROS_PRINCIPAL_KEY, )); 2. But we set DFSConfigKeys.DFS_NAMENODE_KERBEROS_PRINCIPAL_KEY in hdfs-site.xml 3. DFSAdmin didn't load hdfs-site.xml -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6899) Allow changing MiniDFSCluster volumes per DN and capacity per volume
[ https://issues.apache.org/jira/browse/HDFS-6899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6899: Attachment: HDFS-6899.06.patch Allow changing MiniDFSCluster volumes per DN and capacity per volume Key: HDFS-6899 URL: https://issues.apache.org/jira/browse/HDFS-6899 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, test Affects Versions: 2.5.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6899.01.patch, HDFS-6899.02.patch, HDFS-6899.03.patch, HDFS-6899.04.patch, HDFS-6899.05.patch, HDFS-6899.06.patch MiniDFSCluster hardcodes the number of directories per volume to two. Propose removing the hard-coded restriction. It would be useful to limit the capacity of individual storage directories for testing purposes. There is already a way to do so for SimulatedFSDataset, we can add one when using real volumes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6888) Remove audit logging of getFIleInfo()
[ https://issues.apache.org/jira/browse/HDFS-6888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107249#comment-14107249 ] Chen He commented on HDFS-6888: --- Hi [~jira.shegalov], you lost me there. We need to let the getfileinfo cmd only log when the auditlog is in debug level, right? Remove audit logging of getFIleInfo() - Key: HDFS-6888 URL: https://issues.apache.org/jira/browse/HDFS-6888 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Chen He Labels: log Attachments: HDFS-6888-2.patch, HDFS-6888.patch The audit logging of getFileInfo() was added in HDFS-3733. Since this is a one of the most called method, users have noticed that audit log is now filled with this. Since we now have HTTP request logging, this seems unnecessary. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6865) Byte array native checksumming on client side (HDFS changes)
[ https://issues.apache.org/jira/browse/HDFS-6865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Thomas updated HDFS-6865: --- Attachment: HDFS-6865.5.patch Fixed the TestBlockUnderConstruction timeout, which I missed last time. The rest appear to be flakes. Byte array native checksumming on client side (HDFS changes) Key: HDFS-6865 URL: https://issues.apache.org/jira/browse/HDFS-6865 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, performance Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6865.2.patch, HDFS-6865.3.patch, HDFS-6865.4.patch, HDFS-6865.5.patch, HDFS-6865.patch Refactor FSOutputSummer to buffer data and use the native checksum calculation functionality introduced in HADOOP-10975. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6800) Determine how Datanode layout changes should interact with rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107263#comment-14107263 ] Arpit Agarwal commented on HDFS-6800: - Hi James, I apologize for the delay. I wanted to understand why we require the {{-rollback}} flag to rollback a rolling upgrade now. Also I'd like to test it out on a DN with a few million blocks for performance. Determine how Datanode layout changes should interact with rolling upgrade -- Key: HDFS-6800 URL: https://issues.apache.org/jira/browse/HDFS-6800 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: James Thomas Attachments: HDFS-6800.2.patch, HDFS-6800.3.patch, HDFS-6800.4.patch, HDFS-6800.patch We need to handle attempts to rolling-upgrade the DataNode to a new storage directory layout. One approach is to disallow such upgrades. If we choose this approach, we should make sure that the system administrator gets a helpful error message and a clean failure when trying to use rolling upgrade to a version that doesn't support it. Based on the compatibility guarantees described in HDFS-5535, this would mean that *any* future DataNode layout changes would require a major version upgrade. Another approach would be to support rolling upgrade from an old DN storage layout to a new layout. This approach requires us to change our documentation to explain to users that they should supply the {{\-rollback}} command on the command-line when re-starting the DataNodes during rolling rollback. Currently the documentation just says to restart the DataNode normally. Another issue here is that the DataNode's usage message describes rollback options that no longer exist. The help text says that the DN supports {{\-rollingupgrade rollback}}, but this option was removed by HDFS-6005. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6800) Determine how Datanode layout changes should interact with rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107286#comment-14107286 ] James Thomas commented on HDFS-6800: [~arpitagarwal], thanks for the response. The reason we use the flag in all cases is that we now support rolling upgrade rollbacks to an old DataNode layout version (a process that in the non-rolling upgrade case requires this flag), so I thought it would be easiest to just have administrators use the flag in all rollback cases (as you can see in the patch, there is no performance overhead if the rollback is not to a previous DataNode layout version). It's really just a convention, but I think it is simple and makes sense. As for the testing, as [~cmccabe] mentioned in his first comment above, I have already run tests with rolling upgrade to a new DN layout version with 100k blocks, and the hard link time was very fast. I don't think restoring the previous directory during a rollback is very expensive -- the only possible problem I see is the deletion of the current dir. Is that what you were concerned about? If so, what is the budget for rollback time? I can't imagine the deletion would take more than a couple tens of seconds. Determine how Datanode layout changes should interact with rolling upgrade -- Key: HDFS-6800 URL: https://issues.apache.org/jira/browse/HDFS-6800 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: James Thomas Attachments: HDFS-6800.2.patch, HDFS-6800.3.patch, HDFS-6800.4.patch, HDFS-6800.patch We need to handle attempts to rolling-upgrade the DataNode to a new storage directory layout. One approach is to disallow such upgrades. If we choose this approach, we should make sure that the system administrator gets a helpful error message and a clean failure when trying to use rolling upgrade to a version that doesn't support it. 
Based on the compatibility guarantees described in HDFS-5535, this would mean that *any* future DataNode layout changes would require a major version upgrade. Another approach would be to support rolling upgrade from an old DN storage layout to a new layout. This approach requires us to change our documentation to explain to users that they should supply the {{\-rollback}} command on the command-line when re-starting the DataNodes during rolling rollback. Currently the documentation just says to restart the DataNode normally. Another issue here is that the DataNode's usage message describes rollback options that no longer exist. The help text says that the DN supports {{\-rollingupgrade rollback}}, but this option was removed by HDFS-6005. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-5204) Stub implementation of getrlimit for Windows.
[ https://issues.apache.org/jira/browse/HDFS-5204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth resolved HDFS-5204. - Resolution: Won't Fix This is no longer a problem now that HDFS-5202 has been committed. I'm resolving this. Stub implementation of getrlimit for Windows. - Key: HDFS-5204 URL: https://issues.apache.org/jira/browse/HDFS-5204 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-4949 Reporter: Chris Nauroth Assignee: Chris Nauroth The HDFS-4949 feature branch adds a JNI wrapper over the {{getrlimit}} function. This function does not exist on Windows. We need to provide a stub implementation so that the codebase can compile on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HDFS-6799) The invalidate method in SimulatedFSDataset.java failed to remove (invalidate) blocks from the file system.
[ https://issues.apache.org/jira/browse/HDFS-6799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Megasthenis Asteris reassigned HDFS-6799: - Assignee: Megasthenis Asteris The invalidate method in SimulatedFSDataset.java failed to remove (invalidate) blocks from the file system. --- Key: HDFS-6799 URL: https://issues.apache.org/jira/browse/HDFS-6799 Project: Hadoop HDFS Issue Type: Bug Components: datanode, test Affects Versions: 2.4.1 Reporter: Megasthenis Asteris Assignee: Megasthenis Asteris Priority: Minor Attachments: HDFS-6799.patch The invalidate(String bpid, Block[] invalidBlks) method in SimulatedFSDataset.java should remove all invalidBlks from the simulated file system. It currently fails to do that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6914) Resolve huge memory consumption Issue with OIV processing PB-based fsimages
[ https://issues.apache.org/jira/browse/HDFS-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107381#comment-14107381 ] Hadoop QA commented on HDFS-6914: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12663633/HDFS-6914.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7724//console This message is automatically generated. Resolve huge memory consumption Issue with OIV processing PB-based fsimages --- Key: HDFS-6914 URL: https://issues.apache.org/jira/browse/HDFS-6914 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.1 Reporter: Hao Chen Labels: hdfs Fix For: 2.5.1 Attachments: HDFS-6914.patch For better managing and supporting a lot of large hadoop clusters in production, we internally need to automatically export fsimage to delimited text files in LSR style and then analyse with hive or pig or build system metrics for real-time analyzing. However due to the internal layout changes introduced by the protobuf-based fsimage, OIV processing program consumes excessive amount of memory. For example, in order to export the fsimage in size of 8GB, it should have taken about 85GB memory which is really not reasonable and impacted performance of other services badly in the same server. To resolve above problem, I submit this patch which will reduce memory consumption of OIV LSR processing by 50%. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6888) Remove audit logging of getFIleInfo()
[ https://issues.apache.org/jira/browse/HDFS-6888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107394#comment-14107394 ] Gera Shegalov commented on HDFS-6888: - Hi [~airbots], sorry for being unclear. [~kihwal] suggests: bq. We could have logAuditEvent() check cmd against getfileinfo or a *collection of such commands* and log at debug level. Picking this idea up, can you introduce some conf like dfs.audit.loglevel.cmdlist=getfileinfo,anotherLogFloodingCmd,... In {{o.a.h.hdfs.server.namenode.FSNamesystem.DefaultAuditLogger#initialize}} you could read the list using {{auditDebugCmds = conf.getTrimmedStrings("dfs.audit.debug.cmdlist")}} and use it for filtering. Currently v2 hardcodes getfileinfo.
{code}
--- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
+++ hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
@@ -359,6 +359,9 @@
   private void logAuditEvent(boolean succeeded, UserGroupInformation ugi,
       InetAddress addr, String cmd, String src, String dst,
       HdfsFileStatus stat) {
     FileStatus status = null;
+    if (cmd.equals("getfileinfo") && !auditLog.isDebugEnabled()) {
+      return;
+    }
     if (stat != null) {
       Path symlink = stat.isSymlink() ? new Path(stat.getSymlink()) : null;
       Path path = dst != null ? new Path(dst) : new Path(src);
{code}
Also {{auditLog.isDebugEnabled()}} is the cheaper check, and should be done before the {{equals}}. Remove audit logging of getFIleInfo() - Key: HDFS-6888 URL: https://issues.apache.org/jira/browse/HDFS-6888 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Chen He Labels: log Attachments: HDFS-6888-2.patch, HDFS-6888.patch The audit logging of getFileInfo() was added in HDFS-3733. Since this is one of the most frequently called methods, users have noticed that the audit log is now filled with it. Since we now have HTTP request logging, this seems unnecessary.
-- This message was sent by Atlassian JIRA (v6.2#6252)
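Gera's configurable-list suggestion boils down to a small filter: parse the comma-separated command list once at initialization, then short-circuit logAuditEvent() when the command is listed and the audit logger is not at DEBUG. A standalone sketch of that filter; the property name and class are taken from the discussion above, not from a shipped patch:

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Sketch of the filter proposed in the comment above: commands named in a
 * hypothetical dfs.audit.debug.cmdlist property are audit-logged only when
 * the audit logger is at DEBUG level. Plain strings and a boolean stand in
 * for Hadoop's Configuration and the commons-logging Log.
 */
public class AuditCmdFilter {
    private final Set<String> debugCmds = new HashSet<>();
    private final boolean auditDebugEnabled;

    AuditCmdFilter(String cmdList, boolean auditDebugEnabled) {
        // Mirrors conf.getTrimmedStrings(): split on commas, trim, drop empties.
        for (String cmd : cmdList.split(",")) {
            String trimmed = cmd.trim();
            if (!trimmed.isEmpty()) debugCmds.add(trimmed);
        }
        this.auditDebugEnabled = auditDebugEnabled;
    }

    /** The early return in logAuditEvent(): level check first, it is cheaper. */
    boolean shouldLog(String cmd) {
        return auditDebugEnabled || !debugCmds.contains(cmd);
    }

    public static void main(String[] args) {
        AuditCmdFilter atInfo = new AuditCmdFilter("getfileinfo, contentSummary", false);
        System.out.println("getfileinfo at INFO: " + atInfo.shouldLog("getfileinfo"));
        System.out.println("delete at INFO: " + atInfo.shouldLog("delete"));
        AuditCmdFilter atDebug = new AuditCmdFilter("getfileinfo", true);
        System.out.println("getfileinfo at DEBUG: " + atDebug.shouldLog("getfileinfo"));
    }
}
```

Checking the level before the set lookup keeps the common DEBUG-off path at a single boolean test, which matches the "cheaper check first" point in the comment.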
[jira] [Commented] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107405#comment-14107405 ] Yongjun Zhang commented on HDFS-6776: - HI [~wheat9], may I know if my replies addressed your comments? thanks. distcp from insecure cluster (source) to secure cluster (destination) doesn't work -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: 
Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at
[jira] [Commented] (HDFS-6865) Byte array native checksumming on client side (HDFS changes)
[ https://issues.apache.org/jira/browse/HDFS-6865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107404#comment-14107404 ] James Thomas commented on HDFS-6865: I ran the tests that [~tlipcon] suggested and have some results. I created buffers of various sizes and repeatedly wrote them using FSDataOutputStream.write(). For each buffer size, I also wrapped FSDataOutputStream with a BufferedOutputStream. I made sure the packet size and block sizes were large enough that no actual writes to DataNodes occurred, so the times shown here primarily cover data buffering and checksumming and packet construction on the client side. The following times are all in milliseconds. Each test involved writing 8 MB of data to the stream. I only did one run for each of these data points, so there are a few unreproducible outliers (e.g. the 130ms in the 2^8 row), but the results are generally good enough that I didn't think averaging over a large number of runs was necessary. Some interpretation of the results: Naturally the time goes down with bigger buffers since we have fewer instructions (less method call overhead) per byte. At smaller buffer sizes the time for the checksum becomes more and more negligible compared to the other overheads per byte (after all, the checksum is a handful of instructions per byte even for the Java code), so we don't see much of a difference between the pre- and post-change code. The main case I was worried about was for input buffers (in the non-BufferedOuputStream case) larger than the original FSOutputSummer buffer (512 bytes) and smaller than the current FSOutputSummer buffer (5120 bytes), because these incur a buffer copy in the new FSOutputSummer (since there is now space for them in the FSOutputSummer's buffer) but were sent directly to the DFSOutputStream (to be copied into a packet) in the old FSOutputStream. 
But the data shows that this case (rows 2^9 and 2^10) is not problematic -- clearly the extra buffer copies are offset by the time saved by faster checksumming.
||log(Buffer Size)||pre-change||pre-change w/ BufferedStream||post-change||post-change w/ BufferedStream||
|0|463|258|449|261|
|1|249|125|213|118|
|2|133|61|112|62|
|3|42|16|56|22|
|4|32|21|22|8|
|5|15|14|18|8|
|6|19|9|7|6|
|7|18|28|11|5|
|8|14|15|5|130|
|9|12|12|4|4|
|10|15|8|5|4|
Byte array native checksumming on client side (HDFS changes) Key: HDFS-6865 URL: https://issues.apache.org/jira/browse/HDFS-6865 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, performance Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6865.2.patch, HDFS-6865.3.patch, HDFS-6865.4.patch, HDFS-6865.5.patch, HDFS-6865.patch Refactor FSOutputSummer to buffer data and use the native checksum calculation functionality introduced in HADOOP-10975. -- This message was sent by Atlassian JIRA (v6.2#6252)
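The shape of the measurement James describes can be reproduced outside HDFS with a toy harness: write 8 MB through a stream in power-of-two chunk sizes, with and without a BufferedOutputStream, doing stand-in checksum work per chunk. This is a sketch only; it is not the actual FSOutputSummer path, and its absolute numbers will differ from the table above:

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.CRC32;

/**
 * Toy harness in the spirit of the benchmark above: push 8 MB through an
 * OutputStream in chunks of a given size, updating a CRC32 per chunk as a
 * stand-in for FSOutputSummer's per-chunk checksumming. Not HDFS code.
 */
public class WriteBufferBench {
    /** Writes `total` bytes in `bufSize` chunks; returns elapsed millis. */
    static long timeWrite(OutputStream out, int bufSize, int total) throws IOException {
        byte[] buf = new byte[bufSize];
        CRC32 crc = new CRC32();
        long start = System.nanoTime();
        for (int written = 0; written < total; written += bufSize) {
            crc.update(buf, 0, buf.length);  // stand-in checksum work per chunk
            out.write(buf, 0, buf.length);
        }
        out.flush();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws IOException {
        final int total = 8 * 1024 * 1024;  // 8 MB, matching the test above
        for (int shift = 0; shift <= 10; shift++) {
            int bufSize = 1 << shift;
            long plain = timeWrite(new ByteArrayOutputStream(total), bufSize, total);
            long buffered = timeWrite(
                new BufferedOutputStream(new ByteArrayOutputStream(total), 64 * 1024),
                bufSize, total);
            System.out.printf("2^%d: plain=%d ms, buffered=%d ms%n", shift, plain, buffered);
        }
    }
}
```

As in the table, the interesting comparison is how quickly the per-write overhead (method calls, copies) dwarfs the checksum cost as the chunk size shrinks.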
[jira] [Commented] (HDFS-4486) Add log category for long-running DFSClient notices
[ https://issues.apache.org/jira/browse/HDFS-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107409#comment-14107409 ] Colin Patrick McCabe commented on HDFS-4486: It's pretty strange, all right. I'm also seeing a lot of too many open files exceptions, which makes me wonder if the JVM is failing to load the jar file because it's out of file descriptors? org/apache/hadoop/util/PerformanceAdvisory.java is certainly in hadoop-common, so there should be no problem depending on it in hadoop-hdfs. The hadoop-common unit tests also pass prior to the hadoop-hdfs tests throwing this exception, so we know that the class got compiled. Just a random guess, but maybe try adding a reference to PerformanceAdvisory inside MiniDFSCluster.java, to force Maven to realize that there is a dependency? Or try adding a constructor or other method to PerformanceAdvisory.. maybe it's getting somehow optimized out since it only has a static member now? Add log category for long-running DFSClient notices --- Key: HDFS-4486 URL: https://issues.apache.org/jira/browse/HDFS-4486 Project: Hadoop HDFS Issue Type: Improvement Reporter: Todd Lipcon Assignee: Zhe Zhang Priority: Minor Attachments: HDFS-4486-20140820.patch, HDFS-4486-20140821.patch, HDFS-4486-20140822.patch, hdfs-4486-20140821-2.patch There are a number of features in the DFS client which are transparent but can make a fairly big difference for performance -- two in particular are short circuit reads and native checksumming. Because we don't want log spew for clients like hadoop fs -cat we currently log only at DEBUG level when these features are disabled. This makes it difficult to troubleshoot/verify for long-running perf-sensitive clients like HBase. One simple solution is to add a new log category - eg o.a.h.h.DFSClient.PerformanceAdvisory - which long-running clients could enable at DEBUG level without getting the full debug spew. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6799) The invalidate method in SimulatedFSDataset.java failed to remove (invalidate) blocks from the file system.
[ https://issues.apache.org/jira/browse/HDFS-6799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107419#comment-14107419 ] Benoy Antony commented on HDFS-6799: +1. Looks good to me. The invalidate method in SimulatedFSDataset.java failed to remove (invalidate) blocks from the file system. --- Key: HDFS-6799 URL: https://issues.apache.org/jira/browse/HDFS-6799 Project: Hadoop HDFS Issue Type: Bug Components: datanode, test Affects Versions: 2.4.1 Reporter: Megasthenis Asteris Assignee: Megasthenis Asteris Priority: Minor Attachments: HDFS-6799.patch The invalidate(String bpid, Block[] invalidBlks) method in SimulatedFSDataset.java should remove all invalidBlks from the simulated file system. It currently fails to do that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6773) MiniDFSCluster can run dramatically faster
[ https://issues.apache.org/jira/browse/HDFS-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Chu updated HDFS-6773: -- Status: Patch Available (was: Open) MiniDFSCluster can run dramatically faster -- Key: HDFS-6773 URL: https://issues.apache.org/jira/browse/HDFS-6773 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Stephen Chu Attachments: HDFS-6773.1.patch The mini cluster is unnecessarily running with durable edit logs. The following change cut runtime of a single test from ~30s to ~10s. {code}EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);{code} The mini cluster should default to this behavior after identifying the few edit log tests that probably depend on durable logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6773) MiniDFSCluster can run dramatically faster
[ https://issues.apache.org/jira/browse/HDFS-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Chu updated HDFS-6773: -- Attachment: HDFS-6773.1.patch Attaching a patch. * Add a {{skipFsyncForTesting}} builder option to MiniDFSCluster, which defaults to true. * Remove enabling fsync in {{TestFsDatasetCache}} and {{TestCacheDirectives}} because it's not needed. I left the instances of {{EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);}} in this first patch. Let me know if it's better just to remove them all, or to use the new Builder option in some of them to let new test readers be aware of this option. A quick scan through the tests searching for fsync suggests that no current tests require it. MiniDFSCluster can run dramatically faster -- Key: HDFS-6773 URL: https://issues.apache.org/jira/browse/HDFS-6773 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Stephen Chu Attachments: HDFS-6773.1.patch The mini cluster is unnecessarily running with durable edit logs. The following change cut the runtime of a single test from ~30s to ~10s. {code}EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);{code} The mini cluster should default to this behavior after identifying the few edit log tests that probably depend on durable logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
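The builder option described in this patch can be sketched as follows. The class and method names are illustrative stand-ins, not the actual MiniDFSCluster builder API; the point is only the default-on/opt-out shape of the flag:

```java
public class MiniClusterOptions {
    // Hypothetical stand-in for the MiniDFSCluster builder: skipping the
    // per-edit fsync is the default, and the few edit-log durability tests
    // opt back in explicitly.
    private boolean skipFsyncForTesting = true;

    public MiniClusterOptions skipFsyncForTesting(boolean skip) {
        this.skipFsyncForTesting = skip;
        return this;
    }

    public boolean shouldSkipFsync() {
        return skipFsyncForTesting;
    }

    public static void main(String[] args) {
        // Default: fsync skipped, so most tests avoid waiting on disk flushes.
        System.out.println(new MiniClusterOptions().shouldSkipFsync());   // true
        // A durability-sensitive test opts back in:
        System.out.println(new MiniClusterOptions()
                .skipFsyncForTesting(false).shouldSkipFsync());           // false
    }
}
```

Making the fast path the default and leaving a named opt-out also documents, at the call site, which tests actually depend on durable edit logs.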
[jira] [Commented] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107448#comment-14107448 ] Haohui Mai commented on HDFS-6776: -- bq. Right now, after security is examined, then there is the permission control, whoever issue the distcp command need to have right permission. Do you think we should disable copying stuff from secure cluster to insecure cluster, or to have another level of control? Thanks. I'm not sure what you refer to. Copying between secure and insecure clusters is a valid use case. What I have been saying is that {{WebHdfsFileSystem}} is not the right place to change, since the legitimate use case and a security breach would become indistinguishable. For this use case you might need to look at changing distcp, or at asking the insecure cluster to issue a dummy token. distcp from insecure cluster (source) to secure cluster (destination) doesn't work -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 
INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at
[jira] [Updated] (HDFS-4852) libhdfs documentation is out of date
[ https://issues.apache.org/jira/browse/HDFS-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-4852: Attachment: hadoop-site.tar.bz2 libhdfs documentation is out of date Key: HDFS-4852 URL: https://issues.apache.org/jira/browse/HDFS-4852 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.0.4-alpha Reporter: Andrew Wang Assignee: Chris Nauroth Priority: Minor Labels: docs, libhdfs, noob Attachments: hadoop-site.tar.bz2 The current libhdfs documentation is available here: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/LibHdfs.html The paths and env variables here are out of date. HADOOP_PREFIX should probably be HADOOP_HDFS_HOME, and things are no longer located in {{/src/c++/libhdfs}}. There's also some missing text: The libhdfs APIs are a subset of: .. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-4852) libhdfs documentation is out of date
[ https://issues.apache.org/jira/browse/HDFS-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-4852: Attachment: HDFS-4852.1.patch The attached patch makes the following changes: # Correct out-of-date information. # Mention that libhdfs is now compatible with Windows. # Recommend usage of the new {{hadoop classpath --glob}} or {{hadoop classpath --jar path}} commands to help get a correct classpath. I've also attached a built tarball of the site with these changes. libhdfs documentation is out of date Key: HDFS-4852 URL: https://issues.apache.org/jira/browse/HDFS-4852 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.0.4-alpha Reporter: Andrew Wang Assignee: Chris Nauroth Priority: Minor Labels: docs, libhdfs, noob Attachments: HDFS-4852.1.patch, hadoop-site.tar.bz2 The current libhdfs documentation is available here: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/LibHdfs.html The paths and env variables here are out of date. HADOOP_PREFIX should probably be HADOOP_HDFS_HOME, and things are no longer located in {{/src/c++/libhdfs}}. There's also some missing text: The libhdfs APIs are a subset of: .. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-4852) libhdfs documentation is out of date
[ https://issues.apache.org/jira/browse/HDFS-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-4852: Status: Patch Available (was: Open) libhdfs documentation is out of date Key: HDFS-4852 URL: https://issues.apache.org/jira/browse/HDFS-4852 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.4-alpha, 3.0.0 Reporter: Andrew Wang Assignee: Chris Nauroth Priority: Minor Labels: docs, libhdfs, noob Attachments: HDFS-4852.1.patch, hadoop-site.tar.bz2 The current libhdfs documentation is available here: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/LibHdfs.html The paths and env variables here are out of date. HADOOP_PREFIX should probably be HADOOP_HDFS_HOME, and things are no longer located in {{/src/c++/libhdfs}}. There's also some missing text: The libhdfs APIs are a subset of: .. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions
[ https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HDFS-6826: - Status: Open (was: Patch Available) canceling patch, there is some special handling for editlogs that has to be done, working on it. Plugin interface to enable delegation of HDFS authorization assertions -- Key: HDFS-6826 URL: https://issues.apache.org/jira/browse/HDFS-6826 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.4.1 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFS-6826-idea.patch, HDFS-6826-idea2.patch, HDFS-6826v3.patch, HDFS-6826v4.patch, HDFS-6826v5.patch, HDFS-6826v6.patch, HDFS-6826v7.1.patch, HDFS-6826v7.2.patch, HDFS-6826v7.patch, HDFSPluggableAuthorizationProposal-v2.pdf, HDFSPluggableAuthorizationProposal.pdf When Hbase data, HiveMetaStore data or Search data is accessed via services (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce permissions on corresponding entities (databases, tables, views, columns, search collections, documents). It is desirable, when the data is accessed directly by users accessing the underlying data files (i.e. from a MapReduce job), that the permission of the data files map to the permissions of the corresponding data entity (i.e. table, column family or search collection). To enable this we need to have the necessary hooks in place in the NameNode to delegate authorization to an external system that can map HDFS files/directories to data entities and resolve their permissions based on the data entities permissions. I’ll be posting a design proposal in the next few days. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-4852) libhdfs documentation is out of date
[ https://issues.apache.org/jira/browse/HDFS-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107492#comment-14107492 ] Haohui Mai commented on HDFS-4852: -- +1 libhdfs documentation is out of date Key: HDFS-4852 URL: https://issues.apache.org/jira/browse/HDFS-4852 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.0.4-alpha Reporter: Andrew Wang Assignee: Chris Nauroth Priority: Minor Labels: docs, libhdfs, noob Attachments: HDFS-4852.1.patch, hadoop-site.tar.bz2 The current libhdfs documentation is available here: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/LibHdfs.html The paths and env variables here are out of date. HADOOP_PREFIX should probably be HADOOP_HDFS_HOME, and things are no longer located in {{/src/c++/libhdfs}}. There's also some missing text: The libhdfs APIs are a subset of: .. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-4852) libhdfs documentation is out of date
[ https://issues.apache.org/jira/browse/HDFS-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-4852: Hadoop Flags: Reviewed Thank you, Haohui. I'll commit this after Jenkins runs it. libhdfs documentation is out of date Key: HDFS-4852 URL: https://issues.apache.org/jira/browse/HDFS-4852 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.0.4-alpha Reporter: Andrew Wang Assignee: Chris Nauroth Priority: Minor Labels: docs, libhdfs, noob Attachments: HDFS-4852.1.patch, hadoop-site.tar.bz2 The current libhdfs documentation is available here: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/LibHdfs.html The paths and env variables here are out of date. HADOOP_PREFIX should probably be HADOOP_HDFS_HOME, and things are no longer located in {{/src/c++/libhdfs}}. There's also some missing text: The libhdfs APIs are a subset of: .. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6634) inotify in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Thomas updated HDFS-6634: --- Attachment: inotify-design.4.pdf Integrated the content from inotify-intro into inotify-design as [~cmccabe] suggested. inotify in HDFS --- Key: HDFS-6634 URL: https://issues.apache.org/jira/browse/HDFS-6634 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client, namenode, qjm Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6634.2.patch, HDFS-6634.3.patch, HDFS-6634.4.patch, HDFS-6634.5.patch, HDFS-6634.patch, inotify-design.2.pdf, inotify-design.3.pdf, inotify-design.4.pdf, inotify-design.pdf, inotify-intro.2.pdf, inotify-intro.pdf Design a mechanism for applications like search engines to access the HDFS edit stream. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6916) Like NameNode, NFS gateway should do name-id mapping with multiple sources
Brandon Li created HDFS-6916: Summary: Like NameNode, NFS gateway should do name-id mapping with multiple sources Key: HDFS-6916 URL: https://issues.apache.org/jira/browse/HDFS-6916 Project: Hadoop HDFS Issue Type: Bug Reporter: Brandon Li Like what's already done in the NameNode, NFS should also do the name-id mapping in a similar way, e.g., shell/ldap/composite mappings. The difference here is that the NN maps from user name to group list, while NFS maps from name to id. Some problems have been found with the current name-id mapping: the LDAP server has lots of user accounts and returns a limited number of entries for each search request. The current code (IdUserGroup) uses a shell command to retrieve user accounts. One shell command might not get the complete list, e.g., due to some limit set in the LDAP server. Even if it does, it's not necessary to cache all user accounts in memory. -- This message was sent by Atlassian JIRA (v6.2#6252)
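The composite lookup proposed here — consult each configured source in order and fall back to the next on a miss — might look roughly like the sketch below. The interface and class names are invented for illustration; they are not the actual IdUserGroup code:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

interface IdMappingSource {
    // Returns the numeric id for a user name, if this source knows it.
    Optional<Integer> uidForName(String name);
}

// A trivial in-memory source standing in for a static file, shell, or LDAP backend.
class StaticFileSource implements IdMappingSource {
    private final Map<String, Integer> table = new HashMap<>();
    StaticFileSource put(String name, int uid) { table.put(name, uid); return this; }
    public Optional<Integer> uidForName(String name) {
        return Optional.ofNullable(table.get(name));
    }
}

// Composite mapping: consult sources in order (e.g. static file, then shell,
// then LDAP), so one incomplete source does not break the whole lookup.
class CompositeIdMapping implements IdMappingSource {
    private final List<IdMappingSource> sources;
    CompositeIdMapping(IdMappingSource... sources) { this.sources = Arrays.asList(sources); }
    public Optional<Integer> uidForName(String name) {
        for (IdMappingSource s : sources) {
            Optional<Integer> uid = s.uidForName(name);
            if (uid.isPresent()) return uid;
        }
        return Optional.empty();  // resolve on demand instead of caching everyone
    }
}
```

A lookup-on-demand composite like this also sidesteps the truncation problem described above: no single source has to enumerate every LDAP account up front.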
[jira] [Commented] (HDFS-6898) DN must reserve space for a full block when an RBW block is created
[ https://issues.apache.org/jira/browse/HDFS-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107580#comment-14107580 ] Zhe Zhang commented on HDFS-6898: - A related question is whether DN should pre-allocate disk space for received RBW blocks. If the underlying local FS supports fallocate, this will improve the disk layout of the block. DN must reserve space for a full block when an RBW block is created --- Key: HDFS-6898 URL: https://issues.apache.org/jira/browse/HDFS-6898 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.5.0 Reporter: Gopal V Assignee: Arpit Agarwal Attachments: HDFS-6898.01.patch, HDFS-6898.03.patch DN will successfully create two RBW blocks on the same volume even if the free space is sufficient for just one full block. One or both block writers may subsequently get a DiskOutOfSpace exception. This can be avoided by allocating space up front. -- This message was sent by Atlassian JIRA (v6.2#6252)
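The up-front allocation the issue asks for amounts to per-volume bookkeeping: when an RBW replica is created, a full block's worth of space is set aside, and the reservation is released when the replica is finalized at its real length. A simplified sketch with invented names (not the actual FsVolumeImpl code, and without the optional fallocate step discussed in the comment):

```java
// Simplified model of a DN volume that reserves a full block of space per
// RBW replica, so two writers cannot both be promised the same free space.
class Volume {
    private long capacity;  // bytes not yet occupied by finalized data
    private long reserved;  // bytes promised to in-flight RBW replicas

    Volume(long capacity) { this.capacity = capacity; }

    long available() { return capacity - reserved; }

    // Called when an RBW replica is created: reserve a full block up front.
    boolean reserveForRbw(long blockSize) {
        if (available() < blockSize) {
            return false;  // refused here, instead of a later DiskOutOfSpace
        }
        reserved += blockSize;
        return true;
    }

    // Called on finalize: the replica now occupies its real length on disk.
    void finalizeReplica(long blockSize, long actualLength) {
        reserved -= blockSize;
        capacity -= actualLength;
    }

    public static void main(String[] args) {
        Volume v = new Volume(200);
        System.out.println(v.reserveForRbw(128));  // first writer is admitted
        System.out.println(v.reserveForRbw(128));  // second is refused up front
    }
}
```

Pre-allocating the file itself (the fallocate question) would be a further step on top of this accounting, trading a syscall at replica creation for a better on-disk layout.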
[jira] [Commented] (HDFS-6800) Determine how Datanode layout changes should interact with rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107588#comment-14107588 ] Arpit Agarwal commented on HDFS-6800: - Discussed offline with [~brandonli] and [~sureshms]. Adding back the {{-rollback}} flag should be okay. I will review the patch more carefully by tomorrow; I've forgotten some of the context on DN upgrades. I don't recall if we had a budget for the DN restart interval; the detailed design for rolling upgrades is at https://issues.apache.org/jira/secure/attachment/12632843/HDFSRollingUpgradesHighLevelDesign.v3.pdf. Determine how Datanode layout changes should interact with rolling upgrade -- Key: HDFS-6800 URL: https://issues.apache.org/jira/browse/HDFS-6800 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: James Thomas Attachments: HDFS-6800.2.patch, HDFS-6800.3.patch, HDFS-6800.4.patch, HDFS-6800.patch We need to handle attempts to rolling-upgrade the DataNode to a new storage directory layout. One approach is to disallow such upgrades. If we choose this approach, we should make sure that the system administrator gets a helpful error message and a clean failure when trying to use rolling upgrade to a version that doesn't support it. Based on the compatibility guarantees described in HDFS-5535, this would mean that *any* future DataNode layout changes would require a major version upgrade. Another approach would be to support rolling upgrade from an old DN storage layout to a new layout. This approach requires us to change our documentation to explain to users that they should supply the {{\-rollback}} command on the command-line when re-starting the DataNodes during rolling rollback. Currently the documentation just says to restart the DataNode normally. Another issue here is that the DataNode's usage message describes rollback options that no longer exist. 
The help text says that the DN supports {{\-rollingupgrade rollback}}, but this option was removed by HDFS-6005. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6800) Determine how Datanode layout changes should interact with rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107591#comment-14107591 ] Arpit Agarwal commented on HDFS-6800: - Rollback time is probably not as important. If a rollback is required, the cluster would already incur a serious downtime. So on second thoughts I don't think there is any perf concern with this specific patch. Determine how Datanode layout changes should interact with rolling upgrade -- Key: HDFS-6800 URL: https://issues.apache.org/jira/browse/HDFS-6800 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: James Thomas Attachments: HDFS-6800.2.patch, HDFS-6800.3.patch, HDFS-6800.4.patch, HDFS-6800.patch We need to handle attempts to rolling-upgrade the DataNode to a new storage directory layout. One approach is to disallow such upgrades. If we choose this approach, we should make sure that the system administrator gets a helpful error message and a clean failure when trying to use rolling upgrade to a version that doesn't support it. Based on the compatibility guarantees described in HDFS-5535, this would mean that *any* future DataNode layout changes would require a major version upgrade. Another approach would be to support rolling upgrade from an old DN storage layout to a new layout. This approach requires us to change our documentation to explain to users that they should supply the {{\-rollback}} command on the command-line when re-starting the DataNodes during rolling rollback. Currently the documentation just says to restart the DataNode normally. Another issue here is that the DataNode's usage message describes rollback options that no longer exist. The help text says that the DN supports {{\-rollingupgrade rollback}}, but this option was removed by HDFS-6005. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6800) Determine how Datanode layout changes should interact with rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107594#comment-14107594 ] James Thomas commented on HDFS-6800: Right, that's what I figured. Will talk to [~cmccabe] about his comments and post an updated patch hopefully today, but in any case the changes should be minor and won't affect the overall direction of the patch. Thanks again for your help. Determine how Datanode layout changes should interact with rolling upgrade -- Key: HDFS-6800 URL: https://issues.apache.org/jira/browse/HDFS-6800 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: James Thomas Attachments: HDFS-6800.2.patch, HDFS-6800.3.patch, HDFS-6800.4.patch, HDFS-6800.patch We need to handle attempts to rolling-upgrade the DataNode to a new storage directory layout. One approach is to disallow such upgrades. If we choose this approach, we should make sure that the system administrator gets a helpful error message and a clean failure when trying to use rolling upgrade to a version that doesn't support it. Based on the compatibility guarantees described in HDFS-5535, this would mean that *any* future DataNode layout changes would require a major version upgrade. Another approach would be to support rolling upgrade from an old DN storage layout to a new layout. This approach requires us to change our documentation to explain to users that they should supply the {{\-rollback}} command on the command-line when re-starting the DataNodes during rolling rollback. Currently the documentation just says to restart the DataNode normally. Another issue here is that the DataNode's usage message describes rollback options that no longer exist. The help text says that the DN supports {{\-rollingupgrade rollback}}, but this option was removed by HDFS-6005. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6376) Distcp data between two HA clusters requires another configuration
[ https://issues.apache.org/jira/browse/HDFS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6376: Attachment: HDFS-6376.008.patch Since the remaining work is trivial, I will just update the patch for [~dlmarion]. The changes included: # Wrap the exclusion logic in DFSUtil#getNNServiceRpcAddressesIncluded, # Use {{getTrimmedStringCollection}} instead of {{getStringCollection}} # Rename the new conf property to dfs.remote.cluster.nameservices # Add more unit tests Distcp data between two HA clusters requires another configuration -- Key: HDFS-6376 URL: https://issues.apache.org/jira/browse/HDFS-6376 Project: Hadoop HDFS Issue Type: Bug Components: datanode, federation, hdfs-client Affects Versions: 2.2.0, 2.3.0, 2.4.0 Environment: Hadoop 2.3.0 Reporter: Dave Marion Assignee: Dave Marion Fix For: 3.0.0 Attachments: HDFS-6376-2.patch, HDFS-6376-3-branch-2.4.patch, HDFS-6376-4-branch-2.4.patch, HDFS-6376-5-trunk.patch, HDFS-6376-6-trunk.patch, HDFS-6376-7-trunk.patch, HDFS-6376-branch-2.4.patch, HDFS-6376-patch-1.patch, HDFS-6376.008.patch User has to create a third set of configuration files for distcp when transferring data between two HA clusters. Consider the scenario in [1]. You cannot put all of the required properties in core-site.xml and hdfs-site.xml for the client to resolve the location of both active namenodes. If you do, then the datanodes from cluster A may join cluster B. I can not find a configuration option that tells the datanodes to federate blocks for only one of the clusters in the configuration. [1] http://mail-archives.apache.org/mod_mbox/hadoop-user/201404.mbox/%3CBAY172-W2133964E0C283968C161DD1520%40phx.gbl%3E -- This message was sent by Atlassian JIRA (v6.2#6252)
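The two parsing changes in this patch (trimming the configured list, and excluding the remote nameservices) can be illustrated with plain string handling. {{getNameServices}} below is a hypothetical helper written for this sketch, not the actual DFSUtil method:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NameServiceConf {
    // Split a comma-separated nameservice list, trimming whitespace the way
    // Configuration#getTrimmedStringCollection does (getStringCollection does
    // not trim), so " ns1, ns2" cannot produce a bogus " ns2" service id.
    static List<String> getNameServices(String value, List<String> remoteExcluded) {
        List<String> result = new ArrayList<>();
        for (String s : value.split(",")) {
            String trimmed = s.trim();
            // Skip empties and any nameservice flagged as remote-only, so the
            // local DNs do not try to federate blocks for the other cluster.
            if (!trimmed.isEmpty() && !remoteExcluded.contains(trimmed)) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(getNameServices(" ns1, ns2 ,ns3", Arrays.asList("ns3")));
    }
}
```

With an exclusion list like the proposed dfs.remote.cluster.nameservices, the client configuration can name both clusters' nameservices while the DNs only register with the local one.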
[jira] [Updated] (HDFS-6867) For DFSOutputStream, do pipeline recovery for a single block in the background
[ https://issues.apache.org/jira/browse/HDFS-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-6867: Attachment: HDFS-6867-design-20140822.pdf Updated design which provides strong consistency without relying on variable-length blocks. For DFSOutputStream, do pipeline recovery for a single block in the background -- Key: HDFS-6867 URL: https://issues.apache.org/jira/browse/HDFS-6867 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Attachments: HDFS-6867-design-20140820.pdf, HDFS-6867-design-20140821.pdf, HDFS-6867-design-20140822.pdf For DFSOutputStream, we should be able to do pipeline recovery in the background, while the user is continuing to write to the file. This is especially useful for long-lived clients that write to an HDFS file slowly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions
[ https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107619#comment-14107619 ] Hadoop QA commented on HDFS-6826: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12663679/HDFS-6826v7.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestHDFSServerPorts org.apache.hadoop.hdfs.server.namenode.TestStartup org.apache.hadoop.hdfs.server.namenode.TestFsLimits org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.server.namenode.TestValidateConfigurationSettings org.apache.hadoop.hdfs.server.namenode.TestEditLogRace org.apache.hadoop.hdfs.server.namenode.TestSaveNamespace org.apache.hadoop.hdfs.server.namenode.TestNameNodeRecovery org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7722//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7722//console This message is automatically generated. 
Plugin interface to enable delegation of HDFS authorization assertions -- Key: HDFS-6826 URL: https://issues.apache.org/jira/browse/HDFS-6826 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.4.1 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFS-6826-idea.patch, HDFS-6826-idea2.patch, HDFS-6826v3.patch, HDFS-6826v4.patch, HDFS-6826v5.patch, HDFS-6826v6.patch, HDFS-6826v7.1.patch, HDFS-6826v7.2.patch, HDFS-6826v7.patch, HDFSPluggableAuthorizationProposal-v2.pdf, HDFSPluggableAuthorizationProposal.pdf When Hbase data, HiveMetaStore data or Search data is accessed via services (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce permissions on corresponding entities (databases, tables, views, columns, search collections, documents). It is desirable, when the data is accessed directly by users accessing the underlying data files (i.e. from a MapReduce job), that the permission of the data files map to the permissions of the corresponding data entity (i.e. table, column family or search collection). To enable this we need to have the necessary hooks in place in the NameNode to delegate authorization to an external system that can map HDFS files/directories to data entities and resolve their permissions based on the data entities permissions. I’ll be posting a design proposal in the next few days. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-2975) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart).
[ https://issues.apache.org/jira/browse/HDFS-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107620#comment-14107620 ] Hadoop QA commented on HDFS-2975: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12663663/HDFS-2975.001.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7726//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7726//console This message is automatically generated. Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart). --- Key: HDFS-2975 URL: https://issues.apache.org/jira/browse/HDFS-2975 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Uma Maheswara Rao G Assignee: Yi Liu Attachments: HDFS-2975.001.patch When we rename the file with overwrite flag as true, it will delete the destination file blocks. 
After deleting the blocks, whenever the NN releases the FSNamesystem lock it can hand the invalidation work to the corresponding DNs, which then delete the blocks. In parallel, the NN syncs the rename-related edits to the edit log file. If the NN crashes before those edits are synced, it can get stuck in safemode on restart: the blocks have already been deleted from the DNs as part of the invalidations, but the dst file still exists because the rename edits were never persisted to the log, so no DN will ever report those blocks. This is similar to HDFS-2815 -- This message was sent by Atlassian JIRA (v6.2#6252)
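The ordering hazard described above can be modeled with a short, self-contained sketch (plain Java, not Hadoop code; the block and edit names are made up for illustration):

```java
// A minimal model (illustrative, not Hadoop code) of the hazard: the
// destination file's block is invalidated on the DataNodes before the
// rename edit is durably logged, so a crash in that window leaves the
// namespace expecting a block that no DataNode will ever report.
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RenameOverwriteRace {
    // Blocks physically present on the DataNodes.
    static final Set<String> dataNodeBlocks = new HashSet<>();
    // Durable edit log; only persisted entries survive a NameNode crash.
    static final List<String> persistedEdits = new ArrayList<>();

    // Rename with overwrite, in the hazardous order: invalidate the dst
    // block first, sync the rename edit second. crashBeforeSync models a
    // NameNode crash in the window between the two steps.
    static void renameOverwrite(boolean crashBeforeSync) {
        dataNodeBlocks.remove("blk_dst");        // DNs delete the block
        if (crashBeforeSync) {
            return;                              // crash: edit never synced
        }
        persistedEdits.add("RENAME src->dst");   // edit reaches disk
    }

    // After restart the NameNode replays persisted edits. If the rename was
    // never logged, dst still exists in the namespace, but its block is gone
    // from every DataNode, so the safe-mode block threshold is never met.
    static boolean stuckInSafeMode() {
        boolean dstStillInNamespace = !persistedEdits.contains("RENAME src->dst");
        return dstStillInNamespace && !dataNodeBlocks.contains("blk_dst");
    }

    public static void main(String[] args) {
        dataNodeBlocks.add("blk_dst");
        renameOverwrite(true);                   // crash in the window
        System.out.println(stuckInSafeMode() ? "stuck in safemode" : "ok");
    }
}
```

Under this model, one way to close the window is to persist the rename edit before scheduling the invalidation work, which matches the reasoning above; whether that is the exact shape of the eventual patch is not established here.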
[jira] [Commented] (HDFS-6376) Distcp data between two HA clusters requires another configuration
[ https://issues.apache.org/jira/browse/HDFS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107623#comment-14107623 ] Haohui Mai commented on HDFS-6376: -- I think it might make more sense to explicitly specify the name service that the DNs should report to. Since the changes are trivial, I'll provide another patch. Distcp data between two HA clusters requires another configuration -- Key: HDFS-6376 URL: https://issues.apache.org/jira/browse/HDFS-6376 Project: Hadoop HDFS Issue Type: Bug Components: datanode, federation, hdfs-client Affects Versions: 2.2.0, 2.3.0, 2.4.0 Environment: Hadoop 2.3.0 Reporter: Dave Marion Assignee: Dave Marion Fix For: 3.0.0 Attachments: HDFS-6376-2.patch, HDFS-6376-3-branch-2.4.patch, HDFS-6376-4-branch-2.4.patch, HDFS-6376-5-trunk.patch, HDFS-6376-6-trunk.patch, HDFS-6376-7-trunk.patch, HDFS-6376-branch-2.4.patch, HDFS-6376-patch-1.patch, HDFS-6376.008.patch User has to create a third set of configuration files for distcp when transferring data between two HA clusters. Consider the scenario in [1]. You cannot put all of the required properties in core-site.xml and hdfs-site.xml for the client to resolve the location of both active namenodes. If you do, then the datanodes from cluster A may join cluster B. I can not find a configuration option that tells the datanodes to federate blocks for only one of the clusters in the configuration. [1] http://mail-archives.apache.org/mod_mbox/hadoop-user/201404.mbox/%3CBAY172-W2133964E0C283968C161DD1520%40phx.gbl%3E -- This message was sent by Atlassian JIRA (v6.2#6252)
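One way to realize "explicitly specify the name service that the DNs should report to" is a datanode-side configuration override. The sketch below is a hypothetical hdfs-site.xml fragment; the property name `dfs.internal.nameservices` and the service names are assumptions for illustration, not confirmed by this thread:

```xml
<!-- hdfs-site.xml on cluster A's datanodes (sketch): both name services
     are defined so a distcp client can resolve either cluster's active
     NameNode, but the DataNodes only register with and report blocks to
     clusterA. -->
<property>
  <name>dfs.nameservices</name>
  <value>clusterA,clusterB</value>
</property>
<property>
  <name>dfs.internal.nameservices</name>
  <value>clusterA</value>
</property>
```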
[jira] [Commented] (HDFS-4486) Add log category for long-running DFSClient notices
[ https://issues.apache.org/jira/browse/HDFS-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107642#comment-14107642 ] Hadoop QA commented on HDFS-4486: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12663695/HDFS-4486-20140822.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.ha.TestZKFailoverController org.apache.hadoop.ha.TestZKFailoverControllerStress org.apache.hadoop.hdfs.TestLeaseRecovery2 org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7721//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/7721//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7721//console This message is automatically generated. 
Add log category for long-running DFSClient notices --- Key: HDFS-4486 URL: https://issues.apache.org/jira/browse/HDFS-4486 Project: Hadoop HDFS Issue Type: Improvement Reporter: Todd Lipcon Assignee: Zhe Zhang Priority: Minor Attachments: HDFS-4486-20140820.patch, HDFS-4486-20140821.patch, HDFS-4486-20140822.patch, hdfs-4486-20140821-2.patch There are a number of features in the DFS client which are transparent but can make a fairly big difference for performance -- two in particular are short circuit reads and native checksumming. Because we don't want log spew for clients like hadoop fs -cat we currently log only at DEBUG level when these features are disabled. This makes it difficult to troubleshoot/verify for long-running perf-sensitive clients like HBase. One simple solution is to add a new log category - eg o.a.h.h.DFSClient.PerformanceAdvisory - which long-running clients could enable at DEBUG level without getting the full debug spew. -- This message was sent by Atlassian JIRA (v6.2#6252)
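For a long-running client such as HBase, the proposal above amounts to a one-line logging-configuration change. A sketch, using the category name suggested in the issue description (the name actually committed may differ):

```properties
# log4j.properties fragment (sketch): enable only the proposed advisory
# category at DEBUG, without turning on general DFSClient debug output.
log4j.logger.org.apache.hadoop.hdfs.DFSClient.PerformanceAdvisory=DEBUG
```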
[jira] [Commented] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107645#comment-14107645 ] Yongjun Zhang commented on HDFS-6776: - Hi [~wheat9], thanks for the elaboration. I assume you are talking about {quote} What happens if the WebHdfsFileSystem intends to connect to a secure cluster but the attacker has somehow disabled the security of the cluster or successfully launch a MITM attack which keep returning NullToken? Instead of ignoring the failures, I think that WebHdfsFileSystem should fail explicitly because that the system is compromised. This is important as copying from a secure cluster to an insecure cluster can unintentionally breach the confidentiality. {quote} If secure HDFS can be attacked to return NullToken all the time, I wonder whether it can also be attacked to return a dummy token? Hi [~daryn] and [~tucu00], would you please comment? Many thanks. distcp from insecure cluster (source) to secure cluster (destination) doesn't work -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch Issuing the distcp command at the secure cluster side, trying to copy data from the insecure cluster to the secure cluster, we see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null,
sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at
[jira] [Updated] (HDFS-4486) Add log category for long-running DFSClient notices
[ https://issues.apache.org/jira/browse/HDFS-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-4486: Attachment: HDFS-4486-20140822-2.patch Adding Apache license header. Add log category for long-running DFSClient notices --- Key: HDFS-4486 URL: https://issues.apache.org/jira/browse/HDFS-4486 Project: Hadoop HDFS Issue Type: Improvement Reporter: Todd Lipcon Assignee: Zhe Zhang Priority: Minor Attachments: HDFS-4486-20140820.patch, HDFS-4486-20140821.patch, HDFS-4486-20140822-2.patch, HDFS-4486-20140822.patch, hdfs-4486-20140821-2.patch There are a number of features in the DFS client which are transparent but can make a fairly big difference for performance -- two in particular are short circuit reads and native checksumming. Because we don't want log spew for clients like hadoop fs -cat we currently log only at DEBUG level when these features are disabled. This makes it difficult to troubleshoot/verify for long-running perf-sensitive clients like HBase. One simple solution is to add a new log category - eg o.a.h.h.DFSClient.PerformanceAdvisory - which long-running clients could enable at DEBUG level without getting the full debug spew. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6865) Byte array native checksumming on client side (HDFS changes)
[ https://issues.apache.org/jira/browse/HDFS-6865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107661#comment-14107661 ] Hadoop QA commented on HDFS-6865: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12663708/HDFS-6865.5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.web.TestWebHDFSAcl The following test timeouts occurred in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.security.token.block.TestBlockToken {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7723//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7723//console This message is automatically generated. 
Byte array native checksumming on client side (HDFS changes) Key: HDFS-6865 URL: https://issues.apache.org/jira/browse/HDFS-6865 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, performance Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6865.2.patch, HDFS-6865.3.patch, HDFS-6865.4.patch, HDFS-6865.5.patch, HDFS-6865.patch Refactor FSOutputSummer to buffer data and use the native checksum calculation functionality introduced in HADOOP-10975. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6634) inotify in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107708#comment-14107708 ] Hadoop QA commented on HDFS-6634: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12663753/inotify-design.4.pdf against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7730//console This message is automatically generated. inotify in HDFS --- Key: HDFS-6634 URL: https://issues.apache.org/jira/browse/HDFS-6634 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client, namenode, qjm Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6634.2.patch, HDFS-6634.3.patch, HDFS-6634.4.patch, HDFS-6634.5.patch, HDFS-6634.patch, inotify-design.2.pdf, inotify-design.3.pdf, inotify-design.4.pdf, inotify-design.pdf, inotify-intro.2.pdf, inotify-intro.pdf Design a mechanism for applications like search engines to access the HDFS edit stream. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-4486) Add log category for long-running DFSClient notices
[ https://issues.apache.org/jira/browse/HDFS-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107714#comment-14107714 ] Colin Patrick McCabe commented on HDFS-4486: It's weird that the NoClassDefFound exceptions are gone in the latest Jenkins run. Like you, I could not reproduce them locally either. Given that the test runs with NoClassDefFound had some other environment problems (like many "too many open file descriptors" messages), this seems likely to be a Jenkins environment issue. The test failures here aren't related (this is a trivial logging change). +1. Add log category for long-running DFSClient notices --- Key: HDFS-4486 URL: https://issues.apache.org/jira/browse/HDFS-4486 Project: Hadoop HDFS Issue Type: Improvement Reporter: Todd Lipcon Assignee: Zhe Zhang Priority: Minor Attachments: HDFS-4486-20140820.patch, HDFS-4486-20140821.patch, HDFS-4486-20140822-2.patch, HDFS-4486-20140822.patch, hdfs-4486-20140821-2.patch There are a number of features in the DFS client which are transparent but can make a fairly big difference for performance -- two in particular are short circuit reads and native checksumming. Because we don't want log spew for clients like hadoop fs -cat we currently log only at DEBUG level when these features are disabled. This makes it difficult to troubleshoot/verify for long-running perf-sensitive clients like HBase. One simple solution is to add a new log category - eg o.a.h.h.DFSClient.PerformanceAdvisory - which long-running clients could enable at DEBUG level without getting the full debug spew. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6376) Distcp data between two HA clusters requires another configuration
[ https://issues.apache.org/jira/browse/HDFS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107734#comment-14107734 ] Hadoop QA commented on HDFS-6376: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12663768/HDFS-6376.008.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.fs.TestEnhancedByteBufferAccess {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7729//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7729//console This message is automatically generated. 
Distcp data between two HA clusters requires another configuration -- Key: HDFS-6376 URL: https://issues.apache.org/jira/browse/HDFS-6376 Project: Hadoop HDFS Issue Type: Bug Components: datanode, federation, hdfs-client Affects Versions: 2.2.0, 2.3.0, 2.4.0 Environment: Hadoop 2.3.0 Reporter: Dave Marion Assignee: Dave Marion Fix For: 3.0.0 Attachments: HDFS-6376-2.patch, HDFS-6376-3-branch-2.4.patch, HDFS-6376-4-branch-2.4.patch, HDFS-6376-5-trunk.patch, HDFS-6376-6-trunk.patch, HDFS-6376-7-trunk.patch, HDFS-6376-branch-2.4.patch, HDFS-6376-patch-1.patch, HDFS-6376.008.patch User has to create a third set of configuration files for distcp when transferring data between two HA clusters. Consider the scenario in [1]. You cannot put all of the required properties in core-site.xml and hdfs-site.xml for the client to resolve the location of both active namenodes. If you do, then the datanodes from cluster A may join cluster B. I can not find a configuration option that tells the datanodes to federate blocks for only one of the clusters in the configuration. [1] http://mail-archives.apache.org/mod_mbox/hadoop-user/201404.mbox/%3CBAY172-W2133964E0C283968C161DD1520%40phx.gbl%3E -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6376) Distcp data between two HA clusters requires another configuration
[ https://issues.apache.org/jira/browse/HDFS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6376: - Attachment: HDFS-6376.000.patch Distcp data between two HA clusters requires another configuration -- Key: HDFS-6376 URL: https://issues.apache.org/jira/browse/HDFS-6376 Project: Hadoop HDFS Issue Type: Bug Components: datanode, federation, hdfs-client Affects Versions: 2.2.0, 2.3.0, 2.4.0 Environment: Hadoop 2.3.0 Reporter: Dave Marion Assignee: Dave Marion Fix For: 3.0.0 Attachments: HDFS-6376-2.patch, HDFS-6376-3-branch-2.4.patch, HDFS-6376-4-branch-2.4.patch, HDFS-6376-5-trunk.patch, HDFS-6376-6-trunk.patch, HDFS-6376-7-trunk.patch, HDFS-6376-branch-2.4.patch, HDFS-6376-patch-1.patch, HDFS-6376.000.patch, HDFS-6376.008.patch User has to create a third set of configuration files for distcp when transferring data between two HA clusters. Consider the scenario in [1]. You cannot put all of the required properties in core-site.xml and hdfs-site.xml for the client to resolve the location of both active namenodes. If you do, then the datanodes from cluster A may join cluster B. I can not find a configuration option that tells the datanodes to federate blocks for only one of the clusters in the configuration. [1] http://mail-archives.apache.org/mod_mbox/hadoop-user/201404.mbox/%3CBAY172-W2133964E0C283968C161DD1520%40phx.gbl%3E -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6729) Support maintenance mode for DN
[ https://issues.apache.org/jira/browse/HDFS-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107756#comment-14107756 ] Hadoop QA commented on HDFS-6729: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12663686/HDFS-6729.001.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestEncryptionZones org.apache.hadoop.hdfs.server.datanode.TestBPOfferService org.apache.hadoop.security.TestRefreshUserMappings org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7725//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7725//console This message is automatically generated. 
Support maintenance mode for DN --- Key: HDFS-6729 URL: https://issues.apache.org/jira/browse/HDFS-6729 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6729.000.patch, HDFS-6729.001.patch Some maintenance work (e.g., upgrading RAM or adding disks) on a DataNode takes only a short amount of time (e.g., 10 minutes). In these cases, users do not want missing blocks to be reported for this DN, because the DN will be back online shortly without data loss. Thus, we need a maintenance mode for a DN so that maintenance work can be carried out on the DN without having to decommission it or have it marked as dead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6899) Allow changing MiniDFSCluster volumes per DN and capacity per volume
[ https://issues.apache.org/jira/browse/HDFS-6899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107755#comment-14107755 ] Hadoop QA commented on HDFS-6899: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12663707/HDFS-6899.06.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestEncryptionZones org.apache.hadoop.hdfs.TestMissingBlocksAlert org.apache.hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport org.apache.hadoop.hdfs.server.namenode.web.resources.TestWebHdfsDataLocality {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7720//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7720//console This message is automatically generated. 
Allow changing MiniDFSCluster volumes per DN and capacity per volume Key: HDFS-6899 URL: https://issues.apache.org/jira/browse/HDFS-6899 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, test Affects Versions: 2.5.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6899.01.patch, HDFS-6899.02.patch, HDFS-6899.03.patch, HDFS-6899.04.patch, HDFS-6899.05.patch, HDFS-6899.06.patch MiniDFSCluster hardcodes the number of directories per volume to two. Propose removing the hard-coded restriction. It would be useful to limit the capacity of individual storage directories for testing purposes. There is already a way to do so for SimulatedFSDataset, we can add one when using real volumes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5135) Umbrella JIRA for NFS end to end unit test frameworks
[ https://issues.apache.org/jira/browse/HDFS-5135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107757#comment-14107757 ] Brandon Li commented on HDFS-5135: -- [~zhz], the fragment decoder will always send the whole buffer to the next message handler, but the buffer read position will be moved by 4 bytes. In your case, the XDR is built on a ByteBuffer instead of a ChannelBuffer, e.g.: {noformat} @Override public void messageReceived(ChannelHandlerContext ctx, MessageEvent e) { ChannelBuffer buf = (ChannelBuffer) e.getMessage(); ByteBuffer b = buf.toByteBuffer().asReadOnlyBuffer(); XDR rsp = new XDR(b, XDR.State.READING); // Get handle from create response ... RpcReply reply = RpcReply.read(rsp); ... {noformat} Also, in order to support parallel testing, the test should use an ephemeral port: config.setInt("nfs3.server.port", 0); If there is currently no way to know which port is bound by NFS, we should add a method to either Nfs3/RpcProgramNfs3 to get it. [~zhezhang], please feel free to create sub-tasks for your patches. Umbrella JIRA for NFS end to end unit test frameworks - Key: HDFS-5135 URL: https://issues.apache.org/jira/browse/HDFS-5135 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Zhe Zhang Attachments: TestRPCMessagesInNFS.java Currently, we have to manually start portmap and nfs3 processes to test patches and new functionality. This JIRA tracks the effort to introduce a test framework for NFS unit tests without starting standalone nfs3 processes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (HDFS-5135) Umbrella JIRA for NFS end to end unit test frameworks
[ https://issues.apache.org/jira/browse/HDFS-5135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107757#comment-14107757 ] Brandon Li edited comment on HDFS-5135 at 8/23/14 12:38 AM: [~zhz], the fragment decoder will always send the whole buffer to the next message handler, but the buffer read position will be moved by 4 bytes. In your case, the XDR is built on a ByteBuffer instead of ChannelBuffer, e.g: {noformat} @Override public void messageReceived(ChannelHandlerContext ctx, MessageEvent e) { ChannelBuffer buf = (ChannelBuffer) e.getMessage(); ByteBuffer b = buf.toByteBuffer().asReadOnlyBuffer(); XDR rsp = new XDR(b, XDR.State.READING); // Get handle from create response ... RpcReply reply = RpcReply.read(rsp); ... {noformat} Also, in order to support parallel testing, the test should use a ephemeral port: config.setInt(nfs3.server.port, 0); If currently there is not way to know which port is bound by NFS, we should add a method to either Nfs3/RpcProgramNfs3 to get it. Please feel free to create sub-tasks for your patches. was (Author: brandonli): [~zhz], the fragment decoder will always send the whole buffer to the next message handler, but the buffer read position will be moved by 4 bytes. In your case, the XDR is built on a ByteBuffer instead of ChannelBuffer, e.g: {noformat} @Override public void messageReceived(ChannelHandlerContext ctx, MessageEvent e) { ChannelBuffer buf = (ChannelBuffer) e.getMessage(); ByteBuffer b = buf.toByteBuffer().asReadOnlyBuffer(); XDR rsp = new XDR(b, XDR.State.READING); // Get handle from create response ... RpcReply reply = RpcReply.read(rsp); ... {noformat} Also, in order to support parallel testing, the test should use a ephemeral port: config.setInt(nfs3.server.port, 0); If currently there is not way to know which port is bound by NFS, we should add a method to either Nfs3/RpcProgramNfs3 to get it. [~zhezhang], please feel free to create sub-tasks for your patches. 
Umbrella JIRA for NFS end to end unit test frameworks - Key: HDFS-5135 URL: https://issues.apache.org/jira/browse/HDFS-5135 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Zhe Zhang Attachments: TestRPCMessagesInNFS.java Currently, we have to manually start portmap and nfs3 processes to test patch and new functionalities. This JIRA is to track the effort to introduce a test framework to NFS unit test without starting standalone nfs3 processes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6917) Add an hdfs debug command to validate blocks, call recoverlease, etc.
Colin Patrick McCabe created HDFS-6917: -- Summary: Add an hdfs debug command to validate blocks, call recoverlease, etc. Key: HDFS-6917 URL: https://issues.apache.org/jira/browse/HDFS-6917 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe HDFS should have a debug command which could validate HDFS block files, call recoverLease, and have some other functionality. These commands would be purely for debugging and would appear under a separate command hierarchy inside the hdfs command. There would be no guarantee of API stability for these commands and the debug submenu would not be listed by just typing the hdfs command. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6917) Add an hdfs debug command to validate blocks, call recoverlease, etc.
[ https://issues.apache.org/jira/browse/HDFS-6917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6917: --- Status: Patch Available (was: Open) Add an hdfs debug command to validate blocks, call recoverlease, etc. - Key: HDFS-6917 URL: https://issues.apache.org/jira/browse/HDFS-6917 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6917.001.patch HDFS should have a debug command which could validate HDFS block files, call recoverLease, and have some other functionality. These commands would be purely for debugging and would appear under a separate command hierarchy inside the hdfs command. There would be no guarantee of API stability for these commands and the debug submenu would not be listed by just typing the hdfs command. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6917) Add an hdfs debug command to validate blocks, call recoverlease, etc.
[ https://issues.apache.org/jira/browse/HDFS-6917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6917: --- Attachment: HDFS-6917.001.patch Add an hdfs debug command to validate blocks, call recoverlease, etc. - Key: HDFS-6917 URL: https://issues.apache.org/jira/browse/HDFS-6917 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6917.001.patch HDFS should have a debug command which could validate HDFS block files, call recoverLease, and have some other functionality. These commands would be purely for debugging and would appear under a separate command hierarchy inside the hdfs command. There would be no guarantee of API stability for these commands and the debug submenu would not be listed by just typing the hdfs command. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6917) Add an hdfs debug command to validate blocks, call recoverlease, etc.
[ https://issues.apache.org/jira/browse/HDFS-6917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107778#comment-14107778 ] Allen Wittenauer commented on HDFS-6917: Please put the debug command in sorted order in the case statement. Add an hdfs debug command to validate blocks, call recoverlease, etc. - Key: HDFS-6917 URL: https://issues.apache.org/jira/browse/HDFS-6917 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6917.001.patch HDFS should have a debug command which could validate HDFS block files, call recoverLease, and have some other functionality. These commands would be purely for debugging and would appear under a separate command hierarchy inside the hdfs command. There would be no guarantee of API stability for these commands and the debug submenu would not be listed by just typing the hdfs command. -- This message was sent by Atlassian JIRA (v6.2#6252)
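The review note above asks for the debug subcommands to stay sorted in the dispatch case statement. The actual patch modifies the hdfs shell script, but the same idea can be sketched in Java as a hidden, sorted dispatch table; every name here (DebugCommandDispatch, verifyBlock, recoverLease handlers) is hypothetical and for illustration only.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Illustrative sketch only: a "debug" submenu whose subcommands are kept in
// sorted order (via TreeMap) and which is not advertised by the top-level
// help, matching the description in the JIRA.
public class DebugCommandDispatch {
    // TreeMap iterates keys in sorted order, satisfying the review request.
    private final Map<String, Function<String[], Integer>> commands = new TreeMap<>();

    public DebugCommandDispatch() {
        commands.put("recoverLease", args -> { /* would call recoverLease on a path */ return 0; });
        commands.put("verifyBlock", args -> { /* would validate an HDFS block file */ return 0; });
    }

    public int run(String[] args) {
        if (args.length == 0 || !commands.containsKey(args[0])) {
            // Unknown subcommand: print usage for the debug submenu only;
            // plain "hdfs" help would not list "debug" at all.
            System.err.println("Usage: hdfs debug <command> ...");
            return 1;
        }
        return commands.get(args[0]).apply(args);
    }

    public static void main(String[] args) {
        System.exit(new DebugCommandDispatch().run(args));
    }
}
```

Because the submenu is a separate hierarchy with no stability guarantee, keeping it out of the default help output is just a matter of never printing its table from the top-level usage text.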
[jira] [Commented] (HDFS-4852) libhdfs documentation is out of date
[ https://issues.apache.org/jira/browse/HDFS-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107796#comment-14107796 ] Hadoop QA commented on HDFS-4852: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12663742/HDFS-4852.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.security.TestRefreshUserMappings org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7728//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7728//console This message is automatically generated. 
libhdfs documentation is out of date Key: HDFS-4852 URL: https://issues.apache.org/jira/browse/HDFS-4852 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.0.4-alpha Reporter: Andrew Wang Assignee: Chris Nauroth Priority: Minor Labels: docs, libhdfs, noob Attachments: HDFS-4852.1.patch, hadoop-site.tar.bz2 The current libhdfs documentation is available here: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/LibHdfs.html The paths and env variables here are out of date. HADOOP_PREFIX should probably be HADOOP_HDFS_HOME, and things are no longer located in {{/src/c++/libhdfs}}. There's also some missing text: The libhdfs APIs are a subset of: .. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6917) Add an hdfs debug command to validate blocks, call recoverlease, etc.
[ https://issues.apache.org/jira/browse/HDFS-6917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107797#comment-14107797 ] Hadoop QA commented on HDFS-6917: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12663808/HDFS-6917.001.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The test build failed in hadoop-hdfs-project/hadoop-hdfs {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7734//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7734//console This message is automatically generated. Add an hdfs debug command to validate blocks, call recoverlease, etc. - Key: HDFS-6917 URL: https://issues.apache.org/jira/browse/HDFS-6917 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6917.001.patch HDFS should have a debug command which could validate HDFS block files, call recoverLease, and have some other functionality. 
These commands would be purely for debugging and would appear under a separate command hierarchy inside the hdfs command. There would be no guarantee of API stability for these commands and the debug submenu would not be listed by just typing the hdfs command. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6879) Adding tracing to Hadoop RPC
[ https://issues.apache.org/jira/browse/HDFS-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-6879: --- Attachment: HDFS-6879-0.patch attaching the patch split from HDFS-5274. I posted the patch on [review board|https://reviews.apache.org/r/24999/] too. Adding tracing to Hadoop RPC Key: HDFS-6879 URL: https://issues.apache.org/jira/browse/HDFS-6879 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Masatake Iwasaki Attachments: HDFS-6879-0.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6879) Adding tracing to Hadoop RPC
[ https://issues.apache.org/jira/browse/HDFS-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107799#comment-14107799 ] Masatake Iwasaki commented on HDFS-6879: The patch consists of: * adding optional tracing info to the RPC RequestHeader; * adding handling of tracing info to Client and Server; * adding a span receiver which sends tracing info to a collector; * adding code to initialize the receiver in NameNode and DataNode. Adding tracing to Hadoop RPC Key: HDFS-6879 URL: https://issues.apache.org/jira/browse/HDFS-6879 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Masatake Iwasaki Attachments: HDFS-6879-0.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
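The core mechanism in the comment above — carrying optional trace context in the RPC request header so the server can link its span to the client's — can be sketched as follows. This is a simplified stand-in, not the patch itself: the real change uses protobuf RequestHeader fields and HTrace spans, while the TraceInfo/RequestHeader classes here are invented for illustration.

```java
import java.util.Optional;

// Hedged sketch: optional trace context travels inside the RPC request
// header; when absent, the call proceeds untraced with no overhead.
final class TraceInfo {
    final long traceId;
    final long parentSpanId;
    TraceInfo(long traceId, long parentSpanId) {
        this.traceId = traceId;
        this.parentSpanId = parentSpanId;
    }
}

final class RequestHeader {
    final String methodName;
    final Optional<TraceInfo> trace;   // optional: empty when tracing is off
    RequestHeader(String methodName, TraceInfo trace) {
        this.methodName = methodName;
        this.trace = Optional.ofNullable(trace);
    }
}

public class RpcTracingSketch {
    // Client side: attach the current span, if any, to the outgoing header.
    static RequestHeader buildHeader(String method, TraceInfo currentSpan) {
        return new RequestHeader(method, currentSpan);
    }

    // Server side: continue the trace only when the header carries one;
    // a span receiver would then ship the finished span to a collector.
    static String handle(RequestHeader header) {
        return header.trace
                .map(t -> "span for " + header.methodName + " child of " + t.parentSpanId)
                .orElse("untraced call to " + header.methodName);
    }

    public static void main(String[] args) {
        System.out.println(handle(buildHeader("getBlockLocations", new TraceInfo(42L, 7L))));
        System.out.println(handle(buildHeader("renewLease", null)));
    }
}
```

Making the header field optional is what keeps the change backward compatible: peers that know nothing about tracing simply never set or read it.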
[jira] [Commented] (HDFS-4486) Add log category for long-running DFSClient notices
[ https://issues.apache.org/jira/browse/HDFS-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107803#comment-14107803 ] Hadoop QA commented on HDFS-4486: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12663785/HDFS-4486-20140822-2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.security.token.delegation.web.TestWebDelegationToken The following test timeouts occurred in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7731//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7731//console This message is automatically generated. 
Add log category for long-running DFSClient notices --- Key: HDFS-4486 URL: https://issues.apache.org/jira/browse/HDFS-4486 Project: Hadoop HDFS Issue Type: Improvement Reporter: Todd Lipcon Assignee: Zhe Zhang Priority: Minor Attachments: HDFS-4486-20140820.patch, HDFS-4486-20140821.patch, HDFS-4486-20140822-2.patch, HDFS-4486-20140822.patch, hdfs-4486-20140821-2.patch There are a number of features in the DFS client which are transparent but can make a fairly big difference for performance -- two in particular are short-circuit reads and native checksumming. Because we don't want log spew for clients like hadoop fs -cat, we currently log only at DEBUG level when these features are disabled. This makes it difficult to troubleshoot/verify for long-running perf-sensitive clients like HBase. One simple solution is to add a new log category - e.g. o.a.h.h.DFSClient.PerformanceAdvisory - which long-running clients could enable at DEBUG level without getting the full debug spew. -- This message was sent by Atlassian JIRA (v6.2#6252)
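The proposal above boils down to routing advisories through a dedicated child logger so one category can be raised to DEBUG on its own. A minimal sketch, using java.util.logging rather than Hadoop's actual commons-logging/log4j stack, with a made-up helper name:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch of the idea (java.util.logging stand-in; Hadoop itself uses
// commons-logging/log4j): advisories go to their own named category so a
// long-running client like HBase can enable just this logger at FINE/DEBUG
// without turning on the full DFSClient debug output.
public class PerformanceAdvisorySketch {
    // Dedicated category, independent of the main client logger's level.
    static final Logger PERF_ADVISORY =
            Logger.getLogger("org.apache.hadoop.hdfs.DFSClient.PerformanceAdvisory");

    // Hypothetical helper: called where the client silently falls back today.
    static void reportShortCircuitDisabled(String reason) {
        // Logged at FINE: invisible to 'hadoop fs -cat'-style clients,
        // visible to clients that raise this one category.
        PERF_ADVISORY.fine("Short-circuit reads disabled: " + reason);
    }

    public static void main(String[] args) {
        PERF_ADVISORY.setLevel(Level.FINE);   // what a perf-sensitive client would configure
        reportShortCircuitDisabled("domain socket path not configured");
        System.out.println(PERF_ADVISORY.getName());
    }
}
```

Because logger names form a hierarchy, the category can sit under the DFSClient logger yet be enabled independently of it in the logging configuration.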
[jira] [Commented] (HDFS-6908) incorrect snapshot directory diff generated by snapshot deletion
[ https://issues.apache.org/jira/browse/HDFS-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107809#comment-14107809 ] Hadoop QA commented on HDFS-6908: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12663603/HDFS-6908.002.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.security.TestRefreshUserMappings org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7732//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7732//console This message is automatically generated. 
incorrect snapshot directory diff generated by snapshot deletion Key: HDFS-6908 URL: https://issues.apache.org/jira/browse/HDFS-6908 Project: Hadoop HDFS Issue Type: Bug Components: snapshots Reporter: Juan Yu Assignee: Juan Yu Priority: Critical Attachments: HDFS-6908.001.patch, HDFS-6908.002.patch In the following scenario, deleting a snapshot can generate an incorrect snapshot directory diff and a corrupted fsimage; if you restart the NN after that, you will get a NullPointerException. 1. create a directory and create a file under it 2. take a snapshot 3. create another file under that directory 4. take a second snapshot 5. delete both files and the directory 6. delete the second snapshot An incorrect directory diff will be generated. Restarting the NN will then throw an NPE: {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.addToDeletedList(FSImageFormatPBSnapshot.java:246) at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadDeletedList(FSImageFormatPBSnapshot.java:265) at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadDirectoryDiffList(FSImageFormatPBSnapshot.java:328) at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadSnapshotDiffSection(FSImageFormatPBSnapshot.java:192) at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:254) at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:168) at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:208) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:906) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:892) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:715) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:653) at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:276) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:882) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:629) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:498) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:554) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6376) Distcp data between two HA clusters requires another configuration
[ https://issues.apache.org/jira/browse/HDFS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107821#comment-14107821 ] Hadoop QA commented on HDFS-6376: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12663801/HDFS-6376.000.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.TestFileCreation org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract org.apache.hadoop.security.TestRefreshUserMappings org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7733//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7733//console This message is automatically generated. 
Distcp data between two HA clusters requires another configuration -- Key: HDFS-6376 URL: https://issues.apache.org/jira/browse/HDFS-6376 Project: Hadoop HDFS Issue Type: Bug Components: datanode, federation, hdfs-client Affects Versions: 2.2.0, 2.3.0, 2.4.0 Environment: Hadoop 2.3.0 Reporter: Dave Marion Assignee: Dave Marion Fix For: 3.0.0 Attachments: HDFS-6376-2.patch, HDFS-6376-3-branch-2.4.patch, HDFS-6376-4-branch-2.4.patch, HDFS-6376-5-trunk.patch, HDFS-6376-6-trunk.patch, HDFS-6376-7-trunk.patch, HDFS-6376-branch-2.4.patch, HDFS-6376-patch-1.patch, HDFS-6376.000.patch, HDFS-6376.008.patch A user has to create a third set of configuration files for distcp when transferring data between two HA clusters. Consider the scenario in [1]. You cannot put all of the required properties in core-site.xml and hdfs-site.xml for the client to resolve the location of both active namenodes. If you do, then the datanodes from cluster A may join cluster B. I cannot find a configuration option that tells the datanodes to federate blocks for only one of the clusters in the configuration. [1] http://mail-archives.apache.org/mod_mbox/hadoop-user/201404.mbox/%3CBAY172-W2133964E0C283968C161DD1520%40phx.gbl%3E -- This message was sent by Atlassian JIRA (v6.2#6252)
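To make the conflict above concrete, here is an illustrative hdfs-site.xml fragment (nameservice and host names are made up, not taken from the issue). Declaring both HA nameservices lets a distcp client resolve either cluster's active NameNode, but a DataNode started with the same file sees two nameservices to serve, which is exactly the "datanodes from cluster A may join cluster B" problem.

```xml
<!-- Illustrative client-side hdfs-site.xml fragment; names are hypothetical. -->
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>clusterA,clusterB</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.clusterA</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.clusterA.nn1</name>
    <value>a-nn1.example.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.clusterA.nn2</name>
    <value>a-nn2.example.com:8020</value>
  </property>
  <!-- ...matching dfs.ha.namenodes.clusterB and rpc-address entries... -->
</configuration>
```

This is why, without a knob restricting which nameservices a DataNode serves, the client-side and server-side views of dfs.nameservices cannot share one configuration set.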
[jira] [Commented] (HDFS-6899) Allow changing MiniDFSCluster volumes per DN and capacity per volume
[ https://issues.apache.org/jira/browse/HDFS-6899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107830#comment-14107830 ] Jing Zhao commented on HDFS-6899: - The failed unit tests all passed in my local run. +1 for the 06 patch. Allow changing MiniDFSCluster volumes per DN and capacity per volume Key: HDFS-6899 URL: https://issues.apache.org/jira/browse/HDFS-6899 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, test Affects Versions: 2.5.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6899.01.patch, HDFS-6899.02.patch, HDFS-6899.03.patch, HDFS-6899.04.patch, HDFS-6899.05.patch, HDFS-6899.06.patch MiniDFSCluster hardcodes the number of directories per volume to two. Propose removing the hard-coded restriction. It would be useful to limit the capacity of individual storage directories for testing purposes. There is already a way to do so for SimulatedFSDataset, we can add one when using real volumes. -- This message was sent by Atlassian JIRA (v6.2#6252)
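The improvement above has two knobs: the number of storage directories per DataNode (currently hard-coded to two) and a capacity cap per directory. A self-contained sketch of the shape such a configuration could take — this is not the real MiniDFSCluster API, and every class and method name here is invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the proposed knobs: choose how many storage
// directories each test DataNode gets, and cap each directory's capacity,
// instead of the hard-coded two directories per DN.
public class MiniClusterVolumesSketch {
    static final class Volume {
        final String dir;
        final long capacityBytes;
        Volume(String dir, long capacityBytes) {
            this.dir = dir;
            this.capacityBytes = capacityBytes;
        }
    }

    // Builds the volume layout one DataNode would receive from the builder.
    static List<Volume> buildDataNodeVolumes(int dnIndex, int volumesPerDn, long capacityPerVolume) {
        List<Volume> vols = new ArrayList<>();
        for (int v = 0; v < volumesPerDn; v++) {
            vols.add(new Volume("data/dn" + dnIndex + "/vol" + v, capacityPerVolume));
        }
        return vols;
    }

    public static void main(String[] args) {
        // Three 1 MB volumes for DN 0, instead of the hard-coded two.
        List<Volume> vols = buildDataNodeVolumes(0, 3, 1L << 20);
        System.out.println(vols.size() + " volumes, " + vols.get(0).capacityBytes + " bytes each");
    }
}
```

Per-volume caps matter for tests that need a directory to fill up deterministically, which the issue notes is already possible with SimulatedFSDataset but not with real volumes.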