[jira] [Updated] (HDFS-7206) Fix warning of token.Token: Cannot find class for token kind kms-dt for KMS when running jobs on Encryption zones
[ https://issues.apache.org/jira/browse/HDFS-7206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-7206: - Description: This issue is produced when running MapReduce job and encryption zones are configured. {quote} 14/10/09 05:06:02 INFO security.TokenCache: Got dt for hdfs://hnode1.sh.intel.com:9000; Kind: HDFS_DELEGATION_TOKEN, Service: 10.239.47.8:9000, Ident: (HDFS_DELEGATION_TOKEN token 21 for user) 14/10/09 05:06:02 WARN token.Token: Cannot find class for token kind kms-dt 14/10/09 05:06:02 INFO security.TokenCache: Got dt for hdfs://hnode1.sh.intel.com:9000; Kind: kms-dt, Service: 10.239.47.8:16000, Ident: 00 04 75 73 65 72 04 79 61 72 6e 00 8a 01 48 f1 8e 85 07 8a 01 49 15 9b 09 07 04 02 14/10/09 05:06:03 INFO input.FileInputFormat: Total input paths to process : 1 14/10/09 05:06:03 INFO mapreduce.JobSubmitter: number of splits:1 14/10/09 05:06:03 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_141272197_0004 14/10/09 05:06:03 WARN token.Token: Cannot find class for token kind kms-dt 14/10/09 05:06:03 WARN token.Token: Cannot find class for token kind kms-dt {quote} was: This issue is produced when running MapReduce job and encryption zones are configured. {quote} 14/10/08 02:30:53 WARN token.Token: Cannot find class for token kind kms-dt 14/10/08 02:30:53 WARN token.Token: Cannot find class for token kind kms-dt {quote} Fix warning of token.Token: Cannot find class for token kind kms-dt for KMS when running jobs on Encryption zones --- Key: HDFS-7206 URL: https://issues.apache.org/jira/browse/HDFS-7206 Project: Hadoop HDFS Issue Type: Bug Components: encryption Affects Versions: 2.6.0 Reporter: Yi Liu Assignee: Yi Liu This issue is produced when running MapReduce job and encryption zones are configured. 
{quote} 14/10/09 05:06:02 INFO security.TokenCache: Got dt for hdfs://hnode1.sh.intel.com:9000; Kind: HDFS_DELEGATION_TOKEN, Service: 10.239.47.8:9000, Ident: (HDFS_DELEGATION_TOKEN token 21 for user) 14/10/09 05:06:02 WARN token.Token: Cannot find class for token kind kms-dt 14/10/09 05:06:02 INFO security.TokenCache: Got dt for hdfs://hnode1.sh.intel.com:9000; Kind: kms-dt, Service: 10.239.47.8:16000, Ident: 00 04 75 73 65 72 04 79 61 72 6e 00 8a 01 48 f1 8e 85 07 8a 01 49 15 9b 09 07 04 02 14/10/09 05:06:03 INFO input.FileInputFormat: Total input paths to process : 1 14/10/09 05:06:03 INFO mapreduce.JobSubmitter: number of splits:1 14/10/09 05:06:03 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_141272197_0004 14/10/09 05:06:03 WARN token.Token: Cannot find class for token kind kms-dt 14/10/09 05:06:03 WARN token.Token: Cannot find class for token kind kms-dt {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7206) Fix warning of token.Token: Cannot find class for token kind kms-dt for KMS when running jobs on Encryption zones
[ https://issues.apache.org/jira/browse/HDFS-7206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14164812#comment-14164812 ] Yi Liu commented on HDFS-7206: -- This warning occurs because {{tokenKindMap}} is initialized through the service loader of {{TokenIdentifier}}, and the _token kind_ is constant for each registered identifier. But {{org.apache.hadoop.security.token.delegation.web.DelegationTokenIdentifier}} has not been added to the provider-configuration file, and its _token kind_ is variable, so the service loader can't find it. The dt for KMS falls into this case. It works fine because the server side constructs that kind of token identifier directly; the warnings happen on the client side, which tries to log the token and therefore needs to decode it. The patch simply downgrades the warnings to debug messages; otherwise there are too many such warnings and they will confuse users. Fix warning of token.Token: Cannot find class for token kind kms-dt for KMS when running jobs on Encryption zones --- Key: HDFS-7206 URL: https://issues.apache.org/jira/browse/HDFS-7206 Project: Hadoop HDFS Issue Type: Bug Components: encryption Affects Versions: 2.6.0 Reporter: Yi Liu Assignee: Yi Liu This issue occurs when running a MapReduce job with encryption zones configured. 
{quote} 14/10/09 05:06:02 INFO security.TokenCache: Got dt for hdfs://hnode1.sh.intel.com:9000; Kind: HDFS_DELEGATION_TOKEN, Service: 10.239.47.8:9000, Ident: (HDFS_DELEGATION_TOKEN token 21 for user) 14/10/09 05:06:02 WARN token.Token: Cannot find class for token kind kms-dt 14/10/09 05:06:02 INFO security.TokenCache: Got dt for hdfs://hnode1.sh.intel.com:9000; Kind: kms-dt, Service: 10.239.47.8:16000, Ident: 00 04 75 73 65 72 04 79 61 72 6e 00 8a 01 48 f1 8e 85 07 8a 01 49 15 9b 09 07 04 02 14/10/09 05:06:03 INFO input.FileInputFormat: Total input paths to process : 1 14/10/09 05:06:03 INFO mapreduce.JobSubmitter: number of splits:1 14/10/09 05:06:03 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_141272197_0004 14/10/09 05:06:03 WARN token.Token: Cannot find class for token kind kms-dt 14/10/09 05:06:03 WARN token.Token: Cannot find class for token kind kms-dt {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
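The lookup path described in the comment above can be sketched as follows. This is a minimal stand-in, not Hadoop's actual internals: the map is seeded by hand rather than by {{ServiceLoader<TokenIdentifier>}}, and only the WARN-to-DEBUG downgrade mirrors what the patch does.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: token kinds registered via the service loader live in a map keyed
// by their constant kind string. A kind with a variable name (like kms-dt)
// is never registered, so the lookup misses; the fix is to log the miss at
// DEBUG instead of WARN.
public class TokenKindLookup {
    static final Map<String, Class<?>> tokenKindMap = new HashMap<>();
    static {
        // In real Hadoop this is filled from ServiceLoader<TokenIdentifier>.
        tokenKindMap.put("HDFS_DELEGATION_TOKEN", Object.class);
    }

    static Class<?> classForKind(String kind) {
        Class<?> cls = tokenKindMap.get(kind);
        if (cls == null) {
            // Was LOG.warn(...); the patch downgrades this to LOG.debug(...).
            System.out.println("DEBUG: Cannot find class for token kind " + kind);
        }
        return cls;
    }
}
```

Decoding a token whose kind is unknown still works for logging purposes; the identifier simply stays as raw bytes, which is why a debug message suffices.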
[jira] [Updated] (HDFS-7090) Use unbuffered writes when persisting in-memory replicas
[ https://issues.apache.org/jira/browse/HDFS-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-7090: - Attachment: HDFS-7090.0.patch Attached an initial patch that uses JNI to call sendfile() on Linux and CopyFileEx on Windows for unbuffered file copy. Use unbuffered writes when persisting in-memory replicas Key: HDFS-7090 URL: https://issues.apache.org/jira/browse/HDFS-7090 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Xiaoyu Yao Attachments: HDFS-7090.0.patch The LazyWriter thread just uses {{FileUtils.copyFile}} to copy block files to persistent storage. It would be better to use unbuffered writes to avoid churning the page cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
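For comparison, the JDK itself can hand a file copy to the kernel: {{FileChannel.transferTo}} maps to sendfile(2) on Linux, avoiding a user-space buffer. A minimal sketch in that style (this is not the JNI approach the patch takes, just an illustration of kernel-delegated copying):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class UnbufferedCopy {
    // Copy src to dst, delegating to the kernel where possible.
    // transferTo may move fewer bytes than requested, so loop until done.
    static void copy(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                 StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                 StandardOpenOption.TRUNCATE_EXISTING)) {
            long pos = 0, size = in.size();
            while (pos < size) {
                pos += in.transferTo(pos, size - pos, out);
            }
        }
    }
}
```

Note that transferTo still populates the page cache for the destination file on most platforms, which is why a native sendfile()/CopyFileEx path with unbuffered flags can do better for this use case.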
[jira] [Commented] (HDFS-5175) Provide clients a way to set IP header bits on connections
[ https://issues.apache.org/jira/browse/HDFS-5175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14164835#comment-14164835 ] Haohui Mai commented on HDFS-5175: -- bq. HttpsURLConnection has setSSLSocketFactory. Need to test out if it works for regular http. Hooking URLConnectionFactory in the filesystem object should be able to address the problem for webhdfs. Provide clients a way to set IP header bits on connections -- Key: HDFS-5175 URL: https://issues.apache.org/jira/browse/HDFS-5175 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.5-alpha Reporter: Lohit Vijayarenu It would be very helpful if clients had the ability to set IP headers when they make socket connections for data transfers. We were looking into setting up QoS using the DSCP bits and saw that there is no easy way to let clients pass down a specific value when they make connections to a DataNode. As a quick fix we did something similar to io.file.buffer.size, where the client could pass down a DSCP integer value, and when DFSClient opens a stream it could set the value on the socket using setTrafficClass. Opening this JIRA to get more input from others who have had experience and might have already thought about this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
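The quick fix described above comes down to one call on the socket before connecting. A minimal sketch; the DSCP-to-TOS shift follows the IP header layout (DSCP occupies the upper 6 bits of the TOS byte), while the helper names are invented here:

```java
import java.io.IOException;
import java.net.Socket;

public class DscpSocket {
    // DSCP is a 6-bit field in the upper bits of the IP TOS byte,
    // so a DSCP value is shifted left by 2 before being applied.
    static int tosForDscp(int dscp) {
        return (dscp & 0x3f) << 2;
    }

    // Hypothetical helper: build a socket carrying the given DSCP value.
    static Socket newSocketWithDscp(int dscp) throws IOException {
        Socket s = new Socket();
        // Must be set before connect() for it to affect the connection.
        s.setTrafficClass(tosForDscp(dscp));
        return s;
    }
}
```

For example, the Expedited Forwarding class (DSCP 46) becomes TOS 184. Note that setTrafficClass is a hint; some platforms or unprivileged processes may silently ignore it.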
[jira] [Commented] (HDFS-7210) Avoid two separate RPC's namenode.append() and namenode.getFileInfo() for an append call from DFSClient
[ https://issues.apache.org/jira/browse/HDFS-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14164888#comment-14164888 ] Hadoop QA commented on HDFS-7210: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12673828/HDFS-7210-003.patch against trunk revision 2a51494. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDFSInotifyEventInputStream org.apache.hadoop.hdfs.server.namenode.TestDiskspaceQuotaUpdate org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure org.apache.hadoop.hdfs.TestLeaseRecovery2 org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotBlocksMap org.apache.hadoop.hdfs.TestQuota org.apache.hadoop.hdfs.TestPersistBlocks org.apache.hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA org.apache.hadoop.hdfs.server.namenode.TestINodeFile org.apache.hadoop.hdfs.TestFileAppend2 org.apache.hadoop.hdfs.TestGetFileChecksum org.apache.hadoop.hdfs.TestFileAppend org.apache.hadoop.hdfs.TestDecommission org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength org.apache.hadoop.fs.contract.hdfs.TestHDFSContractAppend org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshot org.apache.hadoop.hdfs.server.mover.TestStorageMover org.apache.hadoop.hdfs.TestDFSShell org.apache.hadoop.fs.TestSymlinkHdfsFileContext org.apache.hadoop.hdfs.server.namenode.snapshot.TestINodeFileUnderConstructionWithSnapshot org.apache.hadoop.hdfs.TestFileAppendRestart org.apache.hadoop.fs.TestSymlinkHdfsFileSystem org.apache.hadoop.hdfs.TestFileAppend3 org.apache.hadoop.hdfs.server.namenode.TestFSImageWithSnapshot org.apache.hadoop.hdfs.TestAppendDifferentChecksum The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDefaultNameNodePort {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. 
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8370//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8370//artifact/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8370//console This message is automatically generated. Avoid two separate RPC's namenode.append() and namenode.getFileInfo() for an append call from DFSClient --- Key: HDFS-7210 URL: https://issues.apache.org/jira/browse/HDFS-7210 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-7210-001.patch, HDFS-7210-002.patch, HDFS-7210-003.patch Currently DFSClient does 2 RPCs to the namenode for an append operation: {{append()}} for re-opening the file and getting the last block, and another, {{getFileInfo()}}, to get the HdfsFileStatus. If we can combine the results of these 2 calls into one RPC, it can reduce load on the NameNode. For backward compatibility we need to keep the existing {{append()}} call as is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6133) Make Balancer support exclude specified path
[ https://issues.apache.org/jira/browse/HDFS-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhaoyunjiong updated HDFS-6133: --- Attachment: HDFS-6133-3.patch Updated the patch to merge with trunk. {quote} Why do we always pass false below? new Sender(out).writeBlock(b, accessToken, clientname, targets, srcNode, stage, 0, 0, 0, 0, blockSender.getChecksum(), cachingStrategy, false); {quote} This code path happens when the NameNode asks a DataNode to send a block to another DataNode (DatanodeProtocol.DNA_TRANSFER); it's not triggered by the client, so there is no need to pin the block in this case. {quote} We will never copy a block? if (datanode.data.getPinning(block)) { String msg = "Not able to copy block " + block.getBlockId() + " to " + peer.getRemoteAddressString() + " because it's pinned "; LOG.info(msg); sendResponse(ERROR, msg); } Anything to help ensure the replica count does not rot when this pinning is enabled? {quote} When the block is under-replicated, the NameNode will send a DatanodeProtocol.DNA_TRANSFER command to the DataNode, and it is handled by DataTransfer; pinning won't affect that. Make Balancer support exclude specified path Key: HDFS-6133 URL: https://issues.apache.org/jira/browse/HDFS-6133 Project: Hadoop HDFS Issue Type: Improvement Components: balancer mover, namenode Reporter: zhaoyunjiong Assignee: zhaoyunjiong Attachments: HDFS-6133-1.patch, HDFS-6133-2.patch, HDFS-6133-3.patch, HDFS-6133.patch Currently, running the Balancer will destroy the RegionServer's data locality. If getBlocks could exclude blocks belonging to files with a specific path prefix, like /hbase, then we could run the Balancer without destroying the RegionServer's data locality. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
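The two code paths discussed in the review exchange above can be sketched as follows. This is an illustration of the intended behavior, not the actual datanode code: a Balancer-style copy honors the pin, while a NameNode-triggered transfer for re-replication ignores it, so replica counts don't rot.

```java
import java.util.HashSet;
import java.util.Set;

public class BlockPinning {
    final Set<Long> pinned = new HashSet<>();

    void pin(long blockId) { pinned.add(blockId); }

    // Balancer copy path (client-triggered copyBlock): refused when pinned.
    boolean tryCopy(long blockId) {
        if (pinned.contains(blockId)) {
            System.out.println("Not able to copy block " + blockId
                + " because it's pinned");
            return false;
        }
        return true;
    }

    // NN-triggered transfer (DNA_TRANSFER) for under-replicated blocks:
    // always allowed, regardless of pinning.
    boolean transferForReplication(long blockId) {
        return true;
    }
}
```

The asymmetry is the point: pinning only blocks rebalancing moves that would break locality, never the replication machinery.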
[jira] [Updated] (HDFS-7209) fill the key queue when creating encryption zone
[ https://issues.apache.org/jira/browse/HDFS-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-7209: - Attachment: HDFS-7209.001.patch The patch makes two changes: *1.* Fill the EDEK queue when creating an encryption zone. *2.* When creating a file in FSN, if {{provider}} is null, it's not necessary to hold the read lock and check whether the file is in an encryption zone; that's a bit more efficient. fill the key queue when creating encryption zone Key: HDFS-7209 URL: https://issues.apache.org/jira/browse/HDFS-7209 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption, performance Affects Versions: 2.6.0 Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-7209.001.patch Currently, when creating a file in an encryption zone for the first time, the key provider will get a bunch of keys from the KMS to fill the queue, which takes some time. We can instead initialize the key queue when the admin creates the encryption zone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
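The warm-up the patch proposes can be sketched with a simple per-key queue. All names here are illustrative stand-ins (the real code goes through the KeyProvider's ValueQueue and a KMS round trip); the point is where the fill happens:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

public class EdekQueue {
    final Map<String, Queue<String>> queues = new HashMap<>();
    final int batchSize;

    EdekQueue(int batchSize) { this.batchSize = batchSize; }

    // Proposed change: called from createEncryptionZone(), so the cost of
    // the batch fetch is paid at zone creation, by the admin.
    void warmUp(String keyName) {
        Queue<String> q = queues.computeIfAbsent(keyName, k -> new ArrayDeque<>());
        while (q.size() < batchSize) {
            q.add(generateEdek(keyName)); // stands in for a KMS round trip
        }
    }

    // First file create in the zone now hits a full queue instead of
    // blocking on the KMS.
    String take(String keyName) {
        Queue<String> q = queues.get(keyName);
        return (q == null || q.isEmpty()) ? generateEdek(keyName) : q.poll();
    }

    static String generateEdek(String keyName) {
        return keyName + "-edek";
    }
}
```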
[jira] [Commented] (HDFS-6133) Make Balancer support exclude specified path
[ https://issues.apache.org/jira/browse/HDFS-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14164904#comment-14164904 ] Hadoop QA commented on HDFS-6133: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12673853/HDFS-6133-3.patch against trunk revision 2a51494. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8372//console This message is automatically generated. Make Balancer support exclude specified path Key: HDFS-6133 URL: https://issues.apache.org/jira/browse/HDFS-6133 Project: Hadoop HDFS Issue Type: Improvement Components: balancer mover, namenode Reporter: zhaoyunjiong Assignee: zhaoyunjiong Attachments: HDFS-6133-1.patch, HDFS-6133-2.patch, HDFS-6133-3.patch, HDFS-6133.patch Currently, running the Balancer will destroy the RegionServer's data locality. If getBlocks could exclude blocks belonging to files with a specific path prefix, like /hbase, then we could run the Balancer without destroying the RegionServer's data locality. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7209) fill the key queue when creating encryption zone
[ https://issues.apache.org/jira/browse/HDFS-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-7209: - Status: Patch Available (was: Open) fill the key queue when creating encryption zone Key: HDFS-7209 URL: https://issues.apache.org/jira/browse/HDFS-7209 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption, performance Affects Versions: 2.6.0 Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-7209.001.patch Currently when creating file in an encryption zone for the first time, key provider will get bunch of keys from KMS and fill in the queue. It will take some time. We can initialize the key queue when creating the encryption zone by admin. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7210) Avoid two separate RPC's namenode.append() and namenode.getFileInfo() for an append call from DFSClient
[ https://issues.apache.org/jira/browse/HDFS-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-7210: Attachment: HDFS-7210-004.patch Fixed test failures. Avoid two separate RPC's namenode.append() and namenode.getFileInfo() for an append call from DFSClient --- Key: HDFS-7210 URL: https://issues.apache.org/jira/browse/HDFS-7210 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-7210-001.patch, HDFS-7210-002.patch, HDFS-7210-003.patch, HDFS-7210-004.patch Currently DFSClient does 2 RPCs to the namenode for an append operation: {{append()}} for re-opening the file and getting the last block, and another, {{getFileInfo()}}, to get the HdfsFileStatus. If we can combine the results of these 2 calls into one RPC, it can reduce load on the NameNode. For backward compatibility we need to keep the existing {{append()}} call as is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7014) Implement input and output streams to DataNode for native client
[ https://issues.apache.org/jira/browse/HDFS-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14164923#comment-14164923 ] Zhanwei Wang commented on HDFS-7014: Hi [~cmccabe] Thanks for assigning it to me. I'm going to remove exceptions from the C++ API according to HDFS-7207. Implement input and output streams to DataNode for native client Key: HDFS-7014 URL: https://issues.apache.org/jira/browse/HDFS-7014 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Zhanwei Wang Assignee: Zhanwei Wang Attachments: 0001-HDFS-7014-001.patch, HDFS-7014-pnative.002.patch, HDFS-7014.patch Implement the Client - Namenode RPC protocol and support Namenode HA. Implement the Client - Datanode RPC protocol. Implement some basic server-side classes such as ExtendedBlock and LocatedBlock. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7210) Avoid two separate RPC's namenode.append() and namenode.getFileInfo() for an append call from DFSClient
[ https://issues.apache.org/jira/browse/HDFS-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165003#comment-14165003 ] Hadoop QA commented on HDFS-7210: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12673861/HDFS-7210-004.patch against trunk revision 2a51494. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestCrcCorruption {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8374//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8374//artifact/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8374//console This message is automatically generated. 
Avoid two separate RPC's namenode.append() and namenode.getFileInfo() for an append call from DFSClient --- Key: HDFS-7210 URL: https://issues.apache.org/jira/browse/HDFS-7210 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-7210-001.patch, HDFS-7210-002.patch, HDFS-7210-003.patch, HDFS-7210-004.patch Currently DFSClient does 2 RPCs to the namenode for an append operation: {{append()}} for re-opening the file and getting the last block, and another, {{getFileInfo()}}, to get the HdfsFileStatus. If we can combine the results of these 2 calls into one RPC, it can reduce load on the NameNode. For backward compatibility we need to keep the existing {{append()}} call as is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7209) fill the key queue when creating encryption zone
[ https://issues.apache.org/jira/browse/HDFS-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165002#comment-14165002 ] Hadoop QA commented on HDFS-7209: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12673854/HDFS-7209.001.patch against trunk revision 2a51494. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication org.apache.hadoop.hdfs.TestHDFSFileSystemContract org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestCrcCorruption {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8373//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8373//artifact/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8373//console This message is automatically generated. 
fill the key queue when creating encryption zone Key: HDFS-7209 URL: https://issues.apache.org/jira/browse/HDFS-7209 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption, performance Affects Versions: 2.6.0 Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-7209.001.patch Currently when creating file in an encryption zone for the first time, key provider will get bunch of keys from KMS and fill in the queue. It will take some time. We can initialize the key queue when creating the encryption zone by admin. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7209) fill the key queue when creating encryption zone
[ https://issues.apache.org/jira/browse/HDFS-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165021#comment-14165021 ] Yi Liu commented on HDFS-7209: -- Jenkins reports java.lang.NoSuchMethodError: org.apache.hadoop.crypto.key.kms.KMSClientProvider.getEncKeyQueueSize, but the method actually exists; re-uploading the patch to re-trigger Jenkins. fill the key queue when creating encryption zone Key: HDFS-7209 URL: https://issues.apache.org/jira/browse/HDFS-7209 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption, performance Affects Versions: 2.6.0 Reporter: Yi Liu Assignee: Yi Liu Currently, when creating a file in an encryption zone for the first time, the key provider will get a bunch of keys from the KMS to fill the queue, which takes some time. We can instead initialize the key queue when the admin creates the encryption zone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7209) fill the key queue when creating encryption zone
[ https://issues.apache.org/jira/browse/HDFS-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-7209: - Attachment: (was: HDFS-7209.001.patch) fill the key queue when creating encryption zone Key: HDFS-7209 URL: https://issues.apache.org/jira/browse/HDFS-7209 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption, performance Affects Versions: 2.6.0 Reporter: Yi Liu Assignee: Yi Liu Currently when creating file in an encryption zone for the first time, key provider will get bunch of keys from KMS and fill in the queue. It will take some time. We can initialize the key queue when creating the encryption zone by admin. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7209) fill the key queue when creating encryption zone
[ https://issues.apache.org/jira/browse/HDFS-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-7209: - Attachment: HDFS-7209.001.patch fill the key queue when creating encryption zone Key: HDFS-7209 URL: https://issues.apache.org/jira/browse/HDFS-7209 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption, performance Affects Versions: 2.6.0 Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-7209.001.patch Currently when creating file in an encryption zone for the first time, key provider will get bunch of keys from KMS and fill in the queue. It will take some time. We can initialize the key queue when creating the encryption zone by admin. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7210) Avoid two separate RPC's namenode.append() and namenode.getFileInfo() for an append call from DFSClient
[ https://issues.apache.org/jira/browse/HDFS-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165028#comment-14165028 ] Vinayakumar B commented on HDFS-7210: - The failures and release audit warnings are not related to the current patch; the release audit warning is from a YARN commit. Avoid two separate RPC's namenode.append() and namenode.getFileInfo() for an append call from DFSClient --- Key: HDFS-7210 URL: https://issues.apache.org/jira/browse/HDFS-7210 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-7210-001.patch, HDFS-7210-002.patch, HDFS-7210-003.patch, HDFS-7210-004.patch Currently DFSClient does 2 RPCs to the namenode for an append operation: {{append()}} for re-opening the file and getting the last block, and another, {{getFileInfo()}}, to get the HdfsFileStatus. If we can combine the results of these 2 calls into one RPC, it can reduce load on the NameNode. For backward compatibility we need to keep the existing {{append()}} call as is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7202) Should be able to omit package name of SpanReceiver on hadoop trace -add
[ https://issues.apache.org/jira/browse/HDFS-7202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165036#comment-14165036 ] Hudson commented on HDFS-7202: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #706 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/706/]) HDFS-7202. Should be able to omit package name of SpanReceiver on hadoop trace -add (iwasakims via cmccabe) (cmccabe: rev d996235285e5047f731e3d3fc4c6e6214caa10aa) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/tracing/TestTraceAdmin.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/tracing/SpanReceiverHost.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Should be able to omit package name of SpanReceiver on hadoop trace -add -- Key: HDFS-7202 URL: https://issues.apache.org/jira/browse/HDFS-7202 Project: Hadoop HDFS Issue Type: Bug Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Fix For: 2.7.0 Attachments: HDFS-7202-0.patch This is not consistent with the configuration from file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7203) Concurrent appending to the same file can cause data corruption
[ https://issues.apache.org/jira/browse/HDFS-7203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165037#comment-14165037 ] Hudson commented on HDFS-7203: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #706 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/706/]) HDFS-7203. Concurrent appending to the same file can cause data (kihwal: rev 853cb704edf54207313c0e70c9c375212d288b60) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend3.java Concurrent appending to the same file can cause data corruption --- Key: HDFS-7203 URL: https://issues.apache.org/jira/browse/HDFS-7203 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Attachments: HDFS-7203.patch When multiple threads are calling append against the same file, the file can get corrupt. The root of the problem is that a stale file stat may be used for append in {{DFSClient}}. If the file size changes between {{getFileStatus()}} and {{namenode.append()}}, {{DataStreamer}} will get confused about how to align data to the checksum boundary and break the assumption made by data nodes. When it happens, datanode may not write the last checksum. On the next append attempt, datanode won't be able to reposition for the partial chunk, since the last checksum is missing. The append will fail after running out of data nodes to copy the partial block to. However, if there are more threads that try to append, this leads to a more serious situation. In a few minutes, a lease recovery and block recovery will happen. The block recovery truncates the block to the ack'ed size in order to make sure to keep only the portion of data that is checksum-verified. 
The problem is, during the last successful append, the last data node verified the checksum and ack'ed before writing data and wrong metadata to the disk and all data nodes in the pipeline wrote the same wrong metadata. So the ack'ed size contains the corrupt portion of the data. Since block recovery does not perform any checksum verification, the file sizes are adjusted and after {{commitBlockSynchronization()}}, another thread will be allowed to append to the corrupt file. This latent corruption may not be detected for a very long time. The first failing {{append()}} would have created a partial copy of the block in the temporary directory of every data node in the cluster. After this failure, it is likely under replicated, so the file will be scheduled for replication after being closed. Before HDFS-6948, replication didn't work until a node is added or restarted because of the temporary file being on all data nodes. As a result, the corruption could not be detected by replication. After HDFS-6948, the corruption will be detected after the file is closed by lease recovery or subsequent append-close. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
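The alignment the client must get right can be made concrete with the chunk arithmetic involved. The 512-byte checksum chunk is HDFS's default ({{dfs.bytes-per-checksum}}); the helper names are illustrative. If DFSClient derives these values from a stale file size, DataStreamer packs data against the wrong boundary and the datanode writes the wrong partial-chunk metadata, as described above.

```java
public class ChunkAlignment {
    static final int BYTES_PER_CHECKSUM = 512; // HDFS default chunk size

    // Bytes already sitting in the current (possibly partial) checksum
    // chunk at the given file size.
    static int partialChunkLength(long fileSize) {
        return (int) (fileSize % BYTES_PER_CHECKSUM);
    }

    // Bytes the appender must write before reaching the next chunk
    // boundary, i.e. how new data has to be aligned for checksumming.
    static int bytesToBoundary(long fileSize) {
        int partial = partialChunkLength(fileSize);
        return partial == 0 ? 0 : BYTES_PER_CHECKSUM - partial;
    }
}
```

For a file of 1000 bytes, the last chunk holds 488 bytes and the appender must fill 24 more to complete it; compute those numbers from a size that is even one byte stale and every subsequent chunk boundary, and checksum, is off.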
[jira] [Updated] (HDFS-7218) FSNamesystem ACL operations should write to audit log on failure
[ https://issues.apache.org/jira/browse/HDFS-7218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7218: --- Attachment: HDFS-7218.002.patch Uploading a new patch to fix the javac warnings. The test (TestDNFencingWithReplication) fails on my local machine both with and without the patch, in the same way as in the test-patch run. The release audit warning appears to be an unrelated jenkins issue. FSNamesystem ACL operations should write to audit log on failure Key: HDFS-7218 URL: https://issues.apache.org/jira/browse/HDFS-7218 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-7218.001.patch, HDFS-7218.002.patch Various Acl methods in FSNamesystem do not write to the audit log when the operation is not successful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7202) Should be able to omit package name of SpanReceiver on hadoop trace -add
[ https://issues.apache.org/jira/browse/HDFS-7202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165141#comment-14165141 ] Hudson commented on HDFS-7202: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1896 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1896/]) HDFS-7202. Should be able to omit package name of SpanReceiver on hadoop trace -add (iwasakims via cmccabe) (cmccabe: rev d996235285e5047f731e3d3fc4c6e6214caa10aa) * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/tracing/SpanReceiverHost.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/tracing/TestTraceAdmin.java Should be able to omit package name of SpanReceiver on hadoop trace -add -- Key: HDFS-7202 URL: https://issues.apache.org/jira/browse/HDFS-7202 Project: Hadoop HDFS Issue Type: Bug Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Fix For: 2.7.0 Attachments: HDFS-7202-0.patch This is not consistent with the configuration from file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7203) Concurrent appending to the same file can cause data corruption
[ https://issues.apache.org/jira/browse/HDFS-7203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165142#comment-14165142 ] Hudson commented on HDFS-7203: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1896 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1896/]) HDFS-7203. Concurrent appending to the same file can cause data (kihwal: rev 853cb704edf54207313c0e70c9c375212d288b60) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend3.java Concurrent appending to the same file can cause data corruption --- Key: HDFS-7203 URL: https://issues.apache.org/jira/browse/HDFS-7203 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Attachments: HDFS-7203.patch When multiple threads are calling append against the same file, the file can get corrupted. The root of the problem is that a stale file stat may be used for append in {{DFSClient}}. If the file size changes between {{getFileStatus()}} and {{namenode.append()}}, {{DataStreamer}} will get confused about how to align data to the checksum boundary and break the assumption made by the data nodes. When this happens, the datanode may not write the last checksum. On the next append attempt, the datanode won't be able to reposition for the partial chunk, since the last checksum is missing. The append will fail after running out of data nodes to copy the partial block to. However, if more threads try to append, this leads to an even more serious situation. In a few minutes, a lease recovery and block recovery will happen. The block recovery truncates the block to the ack'ed size in order to keep only the checksum-verified portion of the data.
The problem is that during the last successful append, the last data node verified the checksum and ack'ed before writing the data and the wrong metadata to disk, and all data nodes in the pipeline wrote the same wrong metadata. So the ack'ed size contains the corrupt portion of the data. Since block recovery does not perform any checksum verification, the file sizes are adjusted, and after {{commitBlockSynchronization()}}, another thread will be allowed to append to the corrupt file. This latent corruption may not be detected for a very long time. The first failing {{append()}} would have created a partial copy of the block in the temporary directory of every data node in the cluster. After this failure, the file is likely under-replicated, so it will be scheduled for replication after being closed. Before HDFS-6948, replication didn't work until a node was added or restarted, because the temporary file was on all data nodes. As a result, the corruption could not be detected by replication. After HDFS-6948, the corruption will be detected after the file is closed by lease recovery or a subsequent append-close. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7220) TestDataNodeMetrics fails in trunk
Ted Yu created HDFS-7220: Summary: TestDataNodeMetrics fails in trunk Key: HDFS-7220 URL: https://issues.apache.org/jira/browse/HDFS-7220 Project: Hadoop HDFS Issue Type: Test Reporter: Ted Yu Priority: Minor From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1896/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeMetrics/testSendDataPacketMetrics/ : {code} java.lang.NoClassDefFoundError: org/apache/hadoop/util/IntrusiveCollection$IntrusiveIterator at org.apache.hadoop.util.IntrusiveCollection.iterator(IntrusiveCollection.java:213) at org.apache.hadoop.util.IntrusiveCollection.clear(IntrusiveCollection.java:368) at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.clearPendingCachingCommands(DatanodeManager.java:1590) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.stopActiveServices(FSNamesystem.java:1262) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.close(FSNamesystem.java:1590) at org.apache.hadoop.hdfs.server.namenode.NameNode.stopCommonServices(NameNode.java:658) at org.apache.hadoop.hdfs.server.namenode.NameNode.stop(NameNode.java:823) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1717) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1696) at org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics.testSendDataPacketMetrics(TestDataNodeMetrics.java:94) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7209) fill the key queue when creating encryption zone
[ https://issues.apache.org/jira/browse/HDFS-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165162#comment-14165162 ] Charles Lamb commented on HDFS-7209: [~hitliuyi], Good idea for this patch. It will at least fill the cache in the case where the EZ is created by the admin in the same NN session. A little nit: I think your IDE added whitespace right before ValueQueue#getSize(), and in KMSClientProvider right before the VisibleForTesting tag. fill the key queue when creating encryption zone Key: HDFS-7209 URL: https://issues.apache.org/jira/browse/HDFS-7209 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption, performance Affects Versions: 2.6.0 Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-7209.001.patch Currently, when creating a file in an encryption zone for the first time, the key provider will get a bunch of keys from the KMS and fill the queue. This takes some time. We can instead initialize the key queue when the admin creates the encryption zone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
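The idea in the patch can be sketched as a simple warm-on-create cache (all names below are illustrative, not the actual ValueQueue/KMSClientProvider API): pre-filling the per-key queue at encryption-zone creation moves the slow KMS round trip out of the first file create.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch of the HDFS-7209 idea: pre-fill a per-key queue of
// encrypted keys when the admin creates the encryption zone, so the first
// file create in the zone does not pay the batch-fetch latency.
public class KeyQueueSketch {
    private final Map<String, Queue<String>> queues = new HashMap<>();
    private int fetches = 0; // how many (slow) batch fetches were performed

    // Stand-in for a round trip to the KMS that returns a batch of keys.
    private Queue<String> fetchBatch(String keyName, int n) {
        fetches++;
        Queue<String> q = new ArrayDeque<>();
        for (int i = 0; i < n; i++) q.add(keyName + "-edek-" + i);
        return q;
    }

    // Called at encryption-zone creation: warm the queue up front.
    public void warmUp(String keyName, int n) {
        queues.computeIfAbsent(keyName, k -> fetchBatch(keyName, n));
    }

    // Called on file create: fast path if the queue was already warmed.
    public String getKey(String keyName) {
        Queue<String> q = queues.computeIfAbsent(keyName, k -> fetchBatch(keyName, 10));
        return q.poll();
    }

    public int getFetches() { return fetches; }
}
```

After warmUp(), the first getKey() is served from the in-memory queue and triggers no additional fetch, which is the latency win the patch is after.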
[jira] [Commented] (HDFS-7209) fill the key queue when creating encryption zone
[ https://issues.apache.org/jira/browse/HDFS-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165210#comment-14165210 ] Hadoop QA commented on HDFS-7209: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12673874/HDFS-7209.001.patch against trunk revision 2a51494. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8375//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8375//artifact/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8375//console This message is automatically generated. 
fill the key queue when creating encryption zone Key: HDFS-7209 URL: https://issues.apache.org/jira/browse/HDFS-7209 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption, performance Affects Versions: 2.6.0 Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-7209.001.patch Currently, when creating a file in an encryption zone for the first time, the key provider will get a bunch of keys from the KMS and fill the queue. This takes some time. We can instead initialize the key queue when the admin creates the encryption zone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165215#comment-14165215 ] Kihwal Lee commented on HDFS-7097: -- I am not sure about calling triggerBlockReports(). Block finalization causes the IBR to be sent out right away, so this extra call will actually do nothing. It's a bug if the NNs are not getting it. But your concern is valid. Depending on the testing environment, things can slow down, and a delay of 1 second may not be enough. I can make it periodically check for a longer period of time. The test will terminate sooner when it succeeds, but will take extra time when it fails. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch On a reasonably busy HDFS cluster, there is a stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report are blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, heartbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The RPC calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow.
Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
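A minimal model of the lock interaction described above, assuming a plain {{ReentrantReadWriteLock}} in place of the real FSNamesystem lock: while the checkpointer holds the read lock, any handler that needs the write lock (e.g. to process a block report) cannot proceed.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Simplified model of the HDFS-7097 contention (not FSNamesystem code):
// the checkpointer takes the namesystem read lock for the whole checkpoint,
// and a block-report handler needs the write lock, so it blocks.
public class CheckpointLockSketch {
    public static final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();

    // True if a block report could acquire the write lock right now.
    public static boolean blockReportCanProceed() {
        // A write lock cannot be acquired while any read lock is held.
        boolean acquired = fsLock.writeLock().tryLock();
        if (acquired) fsLock.writeLock().unlock();
        return acquired;
    }

    public static void main(String[] args) {
        System.out.println(blockReportCanProceed()); // no checkpoint running
        fsLock.readLock().lock();                    // checkpointer starts
        System.out.println(blockReportCanProceed()); // handler would block
        fsLock.readLock().unlock();                  // checkpoint finishes
    }
}
```

With 100 handler threads all parked in that blocked state, the call queue backs up exactly as the description outlines, which is why the proposal lets block reports bypass the lock during checkpointing.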
[jira] [Updated] (HDFS-7218) FSNamesystem ACL operations should write to audit log on failure
[ https://issues.apache.org/jira/browse/HDFS-7218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7218: --- Target Version/s: 2.7.0 (was: 2.6.0) FSNamesystem ACL operations should write to audit log on failure Key: HDFS-7218 URL: https://issues.apache.org/jira/browse/HDFS-7218 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-7218.001.patch, HDFS-7218.002.patch Various Acl methods in FSNamesystem do not write to the audit log when the operation is not successful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7202) Should be able to omit package name of SpanReceiver on hadoop trace -add
[ https://issues.apache.org/jira/browse/HDFS-7202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165232#comment-14165232 ] Hudson commented on HDFS-7202: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1921 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1921/]) HDFS-7202. Should be able to omit package name of SpanReceiver on hadoop trace -add (iwasakims via cmccabe) (cmccabe: rev d996235285e5047f731e3d3fc4c6e6214caa10aa) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/tracing/TestTraceAdmin.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/tracing/SpanReceiverHost.java Should be able to omit package name of SpanReceiver on hadoop trace -add -- Key: HDFS-7202 URL: https://issues.apache.org/jira/browse/HDFS-7202 Project: Hadoop HDFS Issue Type: Bug Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Fix For: 2.7.0 Attachments: HDFS-7202-0.patch This is not consistent with the configuration from file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7203) Concurrent appending to the same file can cause data corruption
[ https://issues.apache.org/jira/browse/HDFS-7203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165233#comment-14165233 ] Hudson commented on HDFS-7203: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1921 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1921/]) HDFS-7203. Concurrent appending to the same file can cause data (kihwal: rev 853cb704edf54207313c0e70c9c375212d288b60) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend3.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Concurrent appending to the same file can cause data corruption --- Key: HDFS-7203 URL: https://issues.apache.org/jira/browse/HDFS-7203 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Attachments: HDFS-7203.patch When multiple threads are calling append against the same file, the file can get corrupted. The root of the problem is that a stale file stat may be used for append in {{DFSClient}}. If the file size changes between {{getFileStatus()}} and {{namenode.append()}}, {{DataStreamer}} will get confused about how to align data to the checksum boundary and break the assumption made by the data nodes. When this happens, the datanode may not write the last checksum. On the next append attempt, the datanode won't be able to reposition for the partial chunk, since the last checksum is missing. The append will fail after running out of data nodes to copy the partial block to. However, if more threads try to append, this leads to an even more serious situation. In a few minutes, a lease recovery and block recovery will happen. The block recovery truncates the block to the ack'ed size in order to keep only the checksum-verified portion of the data.
The problem is that during the last successful append, the last data node verified the checksum and ack'ed before writing the data and the wrong metadata to disk, and all data nodes in the pipeline wrote the same wrong metadata. So the ack'ed size contains the corrupt portion of the data. Since block recovery does not perform any checksum verification, the file sizes are adjusted, and after {{commitBlockSynchronization()}}, another thread will be allowed to append to the corrupt file. This latent corruption may not be detected for a very long time. The first failing {{append()}} would have created a partial copy of the block in the temporary directory of every data node in the cluster. After this failure, the file is likely under-replicated, so it will be scheduled for replication after being closed. Before HDFS-6948, replication didn't work until a node was added or restarted, because the temporary file was on all data nodes. As a result, the corruption could not be detected by replication. After HDFS-6948, the corruption will be detected after the file is closed by lease recovery or a subsequent append-close. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7221) TestDNFencingWithReplication fails consistently
Charles Lamb created HDFS-7221: -- Summary: TestDNFencingWithReplication fails consistently Key: HDFS-7221 URL: https://issues.apache.org/jira/browse/HDFS-7221 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor TestDNFencingWithReplication consistently fails with a timeout, both in jenkins runs and on my local machine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7221) TestDNFencingWithReplication fails consistently
[ https://issues.apache.org/jira/browse/HDFS-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7221: --- Attachment: HDFS-7221.001.patch A small bump in the timeout period in the test makes it pass consistently on my local machine. We can see if the jenkins runs agree. TestDNFencingWithReplication fails consistently --- Key: HDFS-7221 URL: https://issues.apache.org/jira/browse/HDFS-7221 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-7221.001.patch TestDNFencingWithReplication consistently fails with a timeout, both in jenkins runs and on my local machine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7221) TestDNFencingWithReplication fails consistently
[ https://issues.apache.org/jira/browse/HDFS-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7221: --- Status: Patch Available (was: Open) TestDNFencingWithReplication fails consistently --- Key: HDFS-7221 URL: https://issues.apache.org/jira/browse/HDFS-7221 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-7221.001.patch TestDNFencingWithReplication consistently fails with a timeout, both in jenkins runs and on my local machine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-7097: - Attachment: HDFS-7097.patch Made the change in the test case. It now checks the result every 100ms for up to 5 seconds. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch On a reasonably busy HDFS cluster, there is a stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report are blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, heartbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The RPC calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
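The polling change described above (check every 100 ms, give up after 5 seconds, stop early on success) follows a common test pattern; a generic sketch (not the actual test code; Hadoop's GenericTestUtils.waitFor plays a similar role) looks like:

```java
import java.util.function.BooleanSupplier;

// Generic poll-until-true helper, sketching the retry pattern from the
// comment above: re-check a condition on a fixed interval, succeed as soon
// as it holds, and fail only after the overall timeout elapses.
public class PollUntil {
    public static boolean waitFor(BooleanSupplier check, long intervalMs, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!check.getAsBoolean()) {       // success: return immediately
            if (System.currentTimeMillis() >= deadline) {
                return false;                 // condition never held in time
            }
            try {
                Thread.sleep(intervalMs);     // wait before re-checking
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

A test built on this terminates quickly when the condition holds early, and only pays the full timeout in the failure case, which is the tradeoff Kihwal describes.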
[jira] [Commented] (HDFS-7217) Better batching of IBRs
[ https://issues.apache.org/jira/browse/HDFS-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165313#comment-14165313 ] Kihwal Lee commented on HDFS-7217: -- The precommit testing actually stopped at {{TestInterDatanodeProtocol}}, which runs without any problem in several of my environments. And the release audit warning is, of course, bogus; it was against a git-internal bookkeeping file. I think some build slaves are having problems. Better batching of IBRs --- Key: HDFS-7217 URL: https://issues.apache.org/jira/browse/HDFS-7217 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Attachments: HDFS-7217.patch After HDFS-2691 (pipeline recovery in HA), the number of IBRs (incremental block reports) has doubled. Since processing an IBR requires the exclusive FSNamesystem write lock, this can be a source of significant overhead on clusters with a high write load. On one of the busy clusters, we have observed 60 to 70 percent of the available handlers being constantly occupied by IBRs. This greatly degrades throughput compared to 0.23. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
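The batching under discussion can be sketched as follows (illustrative only, not the actual datanode reporting code): block events accumulate in a queue and are flushed as one RPC per reporting interval, instead of one RPC (and one FSNamesystem write-lock acquisition) per event.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of batched incremental block reports (IBRs):
// instead of sending one report RPC per block event, events are queued
// and flushed together, amortizing the namenode's write-lock cost.
public class IbrBatcher {
    private final List<String> pending = new ArrayList<>();
    private int rpcs = 0; // number of report RPCs actually sent

    // Called for each block event (e.g. block received/finalized).
    public synchronized void blockReceived(String blockId) {
        pending.add(blockId); // queue instead of reporting immediately
    }

    // Called once per reporting interval: send everything queued so far.
    public synchronized List<String> flush() {
        if (!pending.isEmpty()) rpcs++; // one RPC covers the whole batch
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        return batch;
    }

    public synchronized int getRpcCount() { return rpcs; }
}
```

In this model, N block events between flushes cost one RPC rather than N, which is the handler-occupancy reduction the issue is after.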
[jira] [Commented] (HDFS-7217) Better batching of IBRs
[ https://issues.apache.org/jira/browse/HDFS-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165323#comment-14165323 ] Daryn Sharp commented on HDFS-7217: --- +1 Definitely a needed change. If possible, it would be nice for a test to ensure the receiving block IBRs are batched, although that's probably very difficult and may warrant a separate jira. Better batching of IBRs --- Key: HDFS-7217 URL: https://issues.apache.org/jira/browse/HDFS-7217 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Attachments: HDFS-7217.patch After HDFS-2691 (pipeline recovery in HA), the number of IBRs (incremental block reports) has doubled. Since processing an IBR requires the exclusive FSNamesystem write lock, this can be a source of significant overhead on clusters with a high write load. On one of the busy clusters, we have observed 60 to 70 percent of the available handlers being constantly occupied by IBRs. This greatly degrades throughput compared to 0.23. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7195) Update user doc of secure mode about Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-7195: Status: Open (was: Patch Available) Update user doc of secure mode about Datanodes don't require root or jsvc - Key: HDFS-7195 URL: https://issues.apache.org/jira/browse/HDFS-7195 Project: Hadoop HDFS Issue Type: Task Components: documentation, security Reporter: Yi Liu Assignee: Chris Nauroth Attachments: HDFS-7195-branch-2.2.patch, HDFS-7195.1.patch, hadoop-site.tar.bz2 HDFS-2856 adds support so that Datanodes don't require root or jsvc. If {{dfs.data.transfer.protection}} is configured and {{dfs.http.policy}} is _HTTPS_ONLY_, then a secure DataNode doesn't need to use a privileged port. This has not been updated in the latest user doc of secure mode. This JIRA is to fix that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7195) Update user doc of secure mode about Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-7195: Attachment: HDFS-7195-branch-2.2.patch Update user doc of secure mode about Datanodes don't require root or jsvc - Key: HDFS-7195 URL: https://issues.apache.org/jira/browse/HDFS-7195 Project: Hadoop HDFS Issue Type: Task Components: documentation, security Reporter: Yi Liu Assignee: Chris Nauroth Attachments: HDFS-7195-branch-2.2.patch, HDFS-7195.1.patch, hadoop-site.tar.bz2 HDFS-2856 adds support so that Datanodes don't require root or jsvc. If {{dfs.data.transfer.protection}} is configured and {{dfs.http.policy}} is _HTTPS_ONLY_, then a secure DataNode doesn't need to use a privileged port. This has not been updated in the latest user doc of secure mode. This JIRA is to fix that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7195) Update user doc of secure mode about Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-7195: Status: Patch Available (was: Open) Update user doc of secure mode about Datanodes don't require root or jsvc - Key: HDFS-7195 URL: https://issues.apache.org/jira/browse/HDFS-7195 Project: Hadoop HDFS Issue Type: Task Components: documentation, security Reporter: Yi Liu Assignee: Chris Nauroth Attachments: HDFS-7195-branch-2.2.patch, HDFS-7195-trunk.2.patch, HDFS-7195.1.patch, hadoop-site.tar.bz2 HDFS-2856 adds support so that Datanodes don't require root or jsvc. If {{dfs.data.transfer.protection}} is configured and {{dfs.http.policy}} is _HTTPS_ONLY_, then a secure DataNode doesn't need to use a privileged port. This has not been updated in the latest user doc of secure mode. This JIRA is to fix that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7195) Update user doc of secure mode about Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-7195: Attachment: HDFS-7195-trunk.2.patch That's a good idea, Yi. Here is a new patch including updates for the comments in hadoop-env.sh. This now requires separate patches for trunk and branch-2. Update user doc of secure mode about Datanodes don't require root or jsvc - Key: HDFS-7195 URL: https://issues.apache.org/jira/browse/HDFS-7195 Project: Hadoop HDFS Issue Type: Task Components: documentation, security Reporter: Yi Liu Assignee: Chris Nauroth Attachments: HDFS-7195-branch-2.2.patch, HDFS-7195-trunk.2.patch, HDFS-7195.1.patch, hadoop-site.tar.bz2 HDFS-2856 adds support so that Datanodes don't require root or jsvc. If {{dfs.data.transfer.protection}} is configured and {{dfs.http.policy}} is _HTTPS_ONLY_, then a secure DataNode doesn't need to use a privileged port. This has not been updated in the latest user doc of secure mode. This JIRA is to fix that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7218) FSNamesystem ACL operations should write to audit log on failure
[ https://issues.apache.org/jira/browse/HDFS-7218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165331#comment-14165331 ] Hadoop QA commented on HDFS-7218: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12673893/HDFS-7218.002.patch against trunk revision 2a51494. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.TestMultipleNNDataBlockScanner org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication org.apache.hadoop.hdfs.server.balancer.TestBalancer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8376//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8376//artifact/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8376//console This message is automatically generated. 
FSNamesystem ACL operations should write to audit log on failure Key: HDFS-7218 URL: https://issues.apache.org/jira/browse/HDFS-7218 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-7218.001.patch, HDFS-7218.002.patch Various Acl methods in FSNamesystem do not write to the audit log when the operation is not successful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165342#comment-14165342 ] Ming Ma commented on HDFS-7097: --- Thanks, Kihwal. LGTM. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch On a reasonably busy HDFS cluster, there is a stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report are blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, heartbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The RPC calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7218) FSNamesystem ACL operations should write to audit log on failure
[ https://issues.apache.org/jira/browse/HDFS-7218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165355#comment-14165355 ] Charles Lamb commented on HDFS-7218: TestMultipleNNDataBlockScanner and TestBalancer pass on my local machine with the patch applied. TestDNFencingWithReplication is a known test problem (HDFS-7221). FSNamesystem ACL operations should write to audit log on failure Key: HDFS-7218 URL: https://issues.apache.org/jira/browse/HDFS-7218 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-7218.001.patch, HDFS-7218.002.patch Various Acl methods in FSNamesystem do not write to the audit log when the operation is not successful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7090) Use unbuffered writes when persisting in-memory replicas
[ https://issues.apache.org/jira/browse/HDFS-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-7090: - Fix Version/s: 3.0.0 Affects Version/s: (was: HDFS-6581) 2.6.0 Status: Patch Available (was: In Progress) Use unbuffered writes when persisting in-memory replicas Key: HDFS-7090 URL: https://issues.apache.org/jira/browse/HDFS-7090 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.6.0 Reporter: Arpit Agarwal Assignee: Xiaoyu Yao Fix For: 3.0.0 Attachments: HDFS-7090.0.patch The LazyWriter thread just uses {{FileUtils.copyFile}} to copy block files to persistent storage. It would be better to use unbuffered writes to avoid churning page cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
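As a hedged sketch only: true unbuffered I/O needs native code (e.g. O_DIRECT or posix_fadvise), which plain Java cannot express, so the illustrative {{UnbufferedishCopy}} below merely copies in bounded chunks and syncs each chunk so dirty pages are written out promptly instead of accumulating in the page cache. It is an approximation of the idea, not the actual HDFS-7090 patch:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Approximation of cache-friendly replica copying: bounded chunks, with an
// explicit sync per chunk to keep the dirty page set small.
public class UnbufferedishCopy {
    static void copy(Path src, Path dst) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(4 * 1024 * 1024); // 4 MB chunks
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            while (in.read(buf) > 0) {
                buf.flip();
                while (buf.hasRemaining()) {
                    out.write(buf);
                }
                out.force(false); // flush data now rather than letting it pile up
                buf.clear();
            }
        }
    }
}
```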
[jira] [Commented] (HDFS-7217) Better batching of IBRs
[ https://issues.apache.org/jira/browse/HDFS-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165372#comment-14165372 ] Kihwal Lee commented on HDFS-7217: -- bq. If possible, it would be nice for a test to ensure the receiving block IBRs are batched although it's probably very difficult and may warrant a separate jira. I've manually verified it. Timing and batching aside, the correctness is covered by TestPipelinesFailover. E.g. if a receiving IBR is queued and delayed like deleted blocks or simply gets dropped, this test case fails. Better batching of IBRs --- Key: HDFS-7217 URL: https://issues.apache.org/jira/browse/HDFS-7217 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Attachments: HDFS-7217.patch After HDFS-2691 (pipeline recovery in HA), the number of IBRs (incremental block reports) has doubled. Since processing an IBR requires an exclusive FSNamesystem write lock, this can be a source of significant overhead on clusters with high write load. On one of the busy clusters, we have observed 60 to 70 percent of available handlers being constantly occupied by IBRs. This degrades throughput greatly when compared to 0.23. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
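The batching idea can be illustrated with a small sketch: block events queue up locally on the datanode side, and a single RPC reports all of them at the next heartbeat. The names here ({{PendingIbr}}, {{notifyBlock}}, {{flush}}) are hypothetical, not the actual BPServiceActor/BPOfferService API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of IBR batching: instead of one RPC per
// received/deleted block, pending notifications accumulate and are sent
// to the NameNode in one incremental block report.
public class PendingIbr {
    private final List<String> pending = new ArrayList<>();
    private int rpcsSent = 0;

    public synchronized void notifyBlock(String blockId) {
        pending.add(blockId); // just queue it; no RPC yet
    }

    // Called from the heartbeat loop: one RPC carries everything queued.
    public synchronized int flush() {
        if (!pending.isEmpty()) {
            rpcsSent++;       // one blockReceivedAndDeleted-style RPC
            pending.clear();
        }
        return rpcsSent;
    }

    public static void main(String[] args) {
        PendingIbr ibr = new PendingIbr();
        ibr.notifyBlock("blk_1");
        ibr.notifyBlock("blk_2");
        ibr.notifyBlock("blk_3");
        System.out.println(ibr.flush()); // 1 RPC for 3 block events
    }
}
```

Fewer RPCs means fewer acquisitions of the exclusive FSNamesystem write lock, which is the overhead the description measures at 60 to 70 percent of handler capacity.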
[jira] [Assigned] (HDFS-7217) Better batching of IBRs
[ https://issues.apache.org/jira/browse/HDFS-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee reassigned HDFS-7217: Assignee: Kihwal Lee Better batching of IBRs --- Key: HDFS-7217 URL: https://issues.apache.org/jira/browse/HDFS-7217 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HDFS-7217.patch After HDFS-2691 (pipeline recovery in HA), the number of IBRs (incremental block reports) has doubled. Since processing an IBR requires an exclusive FSNamesystem write lock, this can be a source of significant overhead on clusters with high write load. On one of the busy clusters, we have observed 60 to 70 percent of available handlers being constantly occupied by IBRs. This degrades throughput greatly when compared to 0.23. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7217) Better batching of IBRs
[ https://issues.apache.org/jira/browse/HDFS-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165381#comment-14165381 ] Hudson commented on HDFS-7217: -- FAILURE: Integrated in Hadoop-trunk-Commit #6221 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6221/]) HDFS-7217. Better batching of IBRs. Contributed by Kihwal Lee. (kihwal: rev db71bb54bcc75b71c5841b25ceb03fb0218c6d4f) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestIncrementalBlockReports.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java Better batching of IBRs --- Key: HDFS-7217 URL: https://issues.apache.org/jira/browse/HDFS-7217 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HDFS-7217.patch After HDFS-2691 (pipeline recovery in HA), the number of IBRs (incremental block reports) has doubled. Since processing an IBR requires an exclusive FSNamesystem write lock, this can be a source of significant overhead on clusters with high write load. On one of the busy clusters, we have observed 60 to 70 percent of available handlers being constantly occupied by IBRs. This degrades throughput greatly when compared to 0.23. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7217) Better batching of IBRs
[ https://issues.apache.org/jira/browse/HDFS-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165385#comment-14165385 ] Kihwal Lee commented on HDFS-7217: -- Thanks for the review, Daryn. I've committed this to trunk, branch-2 and branch-2.6. Better batching of IBRs --- Key: HDFS-7217 URL: https://issues.apache.org/jira/browse/HDFS-7217 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HDFS-7217.patch After HDFS-2691 (pipeline recovery in HA), the number of IBRs (incremental block reports) has doubled. Since processing an IBR requires an exclusive FSNamesystem write lock, this can be a source of significant overhead on clusters with high write load. On one of the busy clusters, we have observed 60 to 70 percent of available handlers being constantly occupied by IBRs. This degrades throughput greatly when compared to 0.23. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7217) Better batching of IBRs
[ https://issues.apache.org/jira/browse/HDFS-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-7217: - Resolution: Fixed Fix Version/s: 2.6.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Better batching of IBRs --- Key: HDFS-7217 URL: https://issues.apache.org/jira/browse/HDFS-7217 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.6.0 Attachments: HDFS-7217.patch After HDFS-2691 (pipeline recovery in HA), the number of IBRs (incremental block reports) has doubled. Since processing an IBR requires an exclusive FSNamesystem write lock, this can be a source of significant overhead on clusters with high write load. On one of the busy clusters, we have observed 60 to 70 percent of available handlers being constantly occupied by IBRs. This degrades throughput greatly when compared to 0.23. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7217) Better batching of IBRs
[ https://issues.apache.org/jira/browse/HDFS-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165387#comment-14165387 ] Yongjun Zhang commented on HDFS-7217: - Hi [~kihwal], thanks for addressing my question. I think it's a good change. Better batching of IBRs --- Key: HDFS-7217 URL: https://issues.apache.org/jira/browse/HDFS-7217 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.6.0 Attachments: HDFS-7217.patch After HDFS-2691 (pipeline recovery in HA), the number of IBRs (incremental block reports) has doubled. Since processing an IBR requires an exclusive FSNamesystem write lock, this can be a source of significant overhead on clusters with high write load. On one of the busy clusters, we have observed 60 to 70 percent of available handlers being constantly occupied by IBRs. This degrades throughput greatly when compared to 0.23. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7174) Support for more efficient large directories
[ https://issues.apache.org/jira/browse/HDFS-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165389#comment-14165389 ] Kihwal Lee commented on HDFS-7174: -- bq. As Yi Liu pointed out, the current patch has a problem. If we go back and forth between switchingThreshold (say, by repeatedly adding and removing a single element to a directory), we pay a very high cost. To prevent this, the threshold for converting a INodeHashedArrayList back to a simple INodeArrayList should be lower than the threshold for doing the opposite conversion. There is low watermark, with is 90% of the conversion threshold. So it won't flip back and forth like that. Support for more efficient large directories Key: HDFS-7174 URL: https://issues.apache.org/jira/browse/HDFS-7174 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7174.new.patch, HDFS-7174.patch, HDFS-7174.patch When the number of children under a directory grows very large, insertion becomes very costly. E.g. creating 1M entries takes 10s of minutes. This is because the complexity of an insertion is O\(n\). As the size of a list grows, the overhead grows n^2. (integral of linear function). It also causes allocations and copies of big arrays. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-7174) Support for more efficient large directories
[ https://issues.apache.org/jira/browse/HDFS-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165389#comment-14165389 ] Kihwal Lee edited comment on HDFS-7174 at 10/9/14 5:16 PM: --- bq. As Yi Liu pointed out, the current patch has a problem. If we go back and forth between switchingThreshold (say, by repeatedly adding and removing a single element to a directory), we pay a very high cost. To prevent this, the threshold for converting a INodeHashedArrayList back to a simple INodeArrayList should be lower than the threshold for doing the opposite conversion. There is a low watermark, with is 90% of the conversion threshold. So it won't flip back and forth like that. was (Author: kihwal): bq. As Yi Liu pointed out, the current patch has a problem. If we go back and forth between switchingThreshold (say, by repeatedly adding and removing a single element to a directory), we pay a very high cost. To prevent this, the threshold for converting a INodeHashedArrayList back to a simple INodeArrayList should be lower than the threshold for doing the opposite conversion. There is low watermark, with is 90% of the conversion threshold. So it won't flip back and forth like that. Support for more efficient large directories Key: HDFS-7174 URL: https://issues.apache.org/jira/browse/HDFS-7174 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7174.new.patch, HDFS-7174.patch, HDFS-7174.patch When the number of children under a directory grows very large, insertion becomes very costly. E.g. creating 1M entries takes 10s of minutes. This is because the complexity of an insertion is O\(n\). As the size of a list grows, the overhead grows n^2. (integral of linear function). It also causes allocations and copies of big arrays. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-7174) Support for more efficient large directories
[ https://issues.apache.org/jira/browse/HDFS-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165389#comment-14165389 ] Kihwal Lee edited comment on HDFS-7174 at 10/9/14 5:18 PM: --- bq. As Yi Liu pointed out, the current patch has a problem. If we go back and forth between switchingThreshold (say, by repeatedly adding and removing a single element to a directory), we pay a very high cost. To prevent this, the threshold for converting a INodeHashedArrayList back to a simple INodeArrayList should be lower than the threshold for doing the opposite conversion. There is a low watermark, which is 90% of the conversion threshold. So it won't flip back and forth like that. was (Author: kihwal): bq. As Yi Liu pointed out, the current patch has a problem. If we go back and forth between switchingThreshold (say, by repeatedly adding and removing a single element to a directory), we pay a very high cost. To prevent this, the threshold for converting a INodeHashedArrayList back to a simple INodeArrayList should be lower than the threshold for doing the opposite conversion. There is a low watermark, with is 90% of the conversion threshold. So it won't flip back and forth like that. Support for more efficient large directories Key: HDFS-7174 URL: https://issues.apache.org/jira/browse/HDFS-7174 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7174.new.patch, HDFS-7174.patch, HDFS-7174.patch When the number of children under a directory grows very large, insertion becomes very costly. E.g. creating 1M entries takes 10s of minutes. This is because the complexity of an insertion is O\(n\). As the size of a list grows, the overhead grows n^2. (integral of linear function). It also causes allocations and copies of big arrays. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
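The low-watermark behavior Kihwal describes is a standard hysteresis pattern: convert up at the threshold, convert back only well below it, so a single add/remove near the boundary cannot flip the representation back and forth. A minimal sketch, with made-up threshold values rather than the ones in the patch:

```java
// Hypothetical sketch of the hysteresis in HDFS-7174: a directory's child
// list converts to a hashed structure at THRESHOLD children, but only
// converts back once it shrinks below the low watermark (90% of THRESHOLD).
public class ChildListHysteresis {
    static final int THRESHOLD = 1000;                       // illustrative value
    static final int LOW_WATERMARK = (int) (THRESHOLD * 0.9); // 900

    static boolean shouldUseHashedList(int size, boolean currentlyHashed) {
        if (currentlyHashed) {
            return size >= LOW_WATERMARK; // stay hashed until well below threshold
        }
        return size >= THRESHOLD;         // convert only once the threshold is hit
    }

    public static void main(String[] args) {
        // Oscillating around the threshold by one element:
        System.out.println(shouldUseHashedList(1000, false)); // true: convert
        System.out.println(shouldUseHashedList(999, true));   // true: no flip back
        System.out.println(shouldUseHashedList(899, true));   // false: convert back
    }
}
```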
[jira] [Commented] (HDFS-7195) Update user doc of secure mode about Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165401#comment-14165401 ] Hadoop QA commented on HDFS-7195: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12673927/HDFS-7195-trunk.2.patch against trunk revision 8d7c549. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.crypto.random.TestOsSecureRandom {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8380//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8380//artifact/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8380//console This message is automatically generated. 
Update user doc of secure mode about Datanodes don't require root or jsvc - Key: HDFS-7195 URL: https://issues.apache.org/jira/browse/HDFS-7195 Project: Hadoop HDFS Issue Type: Task Components: documentation, security Reporter: Yi Liu Assignee: Chris Nauroth Attachments: HDFS-7195-branch-2.2.patch, HDFS-7195-trunk.2.patch, HDFS-7195.1.patch, hadoop-site.tar.bz2 HDFS-2856 adds support so that DataNodes don't require root or jsvc. If {{dfs.data.transfer.protection}} is configured and {{dfs.http.policy}} is _HTTPS_ONLY_, then a secure DataNode doesn't need to use a privileged port. This has not been updated in the latest user doc of secure mode. This JIRA is to fix that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-7190) Bad use of Preconditions in startFileInternal()
[ https://issues.apache.org/jira/browse/HDFS-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawson Choong reassigned HDFS-7190: --- Assignee: Dawson Choong Bad use of Preconditions in startFileInternal() --- Key: HDFS-7190 URL: https://issues.apache.org/jira/browse/HDFS-7190 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Assignee: Dawson Choong Labels: newbie The following precondition is in the middle of startFileInternal() {code} feInfo = new FileEncryptionInfo(suite, version, ...); Preconditions.checkNotNull(feInfo); {code} Preconditions are recommended to be used at the beginning of the method. In this case the check is a no-op anyway, because the variable has just been constructed. Should be just removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-7201) Fix typos in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawson Choong reassigned HDFS-7201: --- Assignee: Dawson Choong Fix typos in hdfs-default.xml - Key: HDFS-7201 URL: https://issues.apache.org/jira/browse/HDFS-7201 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.1 Reporter: Konstantin Shvachko Assignee: Dawson Choong Labels: newbie Found the following typos in hdfs-default.xml: repliaction directoires teh tranfer spage -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7175) Client-side SocketTimeoutException during Fsck
[ https://issues.apache.org/jira/browse/HDFS-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165434#comment-14165434 ] Allen Wittenauer commented on HDFS-7175: bq. could we at least consider making the number of files a configurable option (with a reasonable default value of course) as a feature... Probably better to handle that as a separate JIRA given that there will likely be lots of discussion around options, etc. Plus that is a feature request whereas the current code here is all bug fix. Client-side SocketTimeoutException during Fsck -- Key: HDFS-7175 URL: https://issues.apache.org/jira/browse/HDFS-7175 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Carl Steinbach Assignee: Akira AJISAKA Attachments: HDFS-7175.2.patch, HDFS-7175.3.patch, HDFS-7175.patch, HDFS-7175.patch HDFS-2538 disabled status reporting for the fsck command (it can optionally be enabled with the -showprogress option). We have observed that without status reporting the client will abort with read timeout: {noformat} [hdfs@lva1-hcl0030 ~]$ hdfs fsck / Connecting to namenode via http://lva1-tarocknn01.grid.linkedin.com:50070 14/09/30 06:03:41 WARN security.UserGroupInformation: PriviledgedActionException as:h...@grid.linkedin.com (auth:KERBEROS) cause:java.net.SocketTimeoutException: Read timed out Exception in thread "main" java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:152) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633) at 
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323) at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:312) at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:72) at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:149) at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:146) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:145) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:346) {noformat} Since there's nothing for the client to read it will abort if the time required to complete the fsck operation is longer than the client's read timeout setting. I can think of a couple ways to fix this: # Set an infinite read timeout on the client side (not a good idea!). # Have the server-side write (and flush) zeros to the wire and instruct the client to ignore these characters instead of echoing them. # It's possible that flushing an empty buffer on the server-side will trigger an HTTP response with a zero length payload. This may be enough to keep the client from hanging up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
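On the client side, the failure mode above comes down to an HTTP read timeout: the NameNode streams nothing until the check finishes, so the read blocks until the socket timeout fires. A hedged sketch with a hypothetical host, port, and timeout value — DFSck's real code path differs:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Illustrative client-side setup for the timeout described above. If the
// server sends no bytes for longer than the read timeout, reading the
// response raises java.net.SocketTimeoutException ("Read timed out"),
// matching the stack trace quoted in the report. Options 2 and 3 in the
// description work by keeping bytes flowing so this never triggers.
public class FsckClientSketch {
    static HttpURLConnection open(String fsckUrl, int readTimeoutMs) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(fsckUrl).openConnection();
        conn.setReadTimeout(readTimeoutMs); // silence past this -> SocketTimeoutException
        return conn;
    }

    public static void main(String[] args) throws IOException {
        HttpURLConnection conn =
            open("http://namenode.example.com:50070/fsck?path=/", 60_000);
        System.out.println(conn.getReadTimeout()); // 60000
        // conn.getInputStream() would block here, and time out, if the
        // server stayed silent for the whole fsck run.
    }
}
```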
[jira] [Updated] (HDFS-6673) Add Delimited format supports for PB OIV tool
[ https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6673: Attachment: HDFS-6673.003.patch Added a parameter to specify a temporary file/dir path for leveldb storage, which is used for storing intermediate results loaded from the PB fsimage. Add Delimited format supports for PB OIV tool - Key: HDFS-6673 URL: https://issues.apache.org/jira/browse/HDFS-6673 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, HDFS-6673.002.patch, HDFS-6673.003.patch The new oiv tool, which is designed for the Protobuf fsimage, lacks a few features supported in the old {{oiv}} tool. This task adds support for the _Delimited_ processor to the oiv tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7219) Update Hadoop's lz4 to the latest version
[ https://issues.apache.org/jira/browse/HDFS-7219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165447#comment-14165447 ] Chris Nauroth commented on HDFS-7219: - Thank you for doing this, Colin. It looks good. I verified that this is the same code as the published r123 version. I built successfully and ran {{TestLz4CompressorDecompressor}} on Mac, Linux and Windows. The new version cleans up some compilation warnings that we had been seeing with the prior version. I ran the MR native task tests too. I think I'm seeing some flakiness in those tests, but it's nothing related to this patch. Can we also delete lz4_encoder.h please? It appears r123 no longer uses that header. Update Hadoop's lz4 to the latest version - Key: HDFS-7219 URL: https://issues.apache.org/jira/browse/HDFS-7219 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-7219.001.patch We should update Hadoop's copy of the lz4 compression library to the latest version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7219) Update Hadoop's lz4 to the latest version
[ https://issues.apache.org/jira/browse/HDFS-7219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165453#comment-14165453 ] Chris Nauroth commented on HDFS-7219: - One more very minor note: this probably ought to be a HADOOP jira instead of HDFS. Update Hadoop's lz4 to the latest version - Key: HDFS-7219 URL: https://issues.apache.org/jira/browse/HDFS-7219 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-7219.001.patch We should update Hadoop's copy of the lz4 compression library to the latest version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7221) TestDNFencingWithReplication fails consistently
[ https://issues.apache.org/jira/browse/HDFS-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165494#comment-14165494 ] Yongjun Zhang commented on HDFS-7221: - Hi [~clamb], Thanks for reporting this issue and providing a patch. See my update in https://issues.apache.org/jira/browse/HADOOP-11045, this test failure is at the top. :-) TestDNFencingWithReplication fails consistently --- Key: HDFS-7221 URL: https://issues.apache.org/jira/browse/HDFS-7221 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-7221.001.patch TestDNFencingWithReplication consistently fails with a timeout, both in jenkins runs and on my local machine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7195) Update user doc of secure mode about Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165511#comment-14165511 ] Jing Zhao commented on HDFS-7195: - The patches for trunk and branch-2 both look good to me. +1. Thanks for working on this, Chris and Yi! Update user doc of secure mode about Datanodes don't require root or jsvc - Key: HDFS-7195 URL: https://issues.apache.org/jira/browse/HDFS-7195 Project: Hadoop HDFS Issue Type: Task Components: documentation, security Reporter: Yi Liu Assignee: Chris Nauroth Attachments: HDFS-7195-branch-2.2.patch, HDFS-7195-trunk.2.patch, HDFS-7195.1.patch, hadoop-site.tar.bz2 HDFS-2856 adds support so that DataNodes don't require root or jsvc. If {{dfs.data.transfer.protection}} is configured and {{dfs.http.policy}} is _HTTPS_ONLY_, then a secure DataNode doesn't need to use a privileged port. This has not been updated in the latest user doc of secure mode. This JIRA is to fix that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7195) Update user doc of secure mode about Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165512#comment-14165512 ] Jing Zhao commented on HDFS-7195: - The failed unit test and the release audit warning are both known issues and should be unrelated. Update user doc of secure mode about Datanodes don't require root or jsvc - Key: HDFS-7195 URL: https://issues.apache.org/jira/browse/HDFS-7195 Project: Hadoop HDFS Issue Type: Task Components: documentation, security Reporter: Yi Liu Assignee: Chris Nauroth Attachments: HDFS-7195-branch-2.2.patch, HDFS-7195-trunk.2.patch, HDFS-7195.1.patch, hadoop-site.tar.bz2 HDFS-2856 adds support so that DataNodes don't require root or jsvc. If {{dfs.data.transfer.protection}} is configured and {{dfs.http.policy}} is _HTTPS_ONLY_, then a secure DataNode doesn't need to use a privileged port. This has not been updated in the latest user doc of secure mode. This JIRA is to fix that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7220) TestDataNodeMetrics fails in trunk
[ https://issues.apache.org/jira/browse/HDFS-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165520#comment-14165520 ] Yongjun Zhang commented on HDFS-7220: - Hi Ted, Using the tool I was advertising in HADOOP-11045 (and you suggested the location to put the tool), here is what I got for this job: {code} [yzhang@localhost jenkinsftf]$ ./determine-flaky-tests-hadoop.py -j Hadoop-Hdfs-trunk -n 14 Recently FAILED builds in url: https://builds.apache.org//job/Hadoop-Hdfs-trunk THERE ARE 2 builds (out of 5) that have failed tests in the past 14 days, as listed below: ===https://builds.apache.org/job/Hadoop-Hdfs-trunk/1896/testReport (2014-10-09 04:30:40) Failed test: org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics.testDataNodeMetrics Failed test: org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics.testRoundTripAckMetric Failed test: org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics.testSendDataPacketMetrics Failed test: org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress Failed test: org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics.testReceivePacketMetrics ===https://builds.apache.org/job/Hadoop-Hdfs-trunk/1895/testReport (2014-10-08 04:30:40) Failed test: org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress Among 5 runs examined, all failed tests #failedRuns: testName: 2: org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress 1: org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics.testReceivePacketMetrics 1: org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics.testSendDataPacketMetrics 1: org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics.testRoundTripAckMetric 1: org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics.testDataNodeMetrics [yzhang@localhost jenkinsftf]$ {code} The failure you reported here is likely introduced by the commits happened in between: 
===https://builds.apache.org/job/Hadoop-Hdfs-trunk/1896/testReport (2014-10-09 04:30:40) ===https://builds.apache.org/job/Hadoop-Hdfs-trunk/1895/testReport (2014-10-08 04:30:40) Thanks. TestDataNodeMetrics fails in trunk -- Key: HDFS-7220 URL: https://issues.apache.org/jira/browse/HDFS-7220 Project: Hadoop HDFS Issue Type: Test Reporter: Ted Yu Priority: Minor From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1896/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeMetrics/testSendDataPacketMetrics/ : {code} java.lang.NoClassDefFoundError: org/apache/hadoop/util/IntrusiveCollection$IntrusiveIterator at org.apache.hadoop.util.IntrusiveCollection.iterator(IntrusiveCollection.java:213) at org.apache.hadoop.util.IntrusiveCollection.clear(IntrusiveCollection.java:368) at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.clearPendingCachingCommands(DatanodeManager.java:1590) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.stopActiveServices(FSNamesystem.java:1262) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.close(FSNamesystem.java:1590) at org.apache.hadoop.hdfs.server.namenode.NameNode.stopCommonServices(NameNode.java:658) at org.apache.hadoop.hdfs.server.namenode.NameNode.stop(NameNode.java:823) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1717) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1696) at org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics.testSendDataPacketMetrics(TestDataNodeMetrics.java:94) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165527#comment-14165527 ] Hadoop QA commented on HDFS-7097: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12673911/HDFS-7097.patch against trunk revision 8d7c549. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1272 javac compiler warnings (more than the trunk's current 1267 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication org.apache.hadoop.hdfs.TestDecommission org.apache.hadoop.hdfs.server.balancer.TestBalancer The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.TestBlockHasMultipleReplicasOTests org.apache.hadoop.hdfs.Tests org.apache.hadoop.hdfs.server.blockmanagement.TestCorruptReplicaInfo org.apache.hadoop.hTests org.apache.hadoopTests org.apache.hadoop.hdfs.server.Tests org.apache.hadoop.hdfs.sTests org.apache.hadooTests org.apache.hadoop.hdfs.qjournal.client.TestIPCLoggerChanTests org.apache.hadoop.traciTests org.apache.hadoop.hdfs.TestFileCreationClient {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. 
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8378//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8378//artifact/patchprocess/patchReleaseAuditProblems.txt Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8378//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8378//console This message is automatically generated. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch On a reasonably busy HDFS cluster, there is a stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report are blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of the name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, heartbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The RPC calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7221) TestDNFencingWithReplication fails consistently
[ https://issues.apache.org/jira/browse/HDFS-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165528#comment-14165528 ] Hadoop QA commented on HDFS-7221: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12673910/HDFS-7221.001.patch against trunk revision 8d7c549. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1272 javac compiler warnings (more than the trunk's current 1267 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.TestBlockHasMultipleReplicasOTests org.apache.hadoop.hdfs.Tests org.apache.hadoop.hdfs.server.blockmanagement.TestCorruptReplicaInfo org.apache.hadoop.hTests org.apache.hadoopTests org.apache.hadoop.hdfs.server.Tests org.apache.hadoop.hdfs.sTests org.apache.hadooTests org.apache.hadoop.hdfs.qjournal.client.TestIPCLoggerChanTests org.apache.hadoop.traciTests org.apache.hadoop.hdfs.TestFileCreationClient {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. 
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8377//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8377//artifact/patchprocess/patchReleaseAuditProblems.txt Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8377//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8377//console This message is automatically generated. TestDNFencingWithReplication fails consistently --- Key: HDFS-7221 URL: https://issues.apache.org/jira/browse/HDFS-7221 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-7221.001.patch TestDNFencingWithReplication consistently fails with a timeout, both in jenkins runs and on my local machine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7207) libhdfs3 should not expose exceptions in public C++ API
[ https://issues.apache.org/jira/browse/HDFS-7207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165531#comment-14165531 ] Colin Patrick McCabe commented on HDFS-7207: bq. 5) If new application wants to use the feature in libhdfs3 but not implemented yet (like encryption), it can use libhdfs or libwebhdfs with C interface and switch to libhdfs3 later. Let's say that my application only uses HDFS, so I feel confident using libhdfs3. I use a libhdfs3-specific C\+\+ interface for my code. Suddenly HDFS adds a new feature (like encryption) that libhdfs3 doesn't support. Now what do I do? I can't switch to a different library because the interface I used was libhdfs3-only. The only reasonable solution is to have a C\+\+ interface that all our libraries support, or to have no C\+\+ interface. I am fine with either one of those solutions. I'm afraid that I am -1 on creating a C\+\+ interface unless we can support that interface into the future. If we tell people that New application \[should\] usually prefer C interface instead of C\+\+ because C interface is more stable (as you have commented) then why have the C\+\+ interface in the first place? We should not add code if we explicitly recommend that people don't use it. It would just be creating a trap for people to fall into. And then those people would show up on the mailing list, complaining when we broke things in an upgrade. And we'd have to revert all the changes. It has happened in the past for other components. This is not a big deal when you have an internal tool (like libhdfs3 originally was inside your organization) and you can coordinate a flag day when everyone switches from one version to another. But in Hadoop we have to support old versions for years. If we break APIs (or even worse, ABIs) across a minor release, people get upset and they ask us to revert the change.
Since the current libhdfs3 C\+\+ interface you have proposed exposes so many internal things, this will make it almost impossible to change anything after a release. In contrast, I would feel comfortable telling people to use the C\+\+ interface I've posted here. We can support it into the future without any additional effort beyond what we already exert to maintain the C API. bq. I do not think it is necessary to add additional C++ interface for libhdfs and libwebhdfs. I am not the biggest C\+\+ fan, but shouldn't people have the right to program in the language they choose? We have a lot of users of HDFS that are C\+\+, starting with Impala and HAWQ. Please take another look and let me know if there's anything I can improve. Right now, the interface I posted seems strictly better than the previous proposal. I think you should seriously consider it and let me know where it falls short for you. libhdfs3 should not expose exceptions in public C++ API --- Key: HDFS-7207 URL: https://issues.apache.org/jira/browse/HDFS-7207 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Colin Patrick McCabe Priority: Blocker Attachments: HDFS-7207.001.patch There are three major disadvantages of exposing exceptions in the public API: * Exposing exceptions in public APIs forces the downstream users to be compiled with {{-fexceptions}}, which might be infeasible in many use cases. * It forces other bindings to properly handle all C++ exceptions, which might be infeasible especially when the binding is generated by tools like SWIG. * It forces the downstream users to properly handle all C++ exceptions, which can be cumbersome as in certain cases it will lead to undefined behavior (e.g., throwing an exception in a destructor is undefined.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7195) Update user doc of secure mode about Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-7195: Issue Type: Improvement (was: Task) Update user doc of secure mode about Datanodes don't require root or jsvc - Key: HDFS-7195 URL: https://issues.apache.org/jira/browse/HDFS-7195 Project: Hadoop HDFS Issue Type: Improvement Components: documentation, security Reporter: Yi Liu Assignee: Chris Nauroth Attachments: HDFS-7195-branch-2.2.patch, HDFS-7195-trunk.2.patch, HDFS-7195.1.patch, hadoop-site.tar.bz2 HDFS-2856 adds support so that DataNodes don't require root or jsvc. If {{dfs.data.transfer.protection}} is configured and {{dfs.http.policy}} is _HTTPS_ONLY_, then a secure DataNode doesn't need to use a privileged port. This has not been updated in the latest user doc of secure mode. This JIRA is to fix that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
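As context for the doc change, the two properties named in the description combine roughly like this in hdfs-site.xml (the value {{authentication}} is illustrative; this is a sketch of the configuration shape, not text from the patch):

```xml
<!-- hdfs-site.xml: sketch of the setup described in HDFS-7195.
     With SASL protection on the data transfer protocol and an
     HTTPS-only HTTP policy, a secure DataNode no longer needs to
     bind privileged ports via root/jsvc. -->
<property>
  <name>dfs.data.transfer.protection</name>
  <value>authentication</value>
</property>
<property>
  <name>dfs.http.policy</name>
  <value>HTTPS_ONLY</value>
</property>
```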
[jira] [Commented] (HDFS-7221) TestDNFencingWithReplication fails consistently
[ https://issues.apache.org/jira/browse/HDFS-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165552#comment-14165552 ] Yongjun Zhang commented on HDFS-7221: - Hi [~clamb], it seems your patch did not resolve the issue. Maybe you can bump the timeout to a very high number and try again, just to see if it still fails, which would guide us in looking into the root cause. Thanks. TestDNFencingWithReplication fails consistently --- Key: HDFS-7221 URL: https://issues.apache.org/jira/browse/HDFS-7221 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-7221.001.patch TestDNFencingWithReplication consistently fails with a timeout, both in jenkins runs and on my local machine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6956) Allow dynamically changing the tracing level in Hadoop servers
[ https://issues.apache.org/jira/browse/HDFS-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6956: -- Fix Version/s: 2.6.0 Allow dynamically changing the tracing level in Hadoop servers -- Key: HDFS-6956 URL: https://issues.apache.org/jira/browse/HDFS-6956 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: 2.6.0 Attachments: HDFS-6956.002.patch, HDFS-6956.003.patch, HDFS-6956.004.patch, HDFS-6956.005.patch We should allow users to dynamically change the tracing level in Hadoop servers. The easiest way to do this is probably to have an RPC accessible only to the superuser that changes tracing settings. This would allow us to turn on and off tracing on the NameNode, DataNode, etc. at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7195) Update user doc of secure mode about Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-7195: Resolution: Fixed Fix Version/s: 2.6.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I committed this to trunk, branch-2 and branch-2.6. Yi and Jing, thank you for the code reviews. Update user doc of secure mode about Datanodes don't require root or jsvc - Key: HDFS-7195 URL: https://issues.apache.org/jira/browse/HDFS-7195 Project: Hadoop HDFS Issue Type: Improvement Components: documentation, security Reporter: Yi Liu Assignee: Chris Nauroth Fix For: 2.6.0 Attachments: HDFS-7195-branch-2.2.patch, HDFS-7195-trunk.2.patch, HDFS-7195.1.patch, hadoop-site.tar.bz2 HDFS-2856 adds support so that DataNodes don't require root or jsvc. If {{dfs.data.transfer.protection}} is configured and {{dfs.http.policy}} is _HTTPS_ONLY_, then a secure DataNode doesn't need to use a privileged port. This has not been updated in the latest user doc of secure mode. This JIRA is to fix that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7026) Introduce a string constant for Failed to obtain user group info...
[ https://issues.apache.org/jira/browse/HDFS-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165566#comment-14165566 ] Yongjun Zhang commented on HDFS-7026: - The audit warning is not introduced by this fix; for example, https://builds.apache.org/job/PreCommit-HDFS-Build/8361//artifact/patchprocess/patchReleaseAuditProblems.txt also has it. Rerunning the timed-out TestHASafeMode succeeded locally. The TestDNFencingWithReplication failure was reported as HDFS-7721 and is not relevant to the change here. Introduce a string constant for Failed to obtain user group info... - Key: HDFS-7026 URL: https://issues.apache.org/jira/browse/HDFS-7026 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Trivial Attachments: HDFS-7206.001.patch There are multiple places that refer to the hard-coded string {{Failed to obtain user group information:}}, which serves as a contract between different places. Filing this jira to replace the hardcoded string with a constant to make it easier to maintain. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7195) Update user doc of secure mode about Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165571#comment-14165571 ] Hudson commented on HDFS-7195: -- FAILURE: Integrated in Hadoop-trunk-Commit #6224 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6224/]) HDFS-7195. Update user doc of secure mode about Datanodes don't require root or jsvc. Contributed by Chris Nauroth. (cnauroth: rev 9097183983cd96ab0fe56b2564d8a63f78b2845c) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-common-project/hadoop-common/src/main/conf/hadoop-env.sh * hadoop-common-project/hadoop-common/src/site/apt/SecureMode.apt.vm Update user doc of secure mode about Datanodes don't require root or jsvc - Key: HDFS-7195 URL: https://issues.apache.org/jira/browse/HDFS-7195 Project: Hadoop HDFS Issue Type: Improvement Components: documentation, security Reporter: Yi Liu Assignee: Chris Nauroth Fix For: 2.6.0 Attachments: HDFS-7195-branch-2.2.patch, HDFS-7195-trunk.2.patch, HDFS-7195.1.patch, hadoop-site.tar.bz2 HDFS-2856 adds support so that DataNodes don't require root or jsvc. If {{dfs.data.transfer.protection}} is configured and {{dfs.http.policy}} is _HTTPS_ONLY_, then a secure DataNode doesn't need to use a privileged port. This has not been updated in the latest user doc of secure mode. This JIRA is to fix that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7026) Introduce a string constant for Failed to obtain user group info...
[ https://issues.apache.org/jira/browse/HDFS-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165573#comment-14165573 ] Yongjun Zhang commented on HDFS-7026: - Sorry, typo: it's HDFS-7221 instead of HDFS-7721. Introduce a string constant for Failed to obtain user group info... - Key: HDFS-7026 URL: https://issues.apache.org/jira/browse/HDFS-7026 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Trivial Attachments: HDFS-7206.001.patch There are multiple places that refer to the hard-coded string {{Failed to obtain user group information:}}, which serves as a contract between different places. Filing this jira to replace the hardcoded string with a constant to make it easier to maintain. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7221) TestDNFencingWithReplication fails consistently
[ https://issues.apache.org/jira/browse/HDFS-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7221: --- Attachment: HDFS-7221.002.patch [~yzhangal], I bumped it to 5 mins. TestDNFencingWithReplication fails consistently --- Key: HDFS-7221 URL: https://issues.apache.org/jira/browse/HDFS-7221 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-7221.001.patch, HDFS-7221.002.patch TestDNFencingWithReplication consistently fails with a timeout, both in jenkins runs and on my local machine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165623#comment-14165623 ] Aaron T. Myers commented on HDFS-7097: -- The patch looks pretty good, and in thinking about it a fair bit I think it won't regress the issue I was trying to address in HDFS-5064, though, Kihwal, I would appreciate it if you could confirm that as well. A few small comments: # Does {{FSNamesystem#rollEditLog}} need to take the nsLock as well? Seems like it might, given that tailing edits is no longer taking the normal FSNS rw lock. # Similarly for {{FSNamesystem#(start|end)Checkpoint}}, though that's less obvious to me. # Seems a little strange to me to be calling this new lock the nsLock, when that's also what we've been calling the main FSNS rw lock all this time. I'd suggest renaming this to the checkpoint lock or something, to more clearly distinguish its purpose. # I think you can now remove some of the other stuff added as part of HDFS-5064, e.g. the entire {{longReadLock}}, which I believe was only actually being locked for read during checkpointing. Thanks a lot for working on this, Kihwal. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch On a reasonably busy HDFS cluster, there is a stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report are blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of the name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. 
If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, heartbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The RPC calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
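The failure mode in the description boils down to read/write-lock exclusion. A minimal sketch (a hypothetical demo class, not HDFS code): while a "checkpointer" thread holds the read lock, a "handler" thread that needs the write lock cannot acquire it.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class CheckpointLockDemo {
    public static void main(String[] args) throws InterruptedException {
        ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();

        // "Checkpointer": holds the read lock for the whole (long) checkpoint.
        fsLock.readLock().lock();

        // "RPC handler": a block report needs the write lock, and stays
        // blocked for as long as the checkpointer holds the read lock.
        AtomicBoolean handlerGotLock = new AtomicBoolean(true);
        Thread handler = new Thread(() -> {
            try {
                handlerGotLock.set(
                    fsLock.writeLock().tryLock(200, TimeUnit.MILLISECONDS));
            } catch (InterruptedException ignored) { }
        });
        handler.start();
        handler.join();
        fsLock.readLock().unlock();

        System.out.println("handler got write lock: " + handlerGotLock.get());
    }
}
```

With 100 handler threads all parked like this, the call queue fills exactly as the description says; the patch's idea is to let block reports bypass this lock during checkpointing.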
[jira] [Commented] (HDFS-7090) Use unbuffered writes when persisting in-memory replicas
[ https://issues.apache.org/jira/browse/HDFS-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165687#comment-14165687 ] Hadoop QA commented on HDFS-7090: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12673839/HDFS-7090.0.patch against trunk revision db71bb5. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication org.apache.hadoop.hdfs.server.balancer.TestBalancer org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestScrLazyPersistFiles org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. 
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8381//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8381//artifact/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8381//console This message is automatically generated. Use unbuffered writes when persisting in-memory replicas Key: HDFS-7090 URL: https://issues.apache.org/jira/browse/HDFS-7090 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.6.0 Reporter: Arpit Agarwal Assignee: Xiaoyu Yao Fix For: 3.0.0 Attachments: HDFS-7090.0.patch The LazyWriter thread just uses {{FileUtils.copyFile}} to copy block files to persistent storage. It would be better to use unbuffered writes to avoid churning page cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
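The {{FileUtils.copyFile}} concern above can be illustrated with a hedged sketch (the demo class and 8 MB chunk size are my own, and plain Java NIO only approximates unbuffered I/O; an actual fix would need native O_DIRECT-style writes): copy in bounded chunks and force data to disk so dirty pages don't accumulate in the cache.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ReplicaCopyDemo {
    // Copy src to dst in 8 MB chunks, then force data to stable storage.
    // This bounds how much dirty page-cache data the copy leaves behind.
    static void copyWithSync(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                 StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            long pos = 0, size = in.size();
            while (pos < size) {
                pos += in.transferTo(pos, 8 * 1024 * 1024, out);
            }
            out.force(false);  // flush file data (not metadata) to disk
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("replica", ".blk");
        Files.write(src, new byte[1 << 20]);  // a 1 MiB stand-in block file
        Path dst = Files.createTempFile("lazy", ".blk");
        copyWithSync(src, dst);
        System.out.println("copied bytes: " + Files.size(dst));
    }
}
```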
[jira] [Commented] (HDFS-7090) Use unbuffered writes when persisting in-memory replicas
[ https://issues.apache.org/jira/browse/HDFS-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165716#comment-14165716 ] Chris Nauroth commented on HDFS-7090: - Hi, Xiaoyu. The patch looks good. In addition to investigating the test failures, here are a few comments: # I noticed that {{fstat}} can result in errno {{EOVERFLOW}} according to the man page. Can you please add a mapping for this to errno_enum.c? This probably will never happen in practice, but just in case, it would be nice to get a clear diagnostic. # I don't think the test needs to do {{TEST_DIR.mkdirs()}}. This is already done in the {{Before}} method. # Also in the test, let's write some bytes into the file before copying it. Otherwise, I'm not sure if it's fully exercising the change. Use unbuffered writes when persisting in-memory replicas Key: HDFS-7090 URL: https://issues.apache.org/jira/browse/HDFS-7090 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.6.0 Reporter: Arpit Agarwal Assignee: Xiaoyu Yao Fix For: 3.0.0 Attachments: HDFS-7090.0.patch The LazyWriter thread just uses {{FileUtils.copyFile}} to copy block files to persistent storage. It would be better to use unbuffered writes to avoid churning page cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6919) Enforce a single limit for RAM disk usage and replicas cached via locking
[ https://issues.apache.org/jira/browse/HDFS-6919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165729#comment-14165729 ] Suresh Srinivas commented on HDFS-6919: --- [~cmccabe], based on the discussions in this jira, is this issue still a blocker for merging the memory tier? If not, I propose merging the memory tier changes to branch-2 to get it into release 2.6. As I have stated several times in the mailing thread, this is an important feature for exploring a memory tier in HDFS. It may not be used by everyone. But it is a great starting point to explore how applications can use it, and we can, as [~arpitagarwal] has suggested, tweak the feature further. It would be great to summarize where we stand. Enforce a single limit for RAM disk usage and replicas cached via locking - Key: HDFS-6919 URL: https://issues.apache.org/jira/browse/HDFS-6919 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Arpit Agarwal Assignee: Colin Patrick McCabe Priority: Blocker The DataNode can have a single limit for memory usage which applies to both replicas cached via CCM and replicas on RAM disk. See comments [1|https://issues.apache.org/jira/browse/HDFS-6581?focusedCommentId=14106025page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14106025], [2|https://issues.apache.org/jira/browse/HDFS-6581?focusedCommentId=14106245page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14106245] and [3|https://issues.apache.org/jira/browse/HDFS-6581?focusedCommentId=14106575page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14106575] for discussion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6673) Add Delimited format supports for PB OIV tool
[ https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165740#comment-14165740 ] Hadoop QA commented on HDFS-6673: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12673941/HDFS-6673.003.patch against trunk revision db71bb5. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8382//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8382//artifact/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8382//console This message is automatically generated. 
Add Delimited format supports for PB OIV tool - Key: HDFS-6673 URL: https://issues.apache.org/jira/browse/HDFS-6673 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, HDFS-6673.002.patch, HDFS-6673.003.patch The new oiv tool, which is designed for the Protobuf fsimage, lacks a few features supported in the old {{oiv}} tool. This task adds support for the _Delimited_ processor to the oiv tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7209) fill the key queue when creating encryption zone
[ https://issues.apache.org/jira/browse/HDFS-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165774#comment-14165774 ] Andrew Wang commented on HDFS-7209: --- This looks very good to me. Just a couple small things: * Could we move the generateEDEK call so it's right after the call to getMetadata? I think it's a bit cleaner to do all the KeyProvider operations before entering createEncryptionZoneInt, and more importantly it's good to minimize the time between doing the checkOperation and trying to take the lock. This is used to check for an HA failover, so we check before taking the lock to potentially early exit, take the lock, then check again. * Also would be good to rename the test case so it reflects what's being checked, e.g. {{testCreateEZPopulatesEDEKCache}} or something. Thanks for working on this, Yi! fill the key queue when creating encryption zone Key: HDFS-7209 URL: https://issues.apache.org/jira/browse/HDFS-7209 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption, performance Affects Versions: 2.6.0 Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-7209.001.patch Currently, when creating a file in an encryption zone for the first time, the key provider will get a bunch of keys from KMS and fill the queue. This takes some time. We can instead initialize the key queue when the admin creates the encryption zone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-7209) fill the key queue when creating encryption zone
[ https://issues.apache.org/jira/browse/HDFS-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165774#comment-14165774 ] Andrew Wang edited comment on HDFS-7209 at 10/9/14 9:27 PM: This looks very good to me. Just a couple small things: * Could we move the generateEDEK call so it's in createEncryptionZone after the call to getMetadata? I think it's a bit cleaner to do all the KeyProvider operations before entering createEncryptionZoneInt, and more importantly it's good to minimize the time between doing the checkOperation and trying to take the lock. This is used to check for an HA failover, so we check before taking the lock to potentially early exit, take the lock, then check again. * Also would be good to rename the test case so it reflects what's being checked, e.g. {{testCreateEZPopulatesEDEKCache}} or something. Thanks for working on this Yi! was (Author: andrew.wang): This looks very good to me. Just a couple small things: * Could we move the generateEDEK call so it's right after the call to getMetadata? I think it's a bit cleaner to do all the KeyProvider operations before entering createEncryptionZoneInt, and more importantly it's good to minimize the time between doing the checkOperation and trying to take the lock. This is used to check for an HA failover, so we check before taking the lock to potentially early exit, take the lock, then check again. * Also would be good to rename the test case so it reflects what's being checked, e.g. {{testCreateEZPopulatesEDEKCache}} or something. Thanks for working on this Yi! 
fill the key queue when creating encryption zone Key: HDFS-7209 URL: https://issues.apache.org/jira/browse/HDFS-7209 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption, performance Affects Versions: 2.6.0 Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-7209.001.patch Currently, when creating a file in an encryption zone for the first time, the key provider will get a bunch of keys from KMS and fill the queue. This takes some time. We can instead initialize the key queue when the admin creates the encryption zone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
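The description's idea can be sketched as a toy warm-up of a key queue (every name here, {{fillKeyQueue}} included, is a hypothetical stand-in, not the real KeyProvider API): pre-fill at zone-creation time so the first file create pays no KMS round-trip.

```java
import java.util.concurrent.LinkedBlockingQueue;

public class EdekCacheDemo {
    // Hypothetical stand-in for the KMS-backed per-key EDEK queue.
    static LinkedBlockingQueue<String> keyQueue = new LinkedBlockingQueue<>();

    // Warm the queue with n pre-generated EDEKs for the zone's key.
    // In the JIRA's proposal this would run during createEncryptionZone,
    // instead of lazily on the first file create.
    static void fillKeyQueue(String keyName, int n) {
        for (int i = 0; i < n; i++) {
            keyQueue.add(keyName + "-edek-" + i);
        }
    }

    public static void main(String[] args) {
        fillKeyQueue("myzonekey", 3);  // admin creates the zone: queue is warm
        System.out.println("cached EDEKs: " + keyQueue.size());
        // First file create just polls the queue; no KMS round-trip needed.
        System.out.println("first create uses: " + keyQueue.poll());
    }
}
```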
[jira] [Updated] (HDFS-6743) Put IP address into a new column on the new NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated HDFS-6743: -- Status: Open (was: Patch Available) Put IP address into a new column on the new NN webUI Key: HDFS-6743 URL: https://issues.apache.org/jira/browse/HDFS-6743 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma Assignee: Siqi Li Attachments: HDFS-5928.v3.patch, HDFS-6743.v1.patch, HDFS-6743.v2.patch The new NN webUI combines hostname and IP into one column in datanode list. It is more convenient for admins if the IP address can be put to a separate column, as in the legacy NN webUI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6743) Put IP address into a new column on the new NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated HDFS-6743: -- Attachment: (was: HDFS-5928.v3.patch) Put IP address into a new column on the new NN webUI Key: HDFS-6743 URL: https://issues.apache.org/jira/browse/HDFS-6743 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma Assignee: Siqi Li Attachments: HDFS-6743.v1.patch, HDFS-6743.v2.patch The new NN webUI combines hostname and IP into one column in datanode list. It is more convenient for admins if the IP address can be put to a separate column, as in the legacy NN webUI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6743) Put IP address into a new column on the new NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated HDFS-6743: -- Status: Patch Available (was: Open) Put IP address into a new column on the new NN webUI Key: HDFS-6743 URL: https://issues.apache.org/jira/browse/HDFS-6743 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma Assignee: Siqi Li Attachments: HDFS-5928.v3.patch, HDFS-6743.v1.patch, HDFS-6743.v2.patch The new NN webUI combines hostname and IP into one column in datanode list. It is more convenient for admins if the IP address can be put to a separate column, as in the legacy NN webUI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6743) Put IP address into a new column on the new NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated HDFS-6743: -- Attachment: HDFS-5928.v3.patch Put IP address into a new column on the new NN webUI Key: HDFS-6743 URL: https://issues.apache.org/jira/browse/HDFS-6743 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma Assignee: Siqi Li Attachments: HDFS-5928.v3.patch, HDFS-6743.v1.patch, HDFS-6743.v2.patch The new NN webUI combines hostname and IP into one column in datanode list. It is more convenient for admins if the IP address can be put to a separate column, as in the legacy NN webUI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5928) show namespace and namenode ID on NN dfshealth page
[ https://issues.apache.org/jira/browse/HDFS-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated HDFS-5928: -- Attachment: HDFS-5928.v3.patch show namespace and namenode ID on NN dfshealth page --- Key: HDFS-5928 URL: https://issues.apache.org/jira/browse/HDFS-5928 Project: Hadoop HDFS Issue Type: Improvement Reporter: Siqi Li Assignee: Siqi Li Attachments: HDFS-5928.v2.patch, HDFS-5928.v3.patch, HDFS-5928.v1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5928) show namespace and namenode ID on NN dfshealth page
[ https://issues.apache.org/jira/browse/HDFS-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated HDFS-5928: -- Status: Patch Available (was: Open) show namespace and namenode ID on NN dfshealth page --- Key: HDFS-5928 URL: https://issues.apache.org/jira/browse/HDFS-5928 Project: Hadoop HDFS Issue Type: Improvement Reporter: Siqi Li Assignee: Siqi Li Attachments: HDFS-5928.v2.patch, HDFS-5928.v3.patch, HDFS-5928.v1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5928) show namespace and namenode ID on NN dfshealth page
[ https://issues.apache.org/jira/browse/HDFS-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated HDFS-5928: -- Status: Open (was: Patch Available) show namespace and namenode ID on NN dfshealth page --- Key: HDFS-5928 URL: https://issues.apache.org/jira/browse/HDFS-5928 Project: Hadoop HDFS Issue Type: Improvement Reporter: Siqi Li Assignee: Siqi Li Attachments: HDFS-5928.v2.patch, HDFS-5928.v3.patch, HDFS-5928.v1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6919) Enforce a single limit for RAM disk usage and replicas cached via locking
[ https://issues.apache.org/jira/browse/HDFS-6919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14165849#comment-14165849 ] Colin Patrick McCabe commented on HDFS-6919: I'd really like to see us shrink the write cache size dynamically based on things entering the read cache, as we discussed in the last few comments. Since this could be considered an incompatible change, I'd really really like to see it happen before this goes into 2.6. I don't think it's that much work. If the schedule makes this impossible, then we can at least add a release note that this behavior will be implemented soon, so that we can do it in 2.6.1 (or whatever follow-on release) without users being surprised. Enforce a single limit for RAM disk usage and replicas cached via locking - Key: HDFS-6919 URL: https://issues.apache.org/jira/browse/HDFS-6919 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Arpit Agarwal Assignee: Colin Patrick McCabe Priority: Blocker The DataNode can have a single limit for memory usage which applies to both replicas cached via CCM and replicas on RAM disk. See comments [1|https://issues.apache.org/jira/browse/HDFS-6581?focusedCommentId=14106025&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14106025], [2|https://issues.apache.org/jira/browse/HDFS-6581?focusedCommentId=14106245&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14106245] and [3|https://issues.apache.org/jira/browse/HDFS-6581?focusedCommentId=14106575&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14106575] for discussion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
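The single shared limit under discussion can be pictured as a small accounting class in which read-cache reservations dynamically shrink the headroom left for RAM-disk (write) replicas, which is exactly the behavior Colin asks for. This is a hypothetical sketch, not DataNode code; all names are made up:

```java
// Sketch of one memory budget shared by the locked read cache (CCM) and
// RAM-disk write replicas. Growing one side shrinks what the other may claim.
class SharedMemoryBudget {
    private final long maxBytes;
    private long readCacheBytes;
    private long ramDiskBytes;

    SharedMemoryBudget(long maxBytes) { this.maxBytes = maxBytes; }

    synchronized boolean reserveForReadCache(long bytes) {
        if (readCacheBytes + ramDiskBytes + bytes > maxBytes) return false;
        readCacheBytes += bytes;
        return true;
    }

    synchronized boolean reserveForRamDisk(long bytes) {
        // Write-cache headroom is whatever the read cache has not claimed.
        if (readCacheBytes + ramDiskBytes + bytes > maxBytes) return false;
        ramDiskBytes += bytes;
        return true;
    }

    synchronized long remaining() { return maxBytes - readCacheBytes - ramDiskBytes; }
}
```

With a budget of 100 bytes, reserving 60 for the read cache leaves at most 40 for RAM disk, so a 50-byte RAM-disk reservation is rejected while a 40-byte one succeeds.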
[jira] [Commented] (HDFS-7221) TestDNFencingWithReplication fails consistently
[ https://issues.apache.org/jira/browse/HDFS-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14165854#comment-14165854 ] Hadoop QA commented on HDFS-7221: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12673970/HDFS-7221.002.patch against trunk revision d7b647f. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8383//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8383//artifact/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8383//console This message is automatically generated. 
TestDNFencingWithReplication fails consistently --- Key: HDFS-7221 URL: https://issues.apache.org/jira/browse/HDFS-7221 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-7221.001.patch, HDFS-7221.002.patch TestDNFencingWithReplication consistently fails with a timeout, both in jenkins runs and on my local machine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7218) FSNamesystem ACL operations should write to audit log on failure
[ https://issues.apache.org/jira/browse/HDFS-7218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14165858#comment-14165858 ] Charles Lamb commented on HDFS-7218: Hi [~cnauroth], If you get a chance, could you please take a look at these diffs? Thanks. FSNamesystem ACL operations should write to audit log on failure Key: HDFS-7218 URL: https://issues.apache.org/jira/browse/HDFS-7218 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-7218.001.patch, HDFS-7218.002.patch Various Acl methods in FSNamesystem do not write to the audit log when the operation is not successful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7222) Expose DataNode network errors as a metric
Charles Lamb created HDFS-7222: -- Summary: Expose DataNode network errors as a metric Key: HDFS-7222 URL: https://issues.apache.org/jira/browse/HDFS-7222 Project: Hadoop HDFS Issue Type: New Feature Components: datanode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor It would be useful to track datanode network errors and expose them as a metric. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
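One plausible shape for such a metric is a per-peer error counter that is incremented wherever a transfer to or from a remote node fails. The sketch below is purely illustrative (the actual HDFS-7222 work would register counters through Hadoop's metrics2 framework, and none of these names come from the patch):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: count network errors per remote peer so they can be
// exported both per-host and as a DataNode-wide total.
class NetworkErrorCounter {
    private final Map<String, LongAdder> errorsByPeer = new ConcurrentHashMap<>();

    void recordError(String peerHost) {
        errorsByPeer.computeIfAbsent(peerHost, h -> new LongAdder()).increment();
    }

    long errorsFor(String peerHost) {
        LongAdder a = errorsByPeer.get(peerHost);
        return a == null ? 0 : a.sum();
    }

    long totalErrors() {
        return errorsByPeer.values().stream().mapToLong(LongAdder::sum).sum();
    }
}
```

`LongAdder` keeps the hot increment path cheap under contention, which matters if every failed socket operation on a busy DataNode bumps the counter.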
[jira] [Updated] (HDFS-6824) Additional user documentation for HDFS encryption.
[ https://issues.apache.org/jira/browse/HDFS-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6824: -- Target Version/s: 2.6.0 (was: fs-encryption (HADOOP-10150 and HDFS-6134)) Affects Version/s: (was: fs-encryption (HADOOP-10150 and HDFS-6134)) 2.6.0 Status: Patch Available (was: Open) Additional user documentation for HDFS encryption. -- Key: HDFS-6824 URL: https://issues.apache.org/jira/browse/HDFS-6824 Project: Hadoop HDFS Issue Type: Sub-task Components: documentation Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Minor Attachments: hdfs-6824.001.patch We'd like to better document additional things about HDFS encryption: setup and configuration, using alternate access methods (namely WebHDFS and HttpFS), other misc improvements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6824) Additional user documentation for HDFS encryption.
[ https://issues.apache.org/jira/browse/HDFS-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6824: -- Attachment: hdfs-6824.001.patch Here's a patch that incorporates Tucu's and Mike's suggestions, and also adds some new sections about distcp, example usage, and using /.reserved/raw. Additional user documentation for HDFS encryption. -- Key: HDFS-6824 URL: https://issues.apache.org/jira/browse/HDFS-6824 Project: Hadoop HDFS Issue Type: Sub-task Components: documentation Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Minor Attachments: hdfs-6824.001.patch We'd like to better document additional things about HDFS encryption: setup and configuration, using alternate access methods (namely WebHDFS and HttpFS), other misc improvements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6824) Additional user documentation for HDFS encryption.
[ https://issues.apache.org/jira/browse/HDFS-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166075#comment-14166075 ] Hadoop QA commented on HDFS-6824: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12674039/hdfs-6824.001.patch against trunk revision 596702a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The test build failed in hadoop-hdfs-project/hadoop-hdfs {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8385//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8385//artifact/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8385//console This message is automatically generated. Additional user documentation for HDFS encryption. 
-- Key: HDFS-6824 URL: https://issues.apache.org/jira/browse/HDFS-6824 Project: Hadoop HDFS Issue Type: Sub-task Components: documentation Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Minor Attachments: hdfs-6824.001.patch We'd like to better document additional things about HDFS encryption: setup and configuration, using alternate access methods (namely WebHDFS and HttpFS), other misc improvements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5928) show namespace and namenode ID on NN dfshealth page
[ https://issues.apache.org/jira/browse/HDFS-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166074#comment-14166074 ] Hadoop QA commented on HDFS-5928: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12674009/HDFS-5928.v3.patch against trunk revision 8d94114. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing org.apache.hadoop.hdfs.server.namenode.TestEditLog {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8384//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8384//artifact/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8384//console This message is automatically generated. 
show namespace and namenode ID on NN dfshealth page --- Key: HDFS-5928 URL: https://issues.apache.org/jira/browse/HDFS-5928 Project: Hadoop HDFS Issue Type: Improvement Reporter: Siqi Li Assignee: Siqi Li Attachments: HDFS-5928.v2.patch, HDFS-5928.v3.patch, HDFS-5928.v1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7195) Update user doc of secure mode about Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166092#comment-14166092 ] Yi Liu commented on HDFS-7195: -- Thanks Chris for the patch and Jing for the review. Update user doc of secure mode about Datanodes don't require root or jsvc - Key: HDFS-7195 URL: https://issues.apache.org/jira/browse/HDFS-7195 Project: Hadoop HDFS Issue Type: Improvement Components: documentation, security Reporter: Yi Liu Assignee: Chris Nauroth Fix For: 2.6.0 Attachments: HDFS-7195-branch-2.2.patch, HDFS-7195-trunk.2.patch, HDFS-7195.1.patch, hadoop-site.tar.bz2 HDFS-2856 adds support for running DataNodes without root or jsvc. If {{dfs.data.transfer.protection}} is configured and {{dfs.http.policy}} is _HTTPS_ONLY_, then a secure DataNode doesn't need to use a privileged port. This has not been updated in the latest user doc of secure mode. This JIRA is to fix that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
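Concretely, the configuration the description refers to looks roughly like the hdfs-site.xml fragment below. The port value is an example only, and {{authentication}} could equally be {{integrity}} or {{privacy}} depending on the required SASL quality of protection:

```xml
<!-- Illustrative hdfs-site.xml fragment for a secure DataNode without
     root/jsvc (per HDFS-2856); the port shown is an arbitrary example. -->
<property>
  <name>dfs.data.transfer.protection</name>
  <value>authentication</value>
</property>
<property>
  <name>dfs.http.policy</name>
  <value>HTTPS_ONLY</value>
</property>
<property>
  <name>dfs.datanode.address</name>
  <value>0.0.0.0:10004</value> <!-- non-privileged port (>1023) -->
</property>
```

With SASL protecting the data transfer protocol and the web UI restricted to HTTPS, the DataNode no longer has to bind privileged ports to prove its identity, so it can start as an ordinary user.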
[jira] [Commented] (HDFS-7209) fill the key queue when creating encryption zone
[ https://issues.apache.org/jira/browse/HDFS-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166136#comment-14166136 ] Yi Liu commented on HDFS-7209: -- Sure, thanks Charles and Andrew for the review, will rebase/update the patch later. fill the key queue when creating encryption zone Key: HDFS-7209 URL: https://issues.apache.org/jira/browse/HDFS-7209 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption, performance Affects Versions: 2.6.0 Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-7209.001.patch Currently, when a file is created in an encryption zone for the first time, the key provider fetches a batch of keys from the KMS and fills the queue, which takes some time. We can instead initialize the key queue when the admin creates the encryption zone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
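The warm-up idea in the description can be sketched with a plain bounded queue that is filled eagerly at zone-creation time instead of lazily on the first file create. {{EdekQueue}} and {{warm}} are hypothetical names for illustration; the real code path goes through the KMS key provider's value-queue machinery:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Hypothetical sketch of pre-filling a per-key EDEK queue when the admin
// creates the encryption zone, so the first file create in the zone does
// not pay the KMS round-trip cost.
class EdekQueue {
    private final BlockingQueue<String> queue;
    private final Supplier<String> kmsFetch; // stands in for a KMS round trip

    EdekQueue(int capacity, Supplier<String> kmsFetch) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.kmsFetch = kmsFetch;
    }

    // Called from the createEncryptionZone admin path: fill to capacity.
    int warm() {
        while (queue.offer(kmsFetch.get())) { /* offer fails once full */ }
        return queue.size();
    }

    // Fast path for file creation once the queue is warmed.
    String next() {
        return queue.poll();
    }
}
```

The trade-off is the one the JIRA names: a slightly more expensive admin operation in exchange for predictable latency on the first writes into the zone.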
[jira] [Updated] (HDFS-7026) Introduce a string constant for Failed to obtain user group info...
[ https://issues.apache.org/jira/browse/HDFS-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-7026: - Target Version/s: 2.7.0 +1, the patch looks good to me. I'm going to commit this momentarily. Introduce a string constant for Failed to obtain user group info... - Key: HDFS-7026 URL: https://issues.apache.org/jira/browse/HDFS-7026 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Trivial Attachments: HDFS-7206.001.patch There are multiple places that refer to hard-coded string {{Failed to obtain user group information:}}, which serves as a contract between different places. Filing this jira to replace the hardcoded string with a constant to make it easier to maintain. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7026) Introduce a string constant for Failed to obtain user group info...
[ https://issues.apache.org/jira/browse/HDFS-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-7026: - Resolution: Fixed Fix Version/s: 2.7.0 Status: Resolved (was: Patch Available) I've just committed this to trunk and branch-2. Thanks a lot for the contribution, Yongjun. Introduce a string constant for Failed to obtain user group info... - Key: HDFS-7026 URL: https://issues.apache.org/jira/browse/HDFS-7026 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Trivial Fix For: 2.7.0 Attachments: HDFS-7206.001.patch There are multiple places that refer to the hard-coded string {{Failed to obtain user group information:}}, which serves as a contract between different places. Filing this jira to replace the hardcoded string with a constant to make it easier to maintain. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
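The change itself is small; below is a minimal sketch of the pattern. The class and field names are made up for illustration and are not necessarily those used in the committed patch:

```java
// Sketch: the repeated literal becomes one shared constant, so the places
// that produce the message and the places that match on it cannot drift.
final class SecurityMessages {
    static final String FAILED_TO_GET_UGI_MSG =
        "Failed to obtain user group information:";

    private SecurityMessages() {} // no instances; constants only

    static String wrap(String detail) {
        return FAILED_TO_GET_UGI_MSG + " " + detail;
    }
}
```

Because the string acts as a contract between the code that throws the error and the code that parses it, centralizing it in a constant means a future wording change only has to happen in one place.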