[jira] [Commented] (HDFS-5237) Get rid of nodes' registration names
[ https://issues.apache.org/jira/browse/HDFS-5237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776071#comment-13776071 ]

Junping Du commented on HDFS-5237:
----------------------------------

Hi Eli and Philip, I think another existing config, DFS_DATANODE_DNS_INTERFACE_KEY (dfs.datanode.dns.interface), already plays the role of selecting which hostname the DN uses (in DataNode.getHostName(conf)), so it may not be necessary to re-purpose dfs.datanode.hostname for this. Per the discussion between Colin and me in HDFS-5208, we are trying to find an alternative way to get rid of the registration name, which can be set to any value by the user and was originally proposed so that tests could start multiple faked nodes with MiniDFSCluster. Simply removing this config does not work, as the attached demo patch shows - all MiniDFSCluster-related tests fail because there is no longer a way to fake nodes. I think we should either keep this config for unit tests only, or find some new way to fake nodes while removing it. Thoughts?

Get rid of nodes' registration names

Key: HDFS-5237
URL: https://issues.apache.org/jira/browse/HDFS-5237
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Junping Du
Fix For: 3.0.0
Attachments: HDFS-5237.patch

Per discussion in HDFS-5208 and possibly earlier discussions, a node's registration name is pretty confusing and should not be used in production environments, as it causes topology-resolution issues. So we should remove the related configuration dfs.datanode.hostname (formerly slave.host.name).

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
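The interface-driven hostname selection mentioned above can be sketched roughly as follows. This is a hypothetical, simplified illustration of the idea behind dfs.datanode.dns.interface, not the actual DataNode.getHostName(conf) implementation; the class and method names are invented.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.util.Collections;

public class HostNameSelector {
    /** Resolve a hostname from the first address of the named interface,
     *  falling back to the local host's canonical name when the interface
     *  does not exist or has no addresses. */
    static String selectHostName(String ifaceName) throws Exception {
        NetworkInterface iface = NetworkInterface.getByName(ifaceName);
        if (iface != null) {
            for (InetAddress addr : Collections.list(iface.getInetAddresses())) {
                return addr.getCanonicalHostName();
            }
        }
        return InetAddress.getLocalHost().getCanonicalHostName();
    }

    public static void main(String[] args) throws Exception {
        // A non-existent interface name exercises the fallback path.
        System.out.println(selectHostName("no-such-iface-0"));
    }
}
```

The point of the comment above is that, with a mechanism like this already selecting the hostname, a free-form registration name becomes redundant outside of tests.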
[jira] [Updated] (HDFS-5254) Reconfigure DN's topology without restart DN
[ https://issues.apache.org/jira/browse/HDFS-5254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-5254:
-----------------------------

Issue Type: New Feature (was: Bug)

Reconfigure DN's topology without restart DN

Key: HDFS-5254
URL: https://issues.apache.org/jira/browse/HDFS-5254
Project: Hadoop HDFS
Issue Type: New Feature
Reporter: Junping Du

In some cases (e.g. VM live migration in virtualization, or SDN in the cloud), the DN's network topology changes at runtime. We should figure out some way to reconfigure it without stopping and restarting the DN.
[jira] [Created] (HDFS-5254) Reconfigure DN's topology without restart DN
Junping Du created HDFS-5254:
-----------------------------

Summary: Reconfigure DN's topology without restart DN
Key: HDFS-5254
URL: https://issues.apache.org/jira/browse/HDFS-5254
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Junping Du

In some cases (e.g. VM live migration in virtualization, or SDN in the cloud), the DN's network topology changes at runtime. We should figure out some way to reconfigure it without stopping and restarting the DN.
[jira] [Commented] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens
[ https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776086#comment-13776086 ]

Junping Du commented on HDFS-5208:
----------------------------------

Hi Colin, I posted my new comments in HDFS-5237 - thanks for your comments there. IMO, HDFS-5237 shouldn't be a blocking jira for this one, since no registration name (only the IP) goes into the cache backed by CachedDNSToSwitchMapping, because of the following code in DatanodeManager.resolveNetworkLocation(DatanodeID):

{code}
if (dnsToSwitchMapping instanceof CachedDNSToSwitchMapping) {
  names.add(node.getIpAddr());
} else {
  names.add(node.getHostName());
}
{code}

What do you think?

Only clear network location cache on specific nodes if invalid NetworkTopology happens

Key: HDFS-5208
URL: https://issues.apache.org/jira/browse/HDFS-5208
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Junping Du
Assignee: Junping Du
Attachments: HDFS-5208-v1.patch

After HDFS-4521, once a DN is registered with an invalid network topology, all cached rack info in DNSToSwitchMapping is cleared. We should only clear the cache entries for the specific nodes involved.
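The IP-keyed caching and per-node invalidation discussed above can be sketched with a toy mapping. CachedSwitchMapping, resolveToRack, invalidate, and the /default-rack fallback are hypothetical stand-ins for illustration, not the real Hadoop CachedDNSToSwitchMapping API.

```java
import java.util.HashMap;
import java.util.Map;

// A toy DNS-to-switch mapping that caches rack lookups by IP address,
// mirroring the point that only IPs (never registration names) reach the cache.
public class CachedSwitchMapping {
    private final Map<String, String> cache = new HashMap<>();

    String resolveToRack(String ipAddr) {
        // A real implementation would invoke a topology script or plugin here;
        // we fake the resolution with a constant rack.
        return cache.computeIfAbsent(ipAddr, ip -> "/default-rack");
    }

    // Clearing only the entries for specific nodes, per HDFS-5208's goal,
    // rather than wiping the whole cache on an invalid topology.
    void invalidate(String ipAddr) {
        cache.remove(ipAddr);
    }

    public static void main(String[] args) {
        CachedSwitchMapping m = new CachedSwitchMapping();
        System.out.println(m.resolveToRack("10.0.0.1")); // resolved, then cached
        m.invalidate("10.0.0.1");                        // evict just this node
        System.out.println(m.resolveToRack("10.0.0.1")); // re-resolved on demand
    }
}
```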
[jira] [Commented] (HDFS-5222) Move block schedule information from DatanodeDescriptor to DatanodeStorageInfo
[ https://issues.apache.org/jira/browse/HDFS-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776104#comment-13776104 ]

Junping Du commented on HDFS-5222:
----------------------------------

Hi Nicholas, Thanks for the patch. A couple of comments:

{code}
+  private static boolean hasSpace(final long blockSize, final long remaining,
+      final long scheduled) {
+    final long required = blockSize * HdfsConstants.MIN_BLOCKS_FOR_WRITE;
+    return required > remaining - scheduled * blockSize;
+  }
{code}

Shall we rename it to something like notEnoughSpace? As it returns true when space is not enough. Isn't it?

{code}
-    final long requiredSize = blockSize * HdfsConstants.MIN_BLOCKS_FOR_WRITE;
-    if (requiredSize > storage.getRemaining()) {
+    if (hasSpace(blockSize, storage.getRemaining(), storage.getBlocksScheduled())) {
       logNodeIsNotChosen(storage, "the storage does not have enough space ");
       return false;
     }
-    //TODO: move getBlocksScheduled() to DatanodeStorageInfo.
-    long remaining = node.getRemaining()
-        - (node.getBlocksScheduled() * blockSize);
     // check the remaining capacity of the target machine
-    if (requiredSize > remaining) {
+    if (hasSpace(blockSize, node.getRemaining(), node.getBlocksScheduled())) {
       logNodeIsNotChosen(storage, "the node does not have enough space ");
       return false;
     }
{code}

Shall we remove the check on the node's remaining space, since the preceding storage-capacity check is good enough?

Move block schedule information from DatanodeDescriptor to DatanodeStorageInfo

Key: HDFS-5222
URL: https://issues.apache.org/jira/browse/HDFS-5222
Project: Hadoop HDFS
Issue Type: Sub-task
Components: namenode
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Attachments: h5222_20130819.patch

In HDFS-4990, the block placement target type was changed from DatanodeDescriptor to DatanodeStorageInfo. The block schedule information, such as the number of blocks scheduled for replication (i.e. getBlocksScheduled()), should be moved from DatanodeDescriptor to DatanodeStorageInfo.
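The space check discussed in the comment above amounts to simple arithmetic: blocks that are scheduled for writing but not yet written eat into the usable remaining capacity. The following is a hedged, standalone sketch of that check under the rename suggested in the review; MIN_BLOCKS_FOR_WRITE is set to 1 here only for illustration, and this is not the committed HDFS code.

```java
public class SpaceCheck {
    // Illustrative value; HdfsConstants defines the real constant.
    static final int MIN_BLOCKS_FOR_WRITE = 1;

    /** True when the target cannot hold MIN_BLOCKS_FOR_WRITE more blocks
     *  once already-scheduled blocks are accounted for. */
    static boolean notEnoughSpace(long blockSize, long remaining, long scheduled) {
        final long required = blockSize * MIN_BLOCKS_FOR_WRITE;
        // Scheduled-but-unwritten blocks reduce the effective remaining space.
        return required > remaining - scheduled * blockSize;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024;                  // 128 MB blocks
        long remaining = 512L * 1024 * 1024;                  // 512 MB free
        System.out.println(notEnoughSpace(blockSize, remaining, 0)); // false
        // Four scheduled blocks consume all 512 MB, so nothing is left:
        System.out.println(notEnoughSpace(blockSize, remaining, 4)); // true
    }
}
```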
[jira] [Commented] (HDFS-5228) The RemoteIterator returned by DistributedFileSystem.listFiles(..) may throw NPE
[ https://issues.apache.org/jira/browse/HDFS-5228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776169#comment-13776169 ]

Tsz Wo (Nicholas), SZE commented on HDFS-5228:
----------------------------------------------

Okay, let's commit the patch for the moment. We could revert the recent symlink work later if we find more problems. For the record, I did hear of some performance issues, but I am not yet sure about their cause.

The RemoteIterator returned by DistributedFileSystem.listFiles(..) may throw NPE

Key: HDFS-5228
URL: https://issues.apache.org/jira/browse/HDFS-5228
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs-client
Affects Versions: 2.1.0-beta
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Priority: Blocker
Attachments: h5228_20130919.patch, h5228_20130919_test.patch, HDFS-5228.2.patch

Get a RemoteIterator from DistributedFileSystem.listFiles(..) with a relative path. Calling hasNext() on the RemoteIterator then results in a NullPointerException. This bug was discovered by Arnaud: http://hortonworks.com/community/forums/topic/new-bug-in-hdfs-listfiles-method/
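The bug pattern behind this kind of NPE can be illustrated without a running cluster. The toy below (entirely hypothetical names, not the HDFS client code) shows an iterator whose backing lookup only understands fully qualified paths; the sketched remedy is to qualify a relative path against a working directory up front, rather than letting a null leak into the iteration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;

public class QualifyBeforeIterate {
    // A fake namespace keyed only by absolute paths.
    static final Map<String, String[]> LISTING =
        Map.of("/user/alice/dir", new String[] {"a.txt", "b.txt"});

    static Iterator<String> listFiles(String path, String workingDir) {
        // Resolve relative paths against the working directory before the
        // lookup, so the lookup cannot silently return null for "dir".
        String qualified = path.startsWith("/") ? path : workingDir + "/" + path;
        String[] entries = LISTING.get(qualified);
        if (entries == null) throw new NoSuchElementException(qualified);
        return Arrays.asList(entries).iterator();
    }

    public static void main(String[] args) {
        Iterator<String> it = listFiles("dir", "/user/alice");
        while (it.hasNext()) System.out.println(it.next()); // a.txt, b.txt
    }
}
```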
[jira] [Commented] (HDFS-5228) The RemoteIterator returned by DistributedFileSystem.listFiles(..) may throw NPE
[ https://issues.apache.org/jira/browse/HDFS-5228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776175#comment-13776175 ]

Hudson commented on HDFS-5228:
------------------------------

SUCCESS: Integrated in Hadoop-trunk-Commit #4461 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4461/])
HDFS-5228. The RemoteIterator returned by DistributedFileSystem.listFiles may throw NullPointerException. Contributed by szetszwo and cnauroth (szetszwo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525828)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java

The RemoteIterator returned by DistributedFileSystem.listFiles(..) may throw NPE

Key: HDFS-5228
URL: https://issues.apache.org/jira/browse/HDFS-5228
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs-client
Affects Versions: 2.1.0-beta
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Priority: Blocker
Attachments: h5228_20130919.patch, h5228_20130919_test.patch, HDFS-5228.2.patch

Get a RemoteIterator from DistributedFileSystem.listFiles(..) with a relative path. Calling hasNext() on the RemoteIterator then results in a NullPointerException. This bug was discovered by Arnaud: http://hortonworks.com/community/forums/topic/new-bug-in-hdfs-listfiles-method/
[jira] [Updated] (HDFS-5222) Move block schedule information from DatanodeDescriptor to DatanodeStorageInfo
[ https://issues.apache.org/jira/browse/HDFS-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-5222:
-----------------------------------------

Attachment: h5222_20130824.patch

Move block schedule information from DatanodeDescriptor to DatanodeStorageInfo

Key: HDFS-5222
URL: https://issues.apache.org/jira/browse/HDFS-5222
Project: Hadoop HDFS
Issue Type: Sub-task
Components: namenode
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Attachments: h5222_20130819.patch, h5222_20130824.patch

In HDFS-4990, the block placement target type was changed from DatanodeDescriptor to DatanodeStorageInfo. The block schedule information, such as the number of blocks scheduled for replication (i.e. getBlocksScheduled()), should be moved from DatanodeDescriptor to DatanodeStorageInfo.
[jira] [Commented] (HDFS-5222) Move block schedule information from DatanodeDescriptor to DatanodeStorageInfo
[ https://issues.apache.org/jira/browse/HDFS-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776190#comment-13776190 ]

Tsz Wo (Nicholas), SZE commented on HDFS-5222:
----------------------------------------------

Thanks Arpit and Junping for reviewing the patch.

"... couple of spurious newlines. ..." - I intentionally added the new lines to separate the different groups of fields.

"Shall we rename it to something like: notEnoughSpace? ..." - You are right that notEnoughSpace is much better. However, I will remove the method due to the comment below.

"Shall we remove checking node's remaining ..." - Sure, let's remove it.

Here is a new patch: h5222_20130824.patch

Move block schedule information from DatanodeDescriptor to DatanodeStorageInfo

Key: HDFS-5222
URL: https://issues.apache.org/jira/browse/HDFS-5222
Project: Hadoop HDFS
Issue Type: Sub-task
Components: namenode
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Attachments: h5222_20130819.patch, h5222_20130824.patch

In HDFS-4990, the block placement target type was changed from DatanodeDescriptor to DatanodeStorageInfo. The block schedule information, such as the number of blocks scheduled for replication (i.e. getBlocksScheduled()), should be moved from DatanodeDescriptor to DatanodeStorageInfo.
[jira] [Updated] (HDFS-5228) The RemoteIterator returned by DistributedFileSystem.listFiles(..) may throw NPE
[ https://issues.apache.org/jira/browse/HDFS-5228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-5228:
-----------------------------------------

Resolution: Fixed
Fix Version/s: 2.1.1-beta
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

I have committed this. Thanks, Chris!

The RemoteIterator returned by DistributedFileSystem.listFiles(..) may throw NPE

Key: HDFS-5228
URL: https://issues.apache.org/jira/browse/HDFS-5228
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs-client
Affects Versions: 2.1.0-beta
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Priority: Blocker
Fix For: 2.1.1-beta
Attachments: h5228_20130919.patch, h5228_20130919_test.patch, HDFS-5228.2.patch

Get a RemoteIterator from DistributedFileSystem.listFiles(..) with a relative path. Calling hasNext() on the RemoteIterator then results in a NullPointerException. This bug was discovered by Arnaud: http://hortonworks.com/community/forums/topic/new-bug-in-hdfs-listfiles-method/
[jira] [Commented] (HDFS-5139) Remove redundant -R option from setrep
[ https://issues.apache.org/jira/browse/HDFS-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776197#comment-13776197 ]

Hudson commented on HDFS-5139:
------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #342 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/342/])
HDFS-5139. Remove redundant -R option from setrep (update CHANGES.txt). (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525665)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
HDFS-5139. Remove redundant -R option from setrep. (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525659)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/SetReplication.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/FileSystemShell.apt.vm
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/resources/testConf.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShell.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml

Remove redundant -R option from setrep

Key: HDFS-5139
URL: https://issues.apache.org/jira/browse/HDFS-5139
Project: Hadoop HDFS
Issue Type: Improvement
Components: tools
Affects Versions: 3.0.0, 1.3.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
Fix For: 2.2.0
Attachments: HDFS-5139.01.patch, HDFS-5139.02.patch, HDFS-5139.03.patch, HDFS-5139.04.patch

The -R option to setrep is redundant because it is required for directory targets and ignored for file targets. We can just remove the option and make -R the default for directories.
[jira] [Commented] (HDFS-5228) The RemoteIterator returned by DistributedFileSystem.listFiles(..) may throw NPE
[ https://issues.apache.org/jira/browse/HDFS-5228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776196#comment-13776196 ]

Hudson commented on HDFS-5228:
------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #342 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/342/])
HDFS-5228. The RemoteIterator returned by DistributedFileSystem.listFiles may throw NullPointerException. Contributed by szetszwo and cnauroth (szetszwo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525828)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java

The RemoteIterator returned by DistributedFileSystem.listFiles(..) may throw NPE

Key: HDFS-5228
URL: https://issues.apache.org/jira/browse/HDFS-5228
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs-client
Affects Versions: 2.1.0-beta
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Priority: Blocker
Fix For: 2.1.1-beta
Attachments: h5228_20130919.patch, h5228_20130919_test.patch, HDFS-5228.2.patch

Get a RemoteIterator from DistributedFileSystem.listFiles(..) with a relative path. Calling hasNext() on the RemoteIterator then results in a NullPointerException. This bug was discovered by Arnaud: http://hortonworks.com/community/forums/topic/new-bug-in-hdfs-listfiles-method/
[jira] [Commented] (HDFS-5239) Allow FSNamesystem lock fairness to be configurable
[ https://issues.apache.org/jira/browse/HDFS-5239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776195#comment-13776195 ]

Hudson commented on HDFS-5239:
------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #342 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/342/])
HDFS-5239. Allow FSNamesystem lock fairness to be configurable (daryn) (daryn: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525624)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSNamesystem.java

Allow FSNamesystem lock fairness to be configurable

Key: HDFS-5239
URL: https://issues.apache.org/jira/browse/HDFS-5239
Project: Hadoop HDFS
Issue Type: Sub-task
Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Fix For: 3.0.0, 2.3.0
Attachments: HDFS-5239.patch

The fairness of the {{FSNamesystem#fsLock}} is hardcoded to {{true}}. Using an unfair lock alone provides a negligible increase to throughput; however, this is due to bottlenecks elsewhere in the system. In combination with other changes, such as the RPC layer and audit logging, preliminary tests show up to a 5X improvement for a read-heavy workload.
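Configurable lock fairness of the kind this change describes can be sketched directly with java.util.concurrent's ReentrantReadWriteLock, whose constructor takes a fairness flag. The property name "fs.lock.fair" below is hypothetical; HDFS defines its own configuration key for this.

```java
import java.util.Properties;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ConfigurableFairLock {
    /** Build the namesystem-style lock with fairness read from configuration,
     *  defaulting to fair (the previously hardcoded behavior). */
    static ReentrantReadWriteLock createLock(Properties conf) {
        boolean fair = Boolean.parseBoolean(conf.getProperty("fs.lock.fair", "true"));
        // A fair lock grants access to the longest-waiting thread; an unfair
        // lock permits barging, which can raise throughput under contention.
        return new ReentrantReadWriteLock(fair);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("fs.lock.fair", "false");
        System.out.println(createLock(conf).isFair()); // prints false
    }
}
```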
[jira] [Commented] (HDFS-5249) Fix dumper thread which may die silently
[ https://issues.apache.org/jira/browse/HDFS-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776194#comment-13776194 ]

Hudson commented on HDFS-5249:
------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #342 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/342/])
HDFS-5249. Fix dumper thread which may die silently. Contributed by Brandon Li (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525770)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

Fix dumper thread which may die silently

Key: HDFS-5249
URL: https://issues.apache.org/jira/browse/HDFS-5249
Project: Hadoop HDFS
Issue Type: Sub-task
Components: nfs
Reporter: Brandon Li
Assignee: Brandon Li
Fix For: 2.1.1-beta
Attachments: HDFS-5249.02.patch, HDFS-5249.patch

The dumper thread can get an NPE when the WriteCtx it's about to work on has just been deleted by the write-back thread. A dead dumper thread could cause an out-of-memory error when too many pending writes accumulate for one opened file.
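The general mechanism behind a thread "dying silently" is that an uncaught RuntimeException simply ends run() with no log line, leaving the rest of the system waiting on a worker that no longer exists. The sketch below (not the HDFS-5249 patch, just an illustration of the pattern) makes such a death visible with an UncaughtExceptionHandler; catching Throwable inside run() is the other common remedy.

```java
public class VisibleThreadDeath {
    static volatile String lastDeath;  // recorded by the handler so the failure is visible

    static void runDumper() throws InterruptedException {
        Thread dumper = new Thread(() -> {
            String writeCtx = null;
            writeCtx.length();   // simulated NPE, like a WriteCtx deleted underneath us
        }, "dumper");
        // Without a handler, the NPE would kill the thread with no trace here.
        dumper.setUncaughtExceptionHandler((t, e) ->
            lastDeath = t.getName() + " died: " + e.getClass().getSimpleName());
        dumper.start();
        dumper.join();
    }

    public static void main(String[] args) throws InterruptedException {
        runDumper();
        System.out.println(lastDeath); // prints "dumper died: NullPointerException"
    }
}
```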
[jira] [Commented] (HDFS-4971) Move IO operations out of locking in OpenFileCtx
[ https://issues.apache.org/jira/browse/HDFS-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776198#comment-13776198 ]

Hudson commented on HDFS-4971:
------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #342 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/342/])
HDFS-4971. Move IO operations out of locking in OpenFileCtx. Contributed by Jing Zhao and Brandon Li. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525681)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/AsyncDataService.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OffsetRange.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/WriteCtx.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/WriteManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestOffsetRange.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

Move IO operations out of locking in OpenFileCtx

Key: HDFS-4971
URL: https://issues.apache.org/jira/browse/HDFS-4971
Project: Hadoop HDFS
Issue Type: Sub-task
Components: nfs
Reporter: Jing Zhao
Assignee: Jing Zhao
Fix For: 2.1.1-beta
Attachments: HDFS-4971.000.patch, HDFS-4971.001.patch, HDFS-4971.002.patch, HDFS-4971.003.patch, HDFS-4971.004.patch, HDFS-4971.005.patch, HDFS-4971.006.patch, HDFS-4971.007.patch

Currently some IO operations (such as writing data to HDFS and dumping to local disk) in OpenFileCtx may hold a lock which can block the processing of incoming write requests. This jira aims to optimize OpenFileCtx and move the IO operations out of the locking.
[jira] [Commented] (HDFS-5240) Separate formatting from logging in the audit logger API
[ https://issues.apache.org/jira/browse/HDFS-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776199#comment-13776199 ]

Hudson commented on HDFS-5240:
------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #342 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/342/])
HDFS-5240. Separate formatting from logging in the audit logger API (daryn) (daryn: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525626)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java

Separate formatting from logging in the audit logger API

Key: HDFS-5240
URL: https://issues.apache.org/jira/browse/HDFS-5240
Project: Hadoop HDFS
Issue Type: Sub-task
Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Fix For: 3.0.0, 2.3.0
Attachments: HDFS-5240.patch

The audit logger API should be extended (in a compatible manner) to separate the formatting of the log message from the actual logging of the message.
[jira] [Commented] (HDFS-5251) Race between the initialization of NameNode and the http server
[ https://issues.apache.org/jira/browse/HDFS-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776200#comment-13776200 ]

Hudson commented on HDFS-5251:
------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #342 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/342/])
HDFS-5251. Race between the initialization of NameNode and the http server. Contributed by Haohui Mai. (suresh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525787)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeJspHelper.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeJspHelper.java

Race between the initialization of NameNode and the http server

Key: HDFS-5251
URL: https://issues.apache.org/jira/browse/HDFS-5251
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai
Fix For: 2.2.0
Attachments: HDFS-5251.000.patch

The constructor of NameNode starts an HTTP server before the FSNameSystem is initialized. Currently there is a race where the HTTP server can access the uninitialized namesystem variable, throwing a NullPointerException.
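The race described above is a general initialization-ordering hazard: a handler started early can observe a field that initialization has not yet set. A common remedy, sketched below with entirely hypothetical names (this is not the actual HDFS-5251 fix), is to publish the reference via a volatile field and have handlers answer "not ready" until it appears.

```java
public class InitRaceGuard {
    // Published once, at the end of initialization; volatile ensures the
    // handler thread sees the fully constructed object when it reads non-null.
    static volatile Object namesystem;

    static String handleRequest() {
        Object ns = namesystem;   // read once to avoid a check-then-act race on the field
        if (ns == null) {
            return "503 NameNode is still starting up";
        }
        return "200 " + ns;
    }

    public static void main(String[] args) {
        System.out.println(handleRequest()); // before init: 503 ...
        namesystem = "FSNamesystem";
        System.out.println(handleRequest()); // after init: 200 FSNamesystem
    }
}
```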
[jira] [Commented] (HDFS-5249) Fix dumper thread which may die silently
[ https://issues.apache.org/jira/browse/HDFS-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776281#comment-13776281 ]

Hudson commented on HDFS-5249:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1558 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1558/])
HDFS-5249. Fix dumper thread which may die silently. Contributed by Brandon Li (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525770)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

Fix dumper thread which may die silently

Key: HDFS-5249
URL: https://issues.apache.org/jira/browse/HDFS-5249
Project: Hadoop HDFS
Issue Type: Sub-task
Components: nfs
Reporter: Brandon Li
Assignee: Brandon Li
Fix For: 2.1.1-beta
Attachments: HDFS-5249.02.patch, HDFS-5249.patch

The dumper thread can get an NPE when the WriteCtx it's about to work on has just been deleted by the write-back thread. A dead dumper thread could cause an out-of-memory error when too many pending writes accumulate for one opened file.
[jira] [Commented] (HDFS-4971) Move IO operations out of locking in OpenFileCtx
[ https://issues.apache.org/jira/browse/HDFS-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776285#comment-13776285 ]

Hudson commented on HDFS-4971:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1558 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1558/])
HDFS-4971. Move IO operations out of locking in OpenFileCtx. Contributed by Jing Zhao and Brandon Li. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525681)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/AsyncDataService.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OffsetRange.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/WriteCtx.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/WriteManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestOffsetRange.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

Move IO operations out of locking in OpenFileCtx

Key: HDFS-4971
URL: https://issues.apache.org/jira/browse/HDFS-4971
Project: Hadoop HDFS
Issue Type: Sub-task
Components: nfs
Reporter: Jing Zhao
Assignee: Jing Zhao
Fix For: 2.1.1-beta
Attachments: HDFS-4971.000.patch, HDFS-4971.001.patch, HDFS-4971.002.patch, HDFS-4971.003.patch, HDFS-4971.004.patch, HDFS-4971.005.patch, HDFS-4971.006.patch, HDFS-4971.007.patch

Currently some IO operations (such as writing data to HDFS and dumping to local disk) in OpenFileCtx may hold a lock which can block the processing of incoming write requests. This jira aims to optimize OpenFileCtx and move the IO operations out of the locking.
[jira] [Commented] (HDFS-5251) Race between the initialization of NameNode and the http server
[ https://issues.apache.org/jira/browse/HDFS-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776287#comment-13776287 ]

Hudson commented on HDFS-5251:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1558 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1558/])
HDFS-5251. Race between the initialization of NameNode and the http server. Contributed by Haohui Mai. (suresh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525787)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeJspHelper.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeJspHelper.java

Race between the initialization of NameNode and the http server

Key: HDFS-5251
URL: https://issues.apache.org/jira/browse/HDFS-5251
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai
Fix For: 2.2.0
Attachments: HDFS-5251.000.patch

The constructor of NameNode starts an HTTP server before the FSNameSystem is initialized. Currently there is a race where the HTTP server can access the uninitialized namesystem variable, throwing a NullPointerException.
[jira] [Commented] (HDFS-5228) The RemoteIterator returned by DistributedFileSystem.listFiles(..) may throw NPE
[ https://issues.apache.org/jira/browse/HDFS-5228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776283#comment-13776283 ] Hudson commented on HDFS-5228: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1558 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1558/]) HDFS-5228. The RemoteIterator returned by DistributedFileSystem.listFiles may throw NullPointerException. Contributed by szetszwo and cnauroth (szetszwo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1525828) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java The RemoteIterator returned by DistributedFileSystem.listFiles(..) may throw NPE Key: HDFS-5228 URL: https://issues.apache.org/jira/browse/HDFS-5228 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.1.0-beta Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Blocker Fix For: 2.1.1-beta Attachments: h5228_20130919.patch, h5228_20130919_test.patch, HDFS-5228.2.patch Get a RemoteIterator from DistributedFileSystem.listFiles(..) with a relative path. Then, it will result a NullPointerException when calling hasNext() from the RemoteIterator. This bug was discovered by Arnaud: http://hortonworks.com/community/forums/topic/new-bug-in-hdfs-listfiles-method/ -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
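[Editor's note] The bug report above hinges on path qualification: a relative path must be resolved against the filesystem URI and working directory before the listing iterator uses it. The sketch below illustrates that resolution step with made-up names; it is not the actual HDFS client code.

```java
import java.net.URI;

// Sketch of the failure mode in HDFS-5228: listing with a relative path only
// works if the client first qualifies it against the scheme, authority, and
// working directory. Illustrative stand-in for Path.makeQualified(..).
public class QualifyPath {
    static String makeQualified(String path, URI fs, String workingDir) {
        if (path.startsWith("/")) {
            return fs.toString() + path;                     // already absolute
        }
        return fs.toString() + workingDir + "/" + path;      // resolve relative path
    }

    public static String demo() {
        URI fs = URI.create("hdfs://nn:8020");
        return makeQualified("data/part-0", fs, "/user/arnaud");
    }

    public static void main(String[] args) { System.out.println(demo()); }
}
```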
[jira] [Commented] (HDFS-5239) Allow FSNamesystem lock fairness to be configurable
[ https://issues.apache.org/jira/browse/HDFS-5239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776282#comment-13776282 ] Hudson commented on HDFS-5239: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1558 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1558/]) HDFS-5239. Allow FSNamesystem lock fairness to be configurable (daryn) (daryn: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525624) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSNamesystem.java Allow FSNamesystem lock fairness to be configurable --- Key: HDFS-5239 URL: https://issues.apache.org/jira/browse/HDFS-5239 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Fix For: 3.0.0, 2.3.0 Attachments: HDFS-5239.patch The fairness of the {{FSNamesystem#fsLock}} is hardcoded to {{true}}. Using an unfair lock provides a negligible increase to throughput. However this is due to bottlenecks elsewhere in the system. In combination with other changes, such as RPC layer and audit logging, preliminary tests show up to a 5X improvement for a read heavy workload. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
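[Editor's note] The change above turns the `ReentrantReadWriteLock` fairness flag from a hardcoded `true` into a configuration knob. A minimal sketch of that idea is below; the factory-method name is illustrative (the real patch wires the flag through FSNamesystem and a DFSConfigKeys entry).

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: lock fairness becomes a boolean read from configuration rather than
// a hardcoded constant. An unfair lock trades strict FIFO handoff for higher
// throughput under contention.
public class ConfigurableFairness {
    static ReentrantReadWriteLock createFsLock(boolean fair) {
        return new ReentrantReadWriteLock(fair);
    }

    public static boolean demo(boolean fair) {
        return createFsLock(fair).isFair();
    }

    public static void main(String[] args) {
        System.out.println(demo(true) + " " + demo(false));
    }
}
```

As the JIRA notes, the unfair lock alone buys little; the 5X read-workload gain only shows up combined with the RPC-layer and audit-logging changes.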
[jira] [Commented] (HDFS-5139) Remove redundant -R option from setrep
[ https://issues.apache.org/jira/browse/HDFS-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776284#comment-13776284 ] Hudson commented on HDFS-5139: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1558 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1558/]) HDFS-5139. Remove redundant -R option from setrep (update CHANGES.txt). (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525665) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt HDFS-5139. Remove redundant -R option from setrep. (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525659) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/SetReplication.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/FileSystemShell.apt.vm * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/resources/testConf.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShell.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml Remove redundant -R option from setrep -- Key: HDFS-5139 URL: https://issues.apache.org/jira/browse/HDFS-5139 Project: Hadoop HDFS Issue Type: Improvement Components: tools Affects Versions: 3.0.0, 1.3.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Fix For: 2.2.0 Attachments: HDFS-5139.01.patch, HDFS-5139.02.patch, HDFS-5139.03.patch, HDFS-5139.04.patch The -R option to setrep is redundant because it is required for directory targets and ignored for file targets. We can just remove the option and make -R the default for directories. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5240) Separate formatting from logging in the audit logger API
[ https://issues.apache.org/jira/browse/HDFS-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776286#comment-13776286 ] Hudson commented on HDFS-5240: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1558 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1558/]) HDFS-5240. Separate formatting from logging in the audit logger API (daryn) (daryn: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525626) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java Separate formatting from logging in the audit logger API Key: HDFS-5240 URL: https://issues.apache.org/jira/browse/HDFS-5240 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Fix For: 3.0.0, 2.3.0 Attachments: HDFS-5240.patch The audit logger API should be extended (in a compatible manner) to separate the formatting of the log message from the actual logging of the message. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5239) Allow FSNamesystem lock fairness to be configurable
[ https://issues.apache.org/jira/browse/HDFS-5239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776294#comment-13776294 ] Hudson commented on HDFS-5239: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1532 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1532/]) HDFS-5239. Allow FSNamesystem lock fairness to be configurable (daryn) (daryn: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525624) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSNamesystem.java Allow FSNamesystem lock fairness to be configurable --- Key: HDFS-5239 URL: https://issues.apache.org/jira/browse/HDFS-5239 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Fix For: 3.0.0, 2.3.0 Attachments: HDFS-5239.patch The fairness of the {{FSNamesystem#fsLock}} is hardcoded to {{true}}. Using an unfair lock provides a negligible increase to throughput. However this is due to bottlenecks elsewhere in the system. In combination with other changes, such as RPC layer and audit logging, preliminary tests show up to a 5X improvement for a read heavy workload. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5228) The RemoteIterator returned by DistributedFileSystem.listFiles(..) may throw NPE
[ https://issues.apache.org/jira/browse/HDFS-5228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776295#comment-13776295 ] Hudson commented on HDFS-5228: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1532 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1532/]) HDFS-5228. The RemoteIterator returned by DistributedFileSystem.listFiles may throw NullPointerException. Contributed by szetszwo and cnauroth (szetszwo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525828) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java The RemoteIterator returned by DistributedFileSystem.listFiles(..) may throw NPE Key: HDFS-5228 URL: https://issues.apache.org/jira/browse/HDFS-5228 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.1.0-beta Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Blocker Fix For: 2.1.1-beta Attachments: h5228_20130919.patch, h5228_20130919_test.patch, HDFS-5228.2.patch Get a RemoteIterator from DistributedFileSystem.listFiles(..) with a relative path. Then, it will result in a NullPointerException when calling hasNext() from the RemoteIterator. This bug was discovered by Arnaud: http://hortonworks.com/community/forums/topic/new-bug-in-hdfs-listfiles-method/ -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5240) Separate formatting from logging in the audit logger API
[ https://issues.apache.org/jira/browse/HDFS-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776298#comment-13776298 ] Hudson commented on HDFS-5240: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1532 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1532/]) HDFS-5240. Separate formatting from logging in the audit logger API (daryn) (daryn: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525626) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java Separate formatting from logging in the audit logger API Key: HDFS-5240 URL: https://issues.apache.org/jira/browse/HDFS-5240 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Fix For: 3.0.0, 2.3.0 Attachments: HDFS-5240.patch The audit logger API should be extended (in a compatible manner) to separate the formatting of the log message from the actual logging of the message. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5139) Remove redundant -R option from setrep
[ https://issues.apache.org/jira/browse/HDFS-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776296#comment-13776296 ] Hudson commented on HDFS-5139: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1532 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1532/]) HDFS-5139. Remove redundant -R option from setrep (update CHANGES.txt). (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525665) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt HDFS-5139. Remove redundant -R option from setrep. (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525659) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/SetReplication.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/FileSystemShell.apt.vm * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/resources/testConf.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShell.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml Remove redundant -R option from setrep -- Key: HDFS-5139 URL: https://issues.apache.org/jira/browse/HDFS-5139 Project: Hadoop HDFS Issue Type: Improvement Components: tools Affects Versions: 3.0.0, 1.3.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Fix For: 2.2.0 Attachments: HDFS-5139.01.patch, HDFS-5139.02.patch, HDFS-5139.03.patch, HDFS-5139.04.patch The -R option to setrep is redundant because it is required for directory targets and ignored for file targets. We can just remove the option and make -R the default for directories. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5249) Fix dumper thread which may die silently
[ https://issues.apache.org/jira/browse/HDFS-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776293#comment-13776293 ] Hudson commented on HDFS-5249: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1532 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1532/]) HDFS-5249. Fix dumper thread which may die silently. Contributed by Brandon Li (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525770) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Fix dumper thread which may die silently Key: HDFS-5249 URL: https://issues.apache.org/jira/browse/HDFS-5249 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li Assignee: Brandon Li Fix For: 2.1.1-beta Attachments: HDFS-5249.02.patch, HDFS-5249.patch Dumper thread can get an NPE when the WriteCtx it's about to work on is just deleted by write back thread. A dead dumper thread could cause out-of-memory error when too many pending writes accumulated for one opened file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
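[Editor's note] The failure mode above, a background thread killed by one unexpected NPE so pending work piles up, is usually addressed by catching `Throwable` inside the work loop. A minimal sketch under that assumption follows; it simulates the concurrently-deleted WriteCtx with a `null` entry and is not the actual OpenFileCtx code.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: a dump loop that survives a single bad work item instead of dying
// silently and leaving the remaining pending writes in memory forever.
public class ResilientDumper {
    static final AtomicInteger dumped = new AtomicInteger();

    // One dump step; a null ctx simulates a WriteCtx deleted by the
    // write-back thread between enqueue and dump.
    static void dumpOne(Object ctx) {
        if (ctx == null) throw new NullPointerException("WriteCtx was deleted");
        dumped.incrementAndGet();
    }

    public static int demo() {
        dumped.set(0);
        Object[] pending = {new Object(), null, new Object()};
        for (Object ctx : pending) {
            try {
                dumpOne(ctx);
            } catch (Throwable t) {
                // Log and continue; the thread must outlive one bad item.
                System.err.println("dump failed, skipping: " + t);
            }
        }
        return dumped.get();
    }

    public static void main(String[] args) { System.out.println(demo()); }
}
```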
[jira] [Commented] (HDFS-4971) Move IO operations out of locking in OpenFileCtx
[ https://issues.apache.org/jira/browse/HDFS-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776297#comment-13776297 ] Hudson commented on HDFS-4971: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1532 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1532/]) HDFS-4971. Move IO operations out of locking in OpenFileCtx. Contributed by Jing Zhao and Brandon Li. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525681) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/AsyncDataService.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OffsetRange.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/WriteCtx.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/WriteManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestOffsetRange.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Move IO operations out of locking in OpenFileCtx Key: HDFS-4971 URL: https://issues.apache.org/jira/browse/HDFS-4971 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Jing Zhao Assignee: Jing Zhao Fix For: 2.1.1-beta Attachments: HDFS-4971.000.patch, HDFS-4971.001.patch, HDFS-4971.002.patch, HDFS-4971.003.patch, HDFS-4971.004.patch, HDFS-4971.005.patch, HDFS-4971.006.patch, HDFS-4971.007.patch Currently some IO operations (such as writing data to HDFS and dumping to local disk) in OpenFileCtx may hold a lock which can block processing incoming writing requests. This jira aims to optimize OpenFileCtx and move the IO operations out of the locking.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5251) Race between the initialization of NameNode and the http server
[ https://issues.apache.org/jira/browse/HDFS-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776299#comment-13776299 ] Hudson commented on HDFS-5251: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1532 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1532/]) HDFS-5251. Race between the initialization of NameNode and the http server. Contributed by Haohui Mai. (suresh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525787) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeJspHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeJspHelper.java Race between the initialization of NameNode and the http server --- Key: HDFS-5251 URL: https://issues.apache.org/jira/browse/HDFS-5251 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.2.0 Attachments: HDFS-5251.000.patch The constructor of NameNode starts a HTTP server before the FSNameSystem is initialized. Currently there is a race where the HTTP server can access the uninitialized namesystem variable, throwing a NullPointerException. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5222) Move block schedule information from DatanodeDescriptor to DatanodeStorageInfo
[ https://issues.apache.org/jira/browse/HDFS-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776291#comment-13776291 ] Junping Du commented on HDFS-5222: -- Thanks Nicholas for addressing these comments. The new patch looks good to me except a tiny bug in the original code that you are moving:
{code}
+  /** Adjusts curr and prev number of blocks scheduled every few minutes. */
+  private void rollBlocksScheduled(long now) {
+    if (now - lastBlocksScheduledRollTime > BLOCKS_SCHEDULED_ROLL_INTERVAL) {
+      prevApproxBlocksScheduled = currApproxBlocksScheduled;
+      currApproxBlocksScheduled = 0;
+      lastBlocksScheduledRollTime = now;
+    }
+  }
{code}
It should be {{prevApproxBlocksScheduled += currApproxBlocksScheduled;}}, shouldn't it? +1 once this comment is addressed. :) Move block schedule information from DatanodeDescriptor to DatanodeStorageInfo -- Key: HDFS-5222 URL: https://issues.apache.org/jira/browse/HDFS-5222 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h5222_20130819.patch, h5222_20130824.patch In HDFS-4990, the block placement target type was changed from DatanodeDescriptor to DatanodeStorageInfo. The block schedule information, such as the number of blocks scheduled for replication (i.e. getBlocksScheduled()), should be moved from DatanodeDescriptor to DatanodeStorageInfo. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4989) Balancer needs to consider storage type in balancing
[ https://issues.apache.org/jira/browse/HDFS-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776307#comment-13776307 ] Junping Du commented on HDFS-4989: -- Hi [~szetszwo] and [~arpitagarwal], which way do you prefer here: moving DatanodeStorageInfo to o.a.h.hdfs.protocol, or creating something brand new to represent the important info of DatanodeStorageInfo (like DatanodeInfo for DatanodeDescriptor)? Balancer needs to consider storage type in balancing Key: HDFS-4989 URL: https://issues.apache.org/jira/browse/HDFS-4989 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer, datanode, namenode Reporter: Suresh Srinivas Assignee: Junping Du Balancer needs to balance within a storage tier. Also needs an option to balance only a specific storage type. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4989) Balancer needs to consider storage type in balancing
[ https://issues.apache.org/jira/browse/HDFS-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-4989: - Description: Balancer needs to balance within a storage tier. A couple of work items: - add an option of storageType in balancer CLI to balance data on a specific storage type. - add storageType to BalancingPolicy - need new API (getDatanodeReport(StorageType), paired with the existing getDatanodeReport()) in ClientProtocol to get storage related info from NN so that balancer can choose storages as a source, target balancing pair. - update chooseDatanodes() algorithm to choose storages on the same DN first when setting up under-utilized, over-loaded pairs. was: Balancer needs to balance within a storage tier. Also needs an option to balance only a specific storage type. Balancer needs to consider storage type in balancing Key: HDFS-4989 URL: https://issues.apache.org/jira/browse/HDFS-4989 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer, datanode, namenode Reporter: Suresh Srinivas Assignee: Junping Du Balancer needs to balance within a storage tier. A couple of work items: - add an option of storageType in balancer CLI to balance data on a specific storage type. - add storageType to BalancingPolicy - need new API (getDatanodeReport(StorageType), paired with the existing getDatanodeReport()) in ClientProtocol to get storage related info from NN so that balancer can choose storages as a source, target balancing pair. - update chooseDatanodes() algorithm to choose storages on the same DN first when setting up under-utilized, over-loaded pairs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5222) Move block schedule information from DatanodeDescriptor to DatanodeStorageInfo
[ https://issues.apache.org/jira/browse/HDFS-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776341#comment-13776341 ] Tsz Wo (Nicholas), SZE commented on HDFS-5222: -- I believe {{prevApproxBlocksScheduled = currApproxBlocksScheduled;}} is intentional but not a bug. That's why it calls rollBlocksScheduled. Move block schedule information from DatanodeDescriptor to DatanodeStorageInfo -- Key: HDFS-5222 URL: https://issues.apache.org/jira/browse/HDFS-5222 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h5222_20130819.patch, h5222_20130824.patch In HDFS-4990, the block placement target type was changed from DatanodeDescriptor to DatanodeStorageInfo. The block schedule information, such as the number of blocks scheduled for replication (i.e. getBlocksScheduled()), should be moved from DatanodeDescriptor to DatanodeStorageInfo. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5222) Move block schedule information from DatanodeDescriptor to DatanodeStorageInfo
[ https://issues.apache.org/jira/browse/HDFS-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776380#comment-13776380 ] Junping Du commented on HDFS-5222: -- Nicholas, thanks for your reminder. I didn't think too much about the roll before, as I was thinking that moving the count from currApproxBlocksScheduled to (and adding it onto) prevApproxBlocksScheduled is also a roll. However, I get some hints from "Approx", which means not precise, as error cases can make these values slightly inconsistent. Is that the reason why we periodically clean up prevApproxBlocksScheduled (and the main purpose for separating the counters into prev and curr)? If so, that makes sense to me. Move block schedule information from DatanodeDescriptor to DatanodeStorageInfo -- Key: HDFS-5222 URL: https://issues.apache.org/jira/browse/HDFS-5222 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h5222_20130819.patch, h5222_20130824.patch In HDFS-4990, the block placement target type was changed from DatanodeDescriptor to DatanodeStorageInfo. The block schedule information, such as the number of blocks scheduled for replication (i.e. getBlocksScheduled()), should be moved from DatanodeDescriptor to DatanodeStorageInfo. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
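[Editor's note] The point being debated above is that the roll deliberately *overwrites* prev with curr rather than accumulating, so scheduled counts that were never confirmed (e.g. blocks sent to a DataNode that then failed) age out after roughly two intervals. A self-contained sketch of that rolling approximate counter, with illustrative field names mirroring the quoted snippet, is below.

```java
// Sketch of the prev/curr rolling approximate counter discussed in HDFS-5222.
public class BlocksScheduledRoll {
    static final long ROLL_INTERVAL = 600_000; // assumed ~10 min, as in the quoted code
    long prev, curr, lastRollTime;

    void increment() { curr++; }                 // a block was scheduled

    void decrement() {                           // a scheduled block was confirmed
        if (curr > 0) curr--;
        else if (prev > 0) prev--;               // otherwise drop silently: approximate
    }

    long approxBlocksScheduled() { return prev + curr; }

    void roll(long now) {
        if (now - lastRollTime > ROLL_INTERVAL) {
            prev = curr;      // overwrite, do NOT accumulate: stale counts decay
            curr = 0;
            lastRollTime = now;
        }
    }

    public static long demo() {
        BlocksScheduledRoll c = new BlocksScheduledRoll();
        for (int i = 0; i < 5; i++) c.increment();  // 5 blocks scheduled
        c.decrement();                              // only 1 ever confirmed
        c.roll(ROLL_INTERVAL + 1);                  // roll 1: prev=4, curr=0
        c.roll(2 * ROLL_INTERVAL + 2);              // roll 2: the stale 4 ages out
        return c.approxBlocksScheduled();
    }

    public static void main(String[] args) { System.out.println(demo()); }
}
```

With `+=` instead of `=`, the four unconfirmed blocks would be counted against the node forever, which is exactly the leak the overwrite avoids.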
[jira] [Commented] (HDFS-5246) Hadoop nfs server binds to port 2049 which is the same as Linux nfs server
[ https://issues.apache.org/jira/browse/HDFS-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776521#comment-13776521 ] Jinghui Wang commented on HDFS-5246: Sure. I can address the issue for RpcProgramMountd in this JIRA since the constructor of that class already takes in a Configuration object, so the change should be minimal. Hadoop nfs server binds to port 2049 which is the same as Linux nfs server -- Key: HDFS-5246 URL: https://issues.apache.org/jira/browse/HDFS-5246 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Affects Versions: 2.1.0-beta Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM Java 1.6 Reporter: Jinghui Wang Fix For: 3.0.0, 2.1.1-beta Attachments: HDFS-5246-2.patch, HDFS-5246.patch Hadoop nfs binds the nfs server to port 2049, which is also the default port that Linux nfs uses. If Linux nfs is already running on the machine then Hadoop nfs will not be able to start. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-5191) revisit zero-copy API in FSDataInputStream to make it more intuitive
[ https://issues.apache.org/jira/browse/HDFS-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-5191: --- Attachment: HDFS-5191-caching.009.patch revisit zero-copy API in FSDataInputStream to make it more intuitive Key: HDFS-5191 URL: https://issues.apache.org/jira/browse/HDFS-5191 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, libhdfs Affects Versions: HDFS-4949 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5191-caching.001.patch, HDFS-5191-caching.003.patch, HDFS-5191-caching.006.patch, HDFS-5191-caching.007.patch, HDFS-5191-caching.008.patch, HDFS-5191-caching.009.patch As per the discussion on HDFS-4953, we should revisit the zero-copy API to make it more intuitive for new users. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens
[ https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776547#comment-13776547 ] Colin Patrick McCabe commented on HDFS-5208:
{code}
-      dnsToSwitchMapping.reloadCachedMappings();
+      List<String> invalidNodeNames = new ArrayList<String>(1);
+      // clear cache for nodes in IP or Hostname
+      invalidNodeNames.add(nodeReg.getIpAddr());
+      invalidNodeNames.add(nodeReg.getHostName());
+      dnsToSwitchMapping.reloadCachedMappings(invalidNodeNames);
{code}
Can we also add something like this?
{code}
+      invalidNodeNames.add(nodeReg.getPeerHostName());
{code}
It seems like the datanode could be known by any one of those three: IP address, registration name, or hostname. After that change, I don't see any reason why this shouldn't work. What kind of testing have you done? Only clear network location cache on specific nodes if invalid NetworkTopology happens -- Key: HDFS-5208 URL: https://issues.apache.org/jira/browse/HDFS-5208 Project: Hadoop HDFS Issue Type: Improvement Reporter: Junping Du Assignee: Junping Du Attachments: HDFS-5208-v1.patch After HDFS-4521, once a DN is registered with invalid networktopology, all cached rack info in DNSToSwitchMapping will be cleared. We should only clear cache on specific nodes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
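[Editor's note] The review above is about evicting only the names the offending DataNode might be cached under, rather than wiping the whole rack-mapping cache. A minimal sketch of that selective invalidation is below; the class name and map layout are illustrative, not the actual CachedDNSToSwitchMapping implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: invalidate only the given node names (IP, hostname, peer hostname)
// while unrelated nodes keep their cached rack mappings.
public class TopologyCache {
    private final Map<String, String> nodeToRack = new ConcurrentHashMap<>();

    void put(String node, String rack) { nodeToRack.put(node, rack); }

    // Mirrors the reviewed reloadCachedMappings(List<String>) signature.
    void reloadCachedMappings(List<String> invalidNodeNames) {
        for (String name : invalidNodeNames) {
            nodeToRack.remove(name);
        }
    }

    public static int demo() {
        TopologyCache cache = new TopologyCache();
        cache.put("10.0.0.1", "/rack1");
        cache.put("dn1.example.com", "/rack1");
        cache.put("10.0.0.2", "/rack2");  // unrelated node keeps its mapping
        cache.reloadCachedMappings(Arrays.asList("10.0.0.1", "dn1.example.com"));
        return cache.nodeToRack.size();
    }

    public static void main(String[] args) { System.out.println(demo()); }
}
```

Evicted names are simply re-resolved on the next lookup, so over-evicting a name the node is not actually known by is harmless, which is why adding `getPeerHostName()` is a safe belt-and-suspenders move.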
[jira] [Commented] (HDFS-5191) revisit zero-copy API in FSDataInputStream to make it more intuitive
[ https://issues.apache.org/jira/browse/HDFS-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776536#comment-13776536 ] Colin Patrick McCabe commented on HDFS-5191: bq. [import-only changes in DFSUtil.java] fixed bq. ...exception message would have an extraneous space fixed bq. Should we use GenericTestUtils#waitFor, so that there is a timeout? fixed to use waitFor here. even if there is an overall test timeout, it's nicer to use GenericTestUtils#waitFor. I also added a comment to IdentityHashStore about the 0.50 load factor, and got rid of the extra slot, so hash table size is now always a power of 2. revisit zero-copy API in FSDataInputStream to make it more intuitive Key: HDFS-5191 URL: https://issues.apache.org/jira/browse/HDFS-5191 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, libhdfs Affects Versions: HDFS-4949 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5191-caching.001.patch, HDFS-5191-caching.003.patch, HDFS-5191-caching.006.patch, HDFS-5191-caching.007.patch, HDFS-5191-caching.008.patch, HDFS-5191-caching.009.patch As per the discussion on HDFS-4953, we should revisit the zero-copy API to make it more intuitive for new users. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5191) revisit zero-copy API in FSDataInputStream to make it more intuitive
[ https://issues.apache.org/jira/browse/HDFS-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776560#comment-13776560 ] Chris Nauroth commented on HDFS-5191: - Great, thank you for incorporating the feedback. bq. even if there is an overall test timeout, it's nicer to use GenericTestUtils#waitFor. Agreed, it helps narrow the problem down to a specific point in the test code. Thanks for making that change. libhdfs looks good too. Just a couple of minor things: {code} * You must free all options structures allocated with this function using * readZeroOptionsFree. {code} Change readZeroOptionsFree to hadoopRzOptionsFree. {code} * This buffer will continue to be valid and readable * until it is released by readZeroBufferFree. Failure to {code} Change readZeroBufferFree to hadoopRzBufferFree. {code} hadoopRzOptionsClearCached(env, opts); opts->cachedEnumSet = NULL; {code} Setting {{cachedEnumSet}} to {{NULL}} isn't required here, because {{hadoopRzOptionsClearCached}} already does it. I'll be +1 for the whole patch after this. I'm also interested in getting the zero-copy API and client-side mmap'ing changes merged to trunk and branch-2 (but not the additional datanode caching/mlock'ing that's in the HDFS-4949 branch). This will make it easier for clients like ORC to start coding against the APIs. If necessary, I can volunteer to put together a merge patch with just the zero-copy changes. 
revisit zero-copy API in FSDataInputStream to make it more intuitive Key: HDFS-5191 URL: https://issues.apache.org/jira/browse/HDFS-5191 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, libhdfs Affects Versions: HDFS-4949 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5191-caching.001.patch, HDFS-5191-caching.003.patch, HDFS-5191-caching.006.patch, HDFS-5191-caching.007.patch, HDFS-5191-caching.008.patch, HDFS-5191-caching.009.patch As per the discussion on HDFS-4953, we should revisit the zero-copy API to make it more intuitive for new users. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5237) Get rid of nodes' registration names
[ https://issues.apache.org/jira/browse/HDFS-5237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776568#comment-13776568 ] Colin Patrick McCabe commented on HDFS-5237: I'm not so sure that we need to remove this, now. If it's heavily used in unit tests, and also used by end-users in certain configurations, it's going to be an uphill battle. Validating it *sounds* like a good idea, but in practice it's going to add up to failing DN registrations if the reported hostname can't be resolved, which doesn't seem that friendly. DNS is sometimes unreliable like any other network service, after all. Why don't we add some JavaDoc to DatanodeID explaining the reasons why hostName may in fact not be a real host name? I also wonder why we need {{DatanodeID#peerHostName}}. It doesn't seem to be used anywhere, and it definitely muddies the waters in that class. Get rid of nodes' registration names Key: HDFS-5237 URL: https://issues.apache.org/jira/browse/HDFS-5237 Project: Hadoop HDFS Issue Type: Bug Reporter: Junping Du Fix For: 3.0.0 Attachments: HDFS-5237.patch Per discussion in HDFS-5208 and maybe some other discussions before, a node's registration name is pretty confusing and shouldn't be used in a production environment, as it causes topology resolution issues. So we remove the related configuration dfs.datanode.hostname or its old name slave.host.name. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5246) Hadoop nfs server binds to port 2049 which is the same as Linux nfs server
[ https://issues.apache.org/jira/browse/HDFS-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776596#comment-13776596 ] Jinghui Wang commented on HDFS-5246: HDFS-5246-3.patch attached. Making RpcProgramMountd port number configurable. Hadoop nfs server binds to port 2049 which is the same as Linux nfs server -- Key: HDFS-5246 URL: https://issues.apache.org/jira/browse/HDFS-5246 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Affects Versions: 2.1.0-beta Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM Java 1.6 Reporter: Jinghui Wang Fix For: 3.0.0, 2.1.1-beta Attachments: HDFS-5246-2.patch, HDFS-5246-3.patch, HDFS-5246.patch Hadoop nfs binds the nfs server to port 2049, which is also the default port that Linux nfs uses. If Linux nfs is already running on the machine then Hadoop nfs will not be able to start. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
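Making a listener port configurable, as the patch above does for RpcProgramMountd, usually amounts to reading the port from configuration with the old hard-coded value as the default. A generic sketch (the property name `nfs.server.port` is hypothetical, not the key used by the actual patch):

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.Properties;

public class PortConfigDemo {
    // The old hard-coded port, which clashes with the Linux kernel nfsd.
    static final int DEFAULT_NFS_PORT = 2049;

    // Read the port from configuration, falling back to the old default
    // so existing deployments keep working unchanged.
    static int configuredPort(Properties conf) {
        return Integer.parseInt(
                conf.getProperty("nfs.server.port",
                                 String.valueOf(DEFAULT_NFS_PORT)));
    }

    public static void main(String[] args) throws IOException {
        Properties conf = new Properties();
        conf.setProperty("nfs.server.port", "12049"); // avoid the native nfsd
        try (ServerSocket socket = new ServerSocket(configuredPort(conf))) {
            System.out.println("bound to " + socket.getLocalPort());
        }
    }
}
```

With such a key, a cluster whose machines already run the native Linux NFS server can simply point the Hadoop NFS gateway at an unused port instead of failing to bind.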
[jira] [Updated] (HDFS-5246) Hadoop nfs server binds to port 2049 which is the same as Linux nfs server
[ https://issues.apache.org/jira/browse/HDFS-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinghui Wang updated HDFS-5246: --- Attachment: HDFS-5246-3.patch Hadoop nfs server binds to port 2049 which is the same as Linux nfs server -- Key: HDFS-5246 URL: https://issues.apache.org/jira/browse/HDFS-5246 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Affects Versions: 2.1.0-beta Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM Java 1.6 Reporter: Jinghui Wang Fix For: 3.0.0, 2.1.1-beta Attachments: HDFS-5246-2.patch, HDFS-5246-3.patch, HDFS-5246.patch Hadoop nfs binds the nfs server to port 2049, which is also the default port that Linux nfs uses. If Linux nfs is already running on the machine then Hadoop nfs will not be able to start. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5191) revisit zero-copy API in FSDataInputStream to make it more intuitive
[ https://issues.apache.org/jira/browse/HDFS-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776619#comment-13776619 ] Chris Nauroth commented on HDFS-5191: - I forgot one more thing. There is still an {{UnsupportedOperationException}} thrown in the fallback fallback case. The reason is that in {{ByteBufferUtil#fallbackRead}}, the {{stream}} argument can be an {{FSDataInputStream}}. This class implements {{ByteBufferReadable}}, so the calculation of {{useDirect}} is always true, even if the underlying stream inside the {{FSDataInputStream}} doesn't support it. Here is a potential change that fixes it. With this in place, we fall through to the array-copying code path, and I don't see the {{UnsupportedOperationException}}. Colin, do you want to incorporate this into the patch (or something like it)? {code} final boolean useDirect; if (stream instanceof FSDataInputStream) { FSDataInputStream fsdis = (FSDataInputStream)stream; useDirect = fsdis.getWrappedStream() instanceof ByteBufferReadable; } else { useDirect = stream instanceof ByteBufferReadable; } {code} revisit zero-copy API in FSDataInputStream to make it more intuitive Key: HDFS-5191 URL: https://issues.apache.org/jira/browse/HDFS-5191 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, libhdfs Affects Versions: HDFS-4949 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5191-caching.001.patch, HDFS-5191-caching.003.patch, HDFS-5191-caching.006.patch, HDFS-5191-caching.007.patch, HDFS-5191-caching.008.patch, HDFS-5191-caching.009.patch As per the discussion on HDFS-4953, we should revisit the zero-copy API to make it more intuitive for new users. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
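The pitfall Chris describes generalizes to any decorator stream: a capability interface implemented by the wrapper says nothing about whether the wrapped stream actually supports the operation, so the check must look at the innermost stream. A minimal self-contained sketch of the same pattern (hypothetical class names, not the Hadoop ones):

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;

// Marker interface standing in for ByteBufferReadable.
interface ByteBufferCapable {}

// A wrapper that, like FSDataInputStream, implements the capability
// interface but merely delegates to whatever stream it wraps.
class WrappedStream extends FilterInputStream implements ByteBufferCapable {
    WrappedStream(InputStream in) { super(in); }
    InputStream getWrappedStream() { return in; }
}

public class UnwrapDemo {
    // Decide the capability from the innermost stream, not the wrapper;
    // otherwise the wrapper's interface makes the check trivially true.
    static boolean supportsDirect(InputStream stream) {
        if (stream instanceof WrappedStream) {
            return ((WrappedStream) stream).getWrappedStream()
                    instanceof ByteBufferCapable;
        }
        return stream instanceof ByteBufferCapable;
    }

    public static void main(String[] args) {
        InputStream plain = new ByteArrayInputStream(new byte[0]);
        InputStream wrapped = new WrappedStream(plain);
        // The wrapper implements ByteBufferCapable, but its inner stream
        // does not, so the direct path must not be taken.
        System.out.println(supportsDirect(wrapped)); // prints false
    }
}
```

Without the unwrapping step, the naive `stream instanceof ByteBufferCapable` test would report true for every wrapped stream, which is exactly how the spurious `UnsupportedOperationException` arose.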
[jira] [Commented] (HDFS-5246) Hadoop nfs server binds to port 2049 which is the same as Linux nfs server
[ https://issues.apache.org/jira/browse/HDFS-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776641#comment-13776641 ] Hadoop QA commented on HDFS-5246: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12604852/HDFS-5246-3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5027//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5027//console This message is automatically generated. 
Hadoop nfs server binds to port 2049 which is the same as Linux nfs server -- Key: HDFS-5246 URL: https://issues.apache.org/jira/browse/HDFS-5246 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Affects Versions: 2.1.0-beta Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM Java 1.6 Reporter: Jinghui Wang Fix For: 3.0.0, 2.1.1-beta Attachments: HDFS-5246-2.patch, HDFS-5246-3.patch, HDFS-5246.patch Hadoop nfs binds the nfs server to port 2049, which is also the default port that Linux nfs uses. If Linux nfs is already running on the machine then Hadoop nfs will not be able to start. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5191) revisit zero-copy API in FSDataInputStream to make it more intuitive
[ https://issues.apache.org/jira/browse/HDFS-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776686#comment-13776686 ] Colin Patrick McCabe commented on HDFS-5191: bq. Change readZeroOptionsFree to hadoopRzOptionsFree. fixed this and others bq. Setting cachedEnumSet to NULL isn't required here, because hadoopRzOptionsClearCached already does it. removed bq. There is still an UnsupportedOperationException thrown in the fallback fallback case... Fixed, based on your suggestion. Thanks. revisit zero-copy API in FSDataInputStream to make it more intuitive Key: HDFS-5191 URL: https://issues.apache.org/jira/browse/HDFS-5191 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, libhdfs Affects Versions: HDFS-4949 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5191-caching.001.patch, HDFS-5191-caching.003.patch, HDFS-5191-caching.006.patch, HDFS-5191-caching.007.patch, HDFS-5191-caching.008.patch, HDFS-5191-caching.009.patch, HDFS-5191-caching.010.patch As per the discussion on HDFS-4953, we should revisit the zero-copy API to make it more intuitive for new users. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-5191) revisit zero-copy API in FSDataInputStream to make it more intuitive
[ https://issues.apache.org/jira/browse/HDFS-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-5191: --- Attachment: HDFS-5191-caching.010.patch revisit zero-copy API in FSDataInputStream to make it more intuitive Key: HDFS-5191 URL: https://issues.apache.org/jira/browse/HDFS-5191 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, libhdfs Affects Versions: HDFS-4949 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5191-caching.001.patch, HDFS-5191-caching.003.patch, HDFS-5191-caching.006.patch, HDFS-5191-caching.007.patch, HDFS-5191-caching.008.patch, HDFS-5191-caching.009.patch, HDFS-5191-caching.010.patch As per the discussion on HDFS-4953, we should revisit the zero-copy API to make it more intuitive for new users. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-5167) Add metrics about the NameNode retry cache
[ https://issues.apache.org/jira/browse/HDFS-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-5167: Assignee: Tsuyoshi OZAWA Add metrics about the NameNode retry cache -- Key: HDFS-5167 URL: https://issues.apache.org/jira/browse/HDFS-5167 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, namenode Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Tsuyoshi OZAWA Priority: Minor Attachments: HDFS-5167.1.patch, HDFS-5167.2.patch, HDFS-5167.3.patch, HDFS-5167.4.patch, HDFS-5167.5.patch It will be helpful to have metrics in NameNode about the retry cache, such as the retry count etc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5250) Enable one url to browse the entire federated hdfs file system on dfsclusterhealth summary page
[ https://issues.apache.org/jira/browse/HDFS-5250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776694#comment-13776694 ] Lohit Vijayarenu commented on HDFS-5250: Agreed that ViewFileSystem is a client-side mount, but that was one way of getting this solved. If that is really ugly, is there any other clean way to get this fixed? Many people use the browse-filesystem page, and it would be good to know if someone has thought about this in a cleaner way. Have you thought about any higher-level projects, like Ambari, implementing this in the future? Enable one url to browse the entire federated hdfs file system on dfsclusterhealth summary page --- Key: HDFS-5250 URL: https://issues.apache.org/jira/browse/HDFS-5250 Project: Hadoop HDFS Issue Type: New Feature Reporter: Vrushali C Ideally, we should have one url to browse the entire federated file system on the main dfsclusterhealth summary page along with the list of namenodes in the cluster -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-5191) revisit zero-copy API in FSDataInputStream to make it more intuitive
[ https://issues.apache.org/jira/browse/HDFS-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5191: Hadoop Flags: Reviewed +1 for version 10 of the patch. Colin, thank you very much for addressing all of the feedback. revisit zero-copy API in FSDataInputStream to make it more intuitive Key: HDFS-5191 URL: https://issues.apache.org/jira/browse/HDFS-5191 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, libhdfs Affects Versions: HDFS-4949 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5191-caching.001.patch, HDFS-5191-caching.003.patch, HDFS-5191-caching.006.patch, HDFS-5191-caching.007.patch, HDFS-5191-caching.008.patch, HDFS-5191-caching.009.patch, HDFS-5191-caching.010.patch As per the discussion on HDFS-4953, we should revisit the zero-copy API to make it more intuitive for new users. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-5255) Distcp job fails with hsftp when https is enabled
Yesha Vora created HDFS-5255: Summary: Distcp job fails with hsftp when https is enabled Key: HDFS-5255 URL: https://issues.apache.org/jira/browse/HDFS-5255 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Yesha Vora Run Distcp job using hsftp when ssl is enabled. The job fails with java.net.SocketException: Unexpected end of file from server Error Running: hadoop distcp hsftp://localhost:50070/f1 hdfs://localhost:19000/f5 All the tasks fail with the below error. 13/09/23 15:52:38 INFO mapreduce.Job: Task Id : attempt_1379976241507_0004_m_00_0, Status : FAILED Error: java.io.IOException: File copy failed: hsftp://localhost:50070/f1 --> hdfs://localhost:19000/f5 at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:262) at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:229) at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:166) Caused by: java.io.IOException: Couldn't run retriable-command: Copying hsftp://127.0.0.1:50070/f1 to hdfs://localhost:19000/f5 at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101) at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:258) ... 
10 more Caused by: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.io.IOException: HTTP_OK expected, received 500 at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:233) at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:198) at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToTmpFile(RetriableFileCopyCommand.java:134) at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:101) at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:83) at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87) ... 11 more Caused by: java.io.IOException: HTTP_OK expected, received 500 at org.apache.hadoop.hdfs.HftpFileSystem$RangeHeaderUrlOpener.connect(HftpFileSystem.java:383) at org.apache.hadoop.hdfs.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:119) at org.apache.hadoop.hdfs.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:103) at org.apache.hadoop.hdfs.ByteRangeInputStream.read(ByteRangeInputStream.java:187) at java.io.DataInputStream.read(DataInputStream.java:149) at java.io.BufferedInputStream.read1(BufferedInputStream.java:273) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at java.io.FilterInputStream.read(FilterInputStream.java:107) at org.apache.hadoop.tools.util.ThrottledInputStream.read(ThrottledInputStream.java:75) at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:230) ... 16 more -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4817) make HDFS advisory caching configurable on a per-file basis
[ https://issues.apache.org/jira/browse/HDFS-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-4817: --- Attachment: HDFS-4817-b2.1.001.patch backport to branch-2.1-beta make HDFS advisory caching configurable on a per-file basis --- Key: HDFS-4817 URL: https://issues.apache.org/jira/browse/HDFS-4817 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.3.0 Attachments: HDFS-4817.001.patch, HDFS-4817.002.patch, HDFS-4817.004.patch, HDFS-4817.006.patch, HDFS-4817.007.patch, HDFS-4817.008.patch, HDFS-4817.009.patch, HDFS-4817.010.patch, HDFS-4817-b2.1.001.patch HADOOP-7753 and related JIRAs introduced some performance optimizations for the DataNode. One of them was readahead. When readahead is enabled, the DataNode starts reading the next bytes it thinks it will need in the block file, before the client requests them. This helps hide the latency of rotational media and send larger reads down to the device. Another optimization was drop-behind. Using this optimization, we could remove files from the Linux page cache after they were no longer needed. Using {{dfs.datanode.drop.cache.behind.writes}} and {{dfs.datanode.drop.cache.behind.reads}} can improve performance substantially on many MapReduce jobs. In our internal benchmarks, we have seen speedups of 40% on certain workloads. The reason is because if we know the block data will not be read again any time soon, keeping it out of memory allows more memory to be used by the other processes on the system. See HADOOP-7714 for more benchmarks. We would like to turn on these configurations on a per-file or per-client basis, rather than on the DataNode as a whole. This will allow more users to actually make use of them. It would also be good to add unit tests for the drop-cache code path, to ensure that it is functioning as we expect. 
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
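The per-file configurability described above amounts to a three-level lookup: an explicit per-file setting wins, otherwise the client or DataNode configuration applies, otherwise a built-in default. A generic sketch of that resolution order, using the real `dfs.datanode.drop.cache.behind.reads` key but otherwise hypothetical names (this is not the actual HDFS code):

```java
import java.util.HashMap;
import java.util.Map;

public class CachePolicyDemo {
    // Per-file setting: explicitly on, explicitly off, or not set.
    enum Tristate { TRUE, FALSE, UNSET }

    // Resolve a caching flag: per-file setting > global config > default.
    static boolean dropBehind(Tristate perFile, Map<String, String> conf,
                              String key, boolean builtinDefault) {
        if (perFile != Tristate.UNSET) {
            return perFile == Tristate.TRUE;
        }
        String v = conf.get(key);
        return v != null ? Boolean.parseBoolean(v) : builtinDefault;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("dfs.datanode.drop.cache.behind.reads", "true");

        // This file explicitly opts out, overriding the node-wide setting.
        System.out.println(dropBehind(Tristate.FALSE, conf,
                "dfs.datanode.drop.cache.behind.reads", false)); // prints false

        // This file leaves it unset, so the global configuration applies.
        System.out.println(dropBehind(Tristate.UNSET, conf,
                "dfs.datanode.drop.cache.behind.reads", false)); // prints true
    }
}
```

The benefit is the one argued in the issue description: workloads that know their access pattern (e.g. write-once scan jobs) can opt in per file, while latency-sensitive readers on the same DataNode are left alone.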
[jira] [Commented] (HDFS-4817) make HDFS advisory caching configurable on a per-file basis
[ https://issues.apache.org/jira/browse/HDFS-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776760#comment-13776760 ] Colin Patrick McCabe commented on HDFS-4817: by the way, this backport was clean.. the only merge conflict was in the hdfs-defaults.xml file. make HDFS advisory caching configurable on a per-file basis --- Key: HDFS-4817 URL: https://issues.apache.org/jira/browse/HDFS-4817 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.3.0 Attachments: HDFS-4817.001.patch, HDFS-4817.002.patch, HDFS-4817.004.patch, HDFS-4817.006.patch, HDFS-4817.007.patch, HDFS-4817.008.patch, HDFS-4817.009.patch, HDFS-4817.010.patch, HDFS-4817-b2.1.001.patch HADOOP-7753 and related JIRAs introduced some performance optimizations for the DataNode. One of them was readahead. When readahead is enabled, the DataNode starts reading the next bytes it thinks it will need in the block file, before the client requests them. This helps hide the latency of rotational media and send larger reads down to the device. Another optimization was drop-behind. Using this optimization, we could remove files from the Linux page cache after they were no longer needed. Using {{dfs.datanode.drop.cache.behind.writes}} and {{dfs.datanode.drop.cache.behind.reads}} can improve performance substantially on many MapReduce jobs. In our internal benchmarks, we have seen speedups of 40% on certain workloads. The reason is because if we know the block data will not be read again any time soon, keeping it out of memory allows more memory to be used by the other processes on the system. See HADOOP-7714 for more benchmarks. We would like to turn on these configurations on a per-file or per-client basis, rather than on the DataNode as a whole. This will allow more users to actually make use of them. 
It would also be good to add unit tests for the drop-cache code path, to ensure that it is functioning as we expect. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-5191) revisit zero-copy API in FSDataInputStream to make it more intuitive
[ https://issues.apache.org/jira/browse/HDFS-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe resolved HDFS-5191. Resolution: Fixed Fix Version/s: HDFS-4949 committed to HDFS-4949 branch. thanks all revisit zero-copy API in FSDataInputStream to make it more intuitive Key: HDFS-5191 URL: https://issues.apache.org/jira/browse/HDFS-5191 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, libhdfs Affects Versions: HDFS-4949 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: HDFS-4949 Attachments: HDFS-5191-caching.001.patch, HDFS-5191-caching.003.patch, HDFS-5191-caching.006.patch, HDFS-5191-caching.007.patch, HDFS-5191-caching.008.patch, HDFS-5191-caching.009.patch, HDFS-5191-caching.010.patch As per the discussion on HDFS-4953, we should revisit the zero-copy API to make it more intuitive for new users. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-5217) Namenode log directory link is inaccessible in secure cluster
[ https://issues.apache.org/jira/browse/HDFS-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-5217: Status: Patch Available (was: Open) Namenode log directory link is inaccessible in secure cluster - Key: HDFS-5217 URL: https://issues.apache.org/jira/browse/HDFS-5217 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Attachments: HDFS-5217.000.patch Currently in a secured HDFS cluster, 401 error is returned when clicking the NameNode Logs link. Looks like the cause of the issue is that the httpServer does not correctly set the security handler and the user realm currently, which causes the httpRequest.getRemoteUser (for the log URL) to return null and later be overwritten to the default web name (e.g., dr.who) by the filter. In the meanwhile, in a secured cluster the log URL requires the http user to be an administrator. That's why we see the 401 error. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-5225) datanode keeps logging the same 'is no longer in the dataset' message over and over again
[ https://issues.apache.org/jira/browse/HDFS-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5225: Priority: Blocker (was: Major) datanode keeps logging the same 'is no longer in the dataset' message over and over again - Key: HDFS-5225 URL: https://issues.apache.org/jira/browse/HDFS-5225 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.1.1-beta Reporter: Roman Shaposhnik Priority: Blocker Attachments: HDFS-5225.1.patch, HDFS-5225-reproduce.1.txt I was running the usual Bigtop testing on 2.1.1-beta RC1 with the following configuration: 4 nodes fully distributed cluster with security on. All of a sudden my DN ate up all of the space in /var/log logging the following message repeatedly: {noformat} 2013-09-18 20:51:12,046 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1369 is no longer in the dataset {noformat} It wouldn't answer to a jstack and jstack -F ended up being useless. Here's what I was able to find in the NameNode logs regarding this block ID: {noformat} fgrep -rI 'blk_1073742189' hadoop-hdfs-namenode-ip-10-224-158-152.log 2013-09-18 18:03:16,972 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /user/jenkins/testAppendInputWedSep18180222UTC2013/test4.fileWedSep18180222UTC2013._COPYING_. 
BP-1884637155-10.224.158.152-1379524544853 blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], ReplicaUnderConstruction[10.34.74.206:1004|RBW], ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} 2013-09-18 18:03:17,222 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.224.158.152:1004 is added to blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], ReplicaUnderConstruction[10.34.74.206:1004|RBW], ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} size 0 2013-09-18 18:03:17,222 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.34.74.206:1004 is added to blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], ReplicaUnderConstruction[10.34.74.206:1004|RBW], ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} size 0 2013-09-18 18:03:17,224 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.83.107.80:1004 is added to blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], ReplicaUnderConstruction[10.34.74.206:1004|RBW], ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} size 0 2013-09-18 18:03:17,899 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(block=BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1369, newGenerationStamp=1370, newLength=1048576, newNodes=[10.83.107.80:1004, 10.34.74.206:1004, 10.224.158.152:1004], clientName=DFSClient_NONMAPREDUCE_-450304083_1) 2013-09-18 18:03:17,904 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1369) successfully to BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1370 2013-09-18 18:03:18,540 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(block=BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1370, newGenerationStamp=1371, newLength=2097152, newNodes=[10.83.107.80:1004, 10.34.74.206:1004, 10.224.158.152:1004], clientName=DFSClient_NONMAPREDUCE_-450304083_1) 2013-09-18 18:03:18,548 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1370) successfully to BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1371 2013-09-18 18:03:26,150 INFO BlockStateChange: BLOCK* addToInvalidates: blk_1073742189_1371 10.83.107.80:1004 10.34.74.206:1004 10.224.158.152:1004 2013-09-18 18:03:26,847 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* InvalidateBlocks: ask 10.34.74.206:1004 to delete [blk_1073742178_1359, blk_1073742183_1362, blk_1073742184_1363, blk_1073742186_1366, blk_1073742188_1368, blk_1073742189_1371] 2013-09-18 18:03:29,848 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* InvalidateBlocks: ask 10.224.158.152:1004 to delete [blk_1073742177_1353, blk_1073742178_1359, blk_1073742179_1355, blk_1073742180_1356, blk_1073742181_1358, blk_1073742182_1361, blk_1073742185_1364, blk_1073742187_1367, blk_1073742188_1368, blk_1073742189_1371]
[jira] [Updated] (HDFS-5225) datanode keeps logging the same 'is no longer in the dataset' message over and over again
[ https://issues.apache.org/jira/browse/HDFS-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5225: Target Version/s: 2.1.2-beta datanode keeps logging the same 'is no longer in the dataset' message over and over again - Key: HDFS-5225 URL: https://issues.apache.org/jira/browse/HDFS-5225 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.1.1-beta Reporter: Roman Shaposhnik Priority: Blocker Attachments: HDFS-5225.1.patch, HDFS-5225-reproduce.1.txt I was running the usual Bigtop testing on 2.1.1-beta RC1 with the following configuration: 4 nodes fully distributed cluster with security on. All of a sudden my DN ate up all of the space in /var/log logging the following message repeatedly: {noformat} 2013-09-18 20:51:12,046 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1369 is no longer in the dataset {noformat} It wouldn't answer to a jstack and jstack -F ended up being useless. Here's what I was able to find in the NameNode logs regarding this block ID: {noformat} fgrep -rI 'blk_1073742189' hadoop-hdfs-namenode-ip-10-224-158-152.log 2013-09-18 18:03:16,972 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /user/jenkins/testAppendInputWedSep18180222UTC2013/test4.fileWedSep18180222UTC2013._COPYING_. 
BP-1884637155-10.224.158.152-1379524544853 blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], ReplicaUnderConstruction[10.34.74.206:1004|RBW], ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} 2013-09-18 18:03:17,222 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.224.158.152:1004 is added to blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], ReplicaUnderConstruction[10.34.74.206:1004|RBW], ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} size 0 2013-09-18 18:03:17,222 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.34.74.206:1004 is added to blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], ReplicaUnderConstruction[10.34.74.206:1004|RBW], ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} size 0 2013-09-18 18:03:17,224 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.83.107.80:1004 is added to blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], ReplicaUnderConstruction[10.34.74.206:1004|RBW], ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} size 0 2013-09-18 18:03:17,899 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(block=BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1369, newGenerationStamp=1370, newLength=1048576, newNodes=[10.83.107.80:1004, 10.34.74.206:1004, 10.224.158.152:1004], clientName=DFSClient_NONMAPREDUCE_-450304083_1) 2013-09-18 18:03:17,904 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1369) successfully to BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1370 2013-09-18 18:03:18,540 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(block=BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1370, newGenerationStamp=1371, newLength=2097152, newNodes=[10.83.107.80:1004, 10.34.74.206:1004, 10.224.158.152:1004], clientName=DFSClient_NONMAPREDUCE_-450304083_1) 2013-09-18 18:03:18,548 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1370) successfully to BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1371 2013-09-18 18:03:26,150 INFO BlockStateChange: BLOCK* addToInvalidates: blk_1073742189_1371 10.83.107.80:1004 10.34.74.206:1004 10.224.158.152:1004 2013-09-18 18:03:26,847 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* InvalidateBlocks: ask 10.34.74.206:1004 to delete [blk_1073742178_1359, blk_1073742183_1362, blk_1073742184_1363, blk_1073742186_1366, blk_1073742188_1368, blk_1073742189_1371] 2013-09-18 18:03:29,848 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* InvalidateBlocks: ask 10.224.158.152:1004 to delete [blk_1073742177_1353, blk_1073742178_1359, blk_1073742179_1355, blk_1073742180_1356, blk_1073742181_1358, blk_1073742182_1361, blk_1073742185_1364, blk_1073742187_1367, blk_1073742188_1368, blk_1073742189_1371]
[jira] [Updated] (HDFS-5217) Namenode log directory link is inaccessible in secure cluster
[ https://issues.apache.org/jira/browse/HDFS-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-5217: -- Priority: Major (was: Minor) Namenode log directory link is inaccessible in secure cluster - Key: HDFS-5217 URL: https://issues.apache.org/jira/browse/HDFS-5217 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-5217.000.patch Currently in a secured HDFS cluster, 401 error is returned when clicking the NameNode Logs link. Looks like the cause of the issue is that the httpServer does not correctly set the security handler and the user realm currently, which causes the httpRequest.getRemoteUser (for the log URL) to return null and later be overwritten to the default web name (e.g., dr.who) by the filter. In the meanwhile, in a secured cluster the log URL requires the http user to be an administrator. That's why we see the 401 error. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-5228) The RemoteIterator returned by DistributedFileSystem.listFiles(..) may throw NPE
[ https://issues.apache.org/jira/browse/HDFS-5228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5228: Fix Version/s: (was: 2.1.1-beta) 2.1.2-beta The RemoteIterator returned by DistributedFileSystem.listFiles(..) may throw NPE Key: HDFS-5228 URL: https://issues.apache.org/jira/browse/HDFS-5228 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.1.0-beta Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Blocker Fix For: 2.1.2-beta Attachments: h5228_20130919.patch, h5228_20130919_test.patch, HDFS-5228.2.patch Get a RemoteIterator from DistributedFileSystem.listFiles(..) with a relative path. Then, calling hasNext() on the RemoteIterator results in a NullPointerException. This bug was discovered by Arnaud: http://hortonworks.com/community/forums/topic/new-bug-in-hdfs-listfiles-method/ -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5255) Distcp job fails with hsftp when https is enabled
[ https://issues.apache.org/jira/browse/HDFS-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776859#comment-13776859 ] Arpit Agarwal commented on HDFS-5255: - Tested distcp with and without https ({{HftpFileSystem}} and {{HsftpFileSystem}}). Distcp job fails with hsftp when https is enabled - Key: HDFS-5255 URL: https://issues.apache.org/jira/browse/HDFS-5255 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Yesha Vora Assignee: Arpit Agarwal Attachments: HDFS-5255.01.patch Run Distcp job using hsftp when ssl is enabled. The job fails with java.net.SocketException: Unexpected end of file from server Error Running: hadoop distcp hsftp://localhost:50070/f1 hdfs://localhost:19000/f5 All the tasks fails with below error. 13/09/23 15:52:38 INFO mapreduce.Job: Task Id : attempt_1379976241507_0004_m_00_0, Status : FAILED Error: java.io.IOException: File copy failed: hsftp://localhost:50070/f1 -- hdfs://localhost:19000/f5 at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:262) at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:229) at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:166) Caused by: java.io.IOException: Couldn't run retriable-command: Copying hsftp://127.0.0.1:50070/f1 to hdfs://localhost:19000/f5 at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101) at 
org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:258) ... 10 more Caused by: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.io.IOException: HTTP_OK expected, received 500 at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:233) at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:198) at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToTmpFile(RetriableFileCopyCommand.java:134) at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:101) at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:83) at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87) ... 11 more Caused by: java.io.IOException: HTTP_OK expected, received 500 at org.apache.hadoop.hdfs.HftpFileSystem$RangeHeaderUrlOpener.connect(HftpFileSystem.java:383) at org.apache.hadoop.hdfs.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:119) at org.apache.hadoop.hdfs.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:103) at org.apache.hadoop.hdfs.ByteRangeInputStream.read(ByteRangeInputStream.java:187) at java.io.DataInputStream.read(DataInputStream.java:149) at java.io.BufferedInputStream.read1(BufferedInputStream.java:273) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at java.io.FilterInputStream.read(FilterInputStream.java:107) at org.apache.hadoop.tools.util.ThrottledInputStream.read(ThrottledInputStream.java:75) at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:230) ... 16 more -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5247) Namenode should close editlog and unlock storage when removing failed storage dir
[ https://issues.apache.org/jira/browse/HDFS-5247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776860#comment-13776860 ] Suresh Srinivas commented on HDFS-5247: --- Some comments: bq. When one of dfs.name.dir failed, namenode didn't close editlog and unlock the storage: Are you saying for the failed directory, namenode didn't close editlog and unlock the storage or is it for all the storage directories? If it is for the failed storage, if failure is due to a disk going bad, in most of the cases, we may not be able to do these cleanups. Namenode should close editlog and unlock storage when removing failed storage dir - Key: HDFS-5247 URL: https://issues.apache.org/jira/browse/HDFS-5247 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 1.2.1 Reporter: zhaoyunjiong Assignee: zhaoyunjiong Fix For: 1.2.1 Attachments: HDFS-5247-branch-1.2.patch When one of dfs.name.dir failed, namenode didn't close editlog and unlock the storage: java24764 hadoop 78uW REG 252,320 393219 /volume1/nn/dfs/in_use.lock (deleted) java24764 hadoop 107u REG 252,32 1155072 393229 /volume1/nn/dfs/current/edits.new (deleted) java24764 hadoop 119u REG 252,320 393238 /volume1/nn/dfs/current/fstime.tmp java24764 hadoop 140u REG 252,32 1761805 393239 /volume1/nn/dfs/current/edits If this dir is limit of space, then restore this storage may fail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-5241) Provide alternate queuing audit logger to reduce logging contention
[ https://issues.apache.org/jira/browse/HDFS-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-5241: -- Attachment: HDFS-5241.patch No tests, requesting feedback before investing the time. Provides an option to enable async logging via a single background thread. The performance gains are impressive under an ideal read heavy load: * fair lock = 26k op/s * unfair lock = 58k op/s * unfair lock + unbuffered appender = 120k ops/sec A single thread consuming log messages from a queue populated by the 100 rpc handlers is sufficient to improve performance. Additional threads showed no significant improvement. The problem is 100 threads colliding on log4j's synch'ed method. The contention is so high and the logging call takes enough time, that the thread's futex has to call into the kernel. The context switch and rescheduling wait ruins performance. By comparison, the time spent waiting to add a log message to the queue is negligible. The futexes stay in userland. The performance sweet spot is a queue sized to the number of handlers. As long as the background thread can log messages faster than a handler can process the next call, the handler is guaranteed a spot in the queue w/o a context switch. It's a configurable undocumented option for now since the audit log becomes prone to data loss and slight offset of timestamps. The call queue tends to run relatively dry so I expect my other connection handling patches like HADOOP-9956 will have a larger impact. Provide alternate queuing audit logger to reduce logging contention --- Key: HDFS-5241 URL: https://issues.apache.org/jira/browse/HDFS-5241 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-5241.patch The default audit logger has extremely poor performance. 
The internal synchronization of log4j causes massive contention between the call handlers (100 by default) which drastically limits the throughput of the NN. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
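The queue-and-single-consumer design described in the comment above can be sketched as follows. This is an illustrative sketch only, not the attached patch: the class and method names are hypothetical, and an in-memory list stands in for the log4j appender.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: RPC handler threads enqueue formatted audit
// entries, and a single daemon thread drains the queue and performs the
// write, so handlers never contend on the logger's synchronized method.
public class AsyncAuditLogger {
    private final BlockingQueue<String> queue;
    private final List<String> sink = new ArrayList<>(); // stands in for log4j

    public AsyncAuditLogger(int numHandlers) {
        // Per the comment above, the sweet spot is a queue sized to the
        // number of handlers: a handler is then guaranteed a slot without
        // a context switch as long as the consumer keeps up.
        this.queue = new ArrayBlockingQueue<>(numHandlers);
        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    String entry = queue.take();
                    synchronized (sink) { sink.add(entry); } // the only contended point
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.setDaemon(true);
        consumer.start();
    }

    // Called from handler threads; blocks only when the queue is full.
    public void logAuditEvent(String entry) {
        try {
            queue.put(entry);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Test helper: wait until at least 'expected' entries have been written.
    public List<String> awaitDrained(int expected) {
        while (true) {
            synchronized (sink) {
                if (sink.size() >= expected) return new ArrayList<>(sink);
            }
            try { Thread.sleep(5); } catch (InterruptedException e) { break; }
        }
        return new ArrayList<>();
    }
}
```

This also makes the trade-off in the comment concrete: entries still in the queue are lost on a crash, and timestamps reflect enqueue-to-write latency, which is why the option is kept undocumented.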
[jira] [Updated] (HDFS-5241) Provide alternate queuing audit logger to reduce logging contention
[ https://issues.apache.org/jira/browse/HDFS-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-5241: -- Status: Patch Available (was: Open) Provide alternate queuing audit logger to reduce logging contention --- Key: HDFS-5241 URL: https://issues.apache.org/jira/browse/HDFS-5241 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-5241.patch The default audit logger has extremely poor performance. The internal synchronization of log4j causes massive contention between the call handlers (100 by default) which drastically limits the throughput of the NN. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5191) revisit zero-copy API in FSDataInputStream to make it more intuitive
[ https://issues.apache.org/jira/browse/HDFS-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776899#comment-13776899 ] Suresh Srinivas commented on HDFS-5191: --- [~cmccabe], given that this is an enabler for HDFS-4949, but an independent change, can this be committed to trunk and 2.3 release? This helps in this functionality getting more testing, especially from Hive/ORC side. I or [~cnauroth] can do the merges. revisit zero-copy API in FSDataInputStream to make it more intuitive Key: HDFS-5191 URL: https://issues.apache.org/jira/browse/HDFS-5191 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, libhdfs Affects Versions: HDFS-4949 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: HDFS-4949 Attachments: HDFS-5191-caching.001.patch, HDFS-5191-caching.003.patch, HDFS-5191-caching.006.patch, HDFS-5191-caching.007.patch, HDFS-5191-caching.008.patch, HDFS-5191-caching.009.patch, HDFS-5191-caching.010.patch As per the discussion on HDFS-4953, we should revisit the zero-copy API to make it more intuitive for new users. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5217) Namenode log directory link is inaccessible in secure cluster
[ https://issues.apache.org/jira/browse/HDFS-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776907#comment-13776907 ] Hadoop QA commented on HDFS-5217: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12603644/HDFS-5217.000.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5029//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5029//console This message is automatically generated. Namenode log directory link is inaccessible in secure cluster - Key: HDFS-5217 URL: https://issues.apache.org/jira/browse/HDFS-5217 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-5217.000.patch Currently in a secured HDFS cluster, 401 error is returned when clicking the NameNode Logs link. 
Looks like the cause of the issue is that the httpServer does not correctly set the security handler and the user realm currently, which causes the httpRequest.getRemoteUser (for the log URL) to return null and later be overwritten to the default web name (e.g., dr.who) by the filter. In the meanwhile, in a secured cluster the log URL requires the http user to be an administrator. That's why we see the 401 error. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-5246) Hadoop nfs server binds to port 2049 which is the same as Linux nfs server
[ https://issues.apache.org/jira/browse/HDFS-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5246: Fix Version/s: (was: 2.1.1-beta) 2.1.2-beta Hadoop nfs server binds to port 2049 which is the same as Linux nfs server -- Key: HDFS-5246 URL: https://issues.apache.org/jira/browse/HDFS-5246 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Affects Versions: 2.1.0-beta Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM Java 1.6 Reporter: Jinghui Wang Fix For: 3.0.0, 2.1.2-beta Attachments: HDFS-5246-2.patch, HDFS-5246-3.patch, HDFS-5246.patch Hadoop nfs binds the nfs server to port 2049, which is also the default port that Linux nfs uses. If Linux nfs is already running on the machine then Hadoop nfs will not be able to start. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-5244) TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly expects Hashmap values to have order.
[ https://issues.apache.org/jira/browse/HDFS-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5244: Fix Version/s: (was: 2.1.1-beta) 2.1.2-beta TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly expects Hashmap values to have order. Key: HDFS-5244 URL: https://issues.apache.org/jira/browse/HDFS-5244 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.1.0-beta Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM java 1.6 Reporter: Jinghui Wang Fix For: 3.0.0, 2.1.0-beta, 2.1.2-beta Attachments: HDFS-5244.patch The test o.a.h.hdfs.server.namenode.TestNNStorageRetentionManager uses a HashMap (dirRoots) to store the root storages to be mocked for the purging test, which does not have any predictable order. The directories that need to be purged are stored in a LinkedHashSet, which has a predictable order. So, when the directories get mocked for the test, they could already be out of the order in which they were added. Thus, the order in which the directories were actually purged and the order in which they were added to the LinkedHashSet could be different and cause the test to fail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
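The ordering mismatch behind this failure is easy to demonstrate: LinkedHashSet iterates in insertion order, while HashMap's iteration order is an implementation detail. A minimal, self-contained illustration (the names below are hypothetical and unrelated to the test's actual keys):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

public class OrderDemo {
    // LinkedHashSet remembers insertion order...
    public static List<String> linkedOrder(String... items) {
        return new ArrayList<>(new LinkedHashSet<>(List.of(items)));
    }

    // ...while HashMap's iteration order is unspecified: code (and
    // tests) must never depend on it matching insertion order.
    public static List<String> hashMapOrder(String... items) {
        Map<String, Boolean> m = new HashMap<>();
        for (String s : items) m.put(s, Boolean.TRUE);
        return new ArrayList<>(m.keySet());
    }
}
```

Note that a test over `hashMapOrder` can only safely assert on the *set* of elements, never their sequence, which is exactly the trap the purge test fell into.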
[jira] [Updated] (HDFS-5160) MiniDFSCluster webui does not work
[ https://issues.apache.org/jira/browse/HDFS-5160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5160: Fix Version/s: (was: 2.1.1-beta) 2.1.2-beta MiniDFSCluster webui does not work -- Key: HDFS-5160 URL: https://issues.apache.org/jira/browse/HDFS-5160 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Fix For: 2.1.2-beta The webui does not work, when going to http://localhost:50070 you get: {code} Directory: / webapps/ 102 bytes Sep 4, 2013 9:32:55 AM {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4754) Add an API in the namenode to mark a datanode as stale
[ https://issues.apache.org/jira/browse/HDFS-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4754: Fix Version/s: (was: 2.1.1-beta) 2.1.2-beta Add an API in the namenode to mark a datanode as stale -- Key: HDFS-4754 URL: https://issues.apache.org/jira/browse/HDFS-4754 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, namenode Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Critical Fix For: 3.0.0, 2.1.2-beta Attachments: 4754.v1.patch, 4754.v2.patch, 4754.v4.patch, 4754.v4.patch There is a detection of the stale datanodes in HDFS since HDFS-3703, with a timeout, defaulted to 30s. There are two reasons to add an API to mark a node as stale even if the timeout is not yet reached: 1) ZooKeeper can detect that a client is dead at any moment. So, for HBase, we sometimes start the recovery before a node is marked stale (even with reasonable settings such as: stale: 20s; HBase ZK timeout: 30s). 2) Some third parties could detect that a node is dead before the timeout, hence saving us the cost of retrying. An example of such hardware is Arista, presented here by [~tsuna] http://tsunanet.net/~tsuna/fsf-hbase-meetup-april13.pdf, and confirmed in HBASE-6290. As usual, even if the node is dead it can come back before the 10 minute limit. So I would propose to set a timebound. The API would be namenode.markStale(String ipAddress, int port, long durationInMs); After durationInMs, the namenode would again rely only on its heartbeat to decide. Thoughts? If there are no objections, and if nobody in the hdfs dev team has the time to spend some time on it, I will give it a try for branch 2 3. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
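One way the proposed time-bounded markStale could behave is sketched below. This is a hypothetical illustration of the proposal only (the real NameNode integration would look very different): an explicit mark is honored until the caller-supplied duration elapses, after which heartbeat-based detection takes over. The clock is passed in explicitly to keep the sketch deterministic.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the proposed time-bounded stale mark: after
// durationMs, the explicit mark lapses and the namenode would again rely
// only on its heartbeat-based staleness check.
public class StaleNodeRegistry {
    private final Map<String, Long> staleUntil = new ConcurrentHashMap<>();

    private static String key(String ip, int port) { return ip + ":" + port; }

    public void markStale(String ip, int port, long durationMs, long nowMs) {
        staleUntil.put(key(ip, port), nowMs + durationMs);
    }

    // True only while an explicit mark is in force; expired entries are pruned.
    public boolean isExplicitlyStale(String ip, int port, long nowMs) {
        Long until = staleUntil.get(key(ip, port));
        if (until == null) return false;
        if (nowMs >= until) {
            staleUntil.remove(key(ip, port), until); // lapse: fall back to heartbeats
            return false;
        }
        return true;
    }
}
```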
[jira] [Updated] (HDFS-5098) Enhance FileSystem.Statistics to have locality information
[ https://issues.apache.org/jira/browse/HDFS-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5098: Fix Version/s: (was: 2.1.1-beta) 2.1.2-beta Enhance FileSystem.Statistics to have locality information -- Key: HDFS-5098 URL: https://issues.apache.org/jira/browse/HDFS-5098 Project: Hadoop HDFS Issue Type: Improvement Reporter: Bikas Saha Assignee: Suresh Srinivas Fix For: 2.1.2-beta Currently in MR/Tez we don't have a good and accurate means to detect how much of the IO was actually done locally. Getting this information from the source of truth would be much better. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5255) Distcp job fails with hsftp when https is enabled
[ https://issues.apache.org/jira/browse/HDFS-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776953#comment-13776953 ] Hadoop QA commented on HDFS-5255: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12604879/HDFS-5255.01.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestHftpDelegationToken {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5028//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5028//console This message is automatically generated. Distcp job fails with hsftp when https is enabled - Key: HDFS-5255 URL: https://issues.apache.org/jira/browse/HDFS-5255 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Yesha Vora Assignee: Arpit Agarwal Attachments: HDFS-5255.01.patch Run Distcp job using hsftp when ssl is enabled. 
The job fails with java.net.SocketException: Unexpected end of file from server Error Running: hadoop distcp hsftp://localhost:50070/f1 hdfs://localhost:19000/f5 All the tasks fails with below error. 13/09/23 15:52:38 INFO mapreduce.Job: Task Id : attempt_1379976241507_0004_m_00_0, Status : FAILED Error: java.io.IOException: File copy failed: hsftp://localhost:50070/f1 -- hdfs://localhost:19000/f5 at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:262) at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:229) at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:166) Caused by: java.io.IOException: Couldn't run retriable-command: Copying hsftp://127.0.0.1:50070/f1 to hdfs://localhost:19000/f5 at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101) at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:258) ... 
10 more Caused by: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.io.IOException: HTTP_OK expected, received 500 at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:233) at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:198) at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToTmpFile(RetriableFileCopyCommand.java:134) at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:101) at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:83) at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87) ... 11 more Caused by: java.io.IOException: HTTP_OK expected, received 500 at org.apache.hadoop.hdfs.HftpFileSystem$RangeHeaderUrlOpener.connect(HftpFileSystem.java:383) at org.apache.hadoop.hdfs.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:119) at
[jira] [Updated] (HDFS-5256) Use guava LoadingCache to implement DFSClientCache
[ https://issues.apache.org/jira/browse/HDFS-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-5256: - Attachment: HDFS-5256.000.patch Use guava LoadingCache to implement DFSClientCache -- Key: HDFS-5256 URL: https://issues.apache.org/jira/browse/HDFS-5256 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5256.000.patch Google Guava provides an implementation of LoadingCache. Use the LoadingCache to implement DFSClientCache in NFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
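Guava's LoadingCache binds a value-loading function to the cache itself, so a get() either returns a cached entry or invokes the loader once per key. The core pattern (minus Guava's eviction, expiry, and removal listeners) can be sketched with the JDK alone; this is an illustration of the idea, not the DFSClientCache patch:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Stdlib sketch of the loading-cache idea behind Guava's LoadingCache:
// the loader is supplied at construction, and get() invokes it at most
// once per key (here via ConcurrentHashMap.computeIfAbsent).
public class LoadingCacheSketch<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    public final AtomicInteger loads = new AtomicInteger(); // load count, for observability

    public LoadingCacheSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }
}
```

For a DFSClientCache, the key would be the user name and the loader would construct the per-user DFSClient; centralizing that in the cache removes the hand-rolled get-or-create and eviction logic.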
[jira] [Updated] (HDFS-5256) Use guava LoadingCache to implement DFSClientCache
[ https://issues.apache.org/jira/browse/HDFS-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-5256: - Status: Patch Available (was: Open) Use guava LoadingCache to implement DFSClientCache -- Key: HDFS-5256 URL: https://issues.apache.org/jira/browse/HDFS-5256 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5256.000.patch Google Guava provides an implementation of LoadingCache. Use the LoadingCache to implement DFSClientCache in NFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-5256) Use guava LoadingCache to implement DFSClientCache
Haohui Mai created HDFS-5256: Summary: Use guava LoadingCache to implement DFSClientCache Key: HDFS-5256 URL: https://issues.apache.org/jira/browse/HDFS-5256 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5256.000.patch Google Guava provides an implementation of LoadingCache. Use the LoadingCache to implement DFSClientCache in NFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
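For readers unfamiliar with the pattern proposed here: a LoadingCache pairs an on-demand value loader with bounded, listener-notified eviction. The stdlib sketch below mimics that shape without depending on Guava; the class and method names are illustrative only, not the actual HDFS-5256 patch (which would use com.google.common.cache.CacheBuilder):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

// Minimal analogue of Guava's LoadingCache: entries are created on demand
// by a loader, and the least-recently-used entry is evicted (with a removal
// callback, e.g. to close a cached DFSClient) once maximumSize is exceeded.
public class LoadingCacheSketch<K, V> {
    private final Function<K, V> loader;
    private final Map<K, V> cache;

    public LoadingCacheSketch(final int maximumSize, Function<K, V> loader,
                              final Consumer<V> removalListener) {
        this.loader = loader;
        // Access-ordered LinkedHashMap gives us LRU eviction for free.
        this.cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() > maximumSize) {
                    removalListener.accept(eldest.getValue()); // notify before dropping
                    return true;
                }
                return false;
            }
        };
    }

    public synchronized V get(K key) {
        // Loads the value on first access; subsequent gets hit the cache.
        return cache.computeIfAbsent(key, loader);
    }

    public synchronized int size() {
        return cache.size();
    }
}
```

Guava's real LoadingCache adds per-entry locking on load and time-based expiry, which is what makes it attractive over a hand-rolled map for DFSClientCache.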
[jira] [Updated] (HDFS-5256) Use guava LoadingCache to implement DFSClientCache
[ https://issues.apache.org/jira/browse/HDFS-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-5256: - Attachment: HDFS-5256.000.patch Use guava LoadingCache to implement DFSClientCache -- Key: HDFS-5256 URL: https://issues.apache.org/jira/browse/HDFS-5256 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5256.000.patch Google Guava provides an implementation of LoadingCache. Use the LoadingCache to implement DFSClientCache in NFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-5256) Use guava LoadingCache to implement DFSClientCache
[ https://issues.apache.org/jira/browse/HDFS-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-5256: - Attachment: (was: HDFS-5256.000.patch) Use guava LoadingCache to implement DFSClientCache -- Key: HDFS-5256 URL: https://issues.apache.org/jira/browse/HDFS-5256 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5256.000.patch Google Guava provides an implementation of LoadingCache. Use the LoadingCache to implement DFSClientCache in NFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5217) Namenode log directory link is inaccessible in secure cluster
[ https://issues.apache.org/jira/browse/HDFS-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776984#comment-13776984 ] Dilli Arumugam commented on HDFS-5217: -- Reading the patch, it appears that you used BASIC auth to authenticate the user. Could you confirm or clarify that? In a secure cluster, the correct approach would be to authenticate the end user with SPNEGO for WebUI access. Did you follow the instructions at http://hadoop.apache.org/docs/stable/HttpAuthentication.html Namenode log directory link is inaccessible in secure cluster - Key: HDFS-5217 URL: https://issues.apache.org/jira/browse/HDFS-5217 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-5217.000.patch Currently in a secured HDFS cluster, a 401 error is returned when clicking the NameNode Logs link. It looks like the cause is that the httpServer does not correctly set the security handler and the user realm, which causes httpRequest.getRemoteUser (for the log URL) to return null and later be overwritten to the default web name (e.g., dr.who) by the filter. Meanwhile, in a secured cluster the log URL requires the HTTP user to be an administrator. That's why we see the 401 error. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
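For reference, the HttpAuthentication page linked above enables SPNEGO for the Hadoop web UIs through the hadoop.http.authentication.* properties in core-site.xml; the principal and keytab values below are placeholders for a real deployment:

```xml
<!-- core-site.xml: enable SPNEGO for the web UIs (values are placeholders) -->
<property>
  <name>hadoop.http.authentication.type</name>
  <value>kerberos</value>
</property>
<property>
  <name>hadoop.http.authentication.kerberos.principal</name>
  <value>HTTP/_HOST@EXAMPLE.COM</value>
</property>
<property>
  <name>hadoop.http.authentication.kerberos.keytab</name>
  <value>/etc/security/keytab/http.keytab</value>
</property>
```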
[jira] [Commented] (HDFS-5256) Use guava LoadingCache to implement DFSClientCache
[ https://issues.apache.org/jira/browse/HDFS-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776989#comment-13776989 ] Hadoop QA commented on HDFS-5256: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12604926/HDFS-5256.000.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5031//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5031//console This message is automatically generated. Use guava LoadingCache to implement DFSClientCache -- Key: HDFS-5256 URL: https://issues.apache.org/jira/browse/HDFS-5256 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5256.000.patch Google Guava provides an implementation of LoadingCache. Use the LoadingCache to implement DFSClientCache in NFS. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens
[ https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-5208: - Attachment: HDFS-5208-v2.patch Only clear network location cache on specific nodes if invalid NetworkTopology happens -- Key: HDFS-5208 URL: https://issues.apache.org/jira/browse/HDFS-5208 Project: Hadoop HDFS Issue Type: Improvement Reporter: Junping Du Assignee: Junping Du Attachments: HDFS-5208-v1.patch, HDFS-5208-v2.patch After HDFS-4521, once a DN is registered with invalid networktopology, all cached rack info in DNSToSwitchMapping will be cleared. We should only clear cache on specific nodes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-5139) Remove redundant -R option from setrep
[ https://issues.apache.org/jira/browse/HDFS-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5139: Fix Version/s: (was: 2.2.0) 2.1.2-beta Remove redundant -R option from setrep -- Key: HDFS-5139 URL: https://issues.apache.org/jira/browse/HDFS-5139 Project: Hadoop HDFS Issue Type: Improvement Components: tools Affects Versions: 3.0.0, 1.3.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Fix For: 2.1.2-beta Attachments: HDFS-5139.01.patch, HDFS-5139.02.patch, HDFS-5139.03.patch, HDFS-5139.04.patch The -R option to setrep is redundant because it is required for directory targets and ignored for file targets. We can just remove the option and make -R the default for directories. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-5251) Race between the initialization of NameNode and the http server
[ https://issues.apache.org/jira/browse/HDFS-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5251: Fix Version/s: (was: 2.2.0) 2.1.2-beta Race between the initialization of NameNode and the http server --- Key: HDFS-5251 URL: https://issues.apache.org/jira/browse/HDFS-5251 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.1.2-beta Attachments: HDFS-5251.000.patch The constructor of NameNode starts an HTTP server before the FSNamesystem is initialized. Currently there is a race where the HTTP server can access the uninitialized namesystem variable, throwing a NullPointerException. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
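A typical way to close this kind of startup race (independent of what the attached patch actually does) is to publish the namesystem reference only after initialization completes, for example behind a latch, so a server thread can never observe the half-constructed state. A minimal sketch with hypothetical names:

```java
import java.util.concurrent.CountDownLatch;

// Sketch of the race in HDFS-5251: an HTTP handler thread may read the
// namesystem field before the constructor finishes initializing it.
// Gating reads behind a latch (or simply starting the server last)
// prevents the NullPointerException.
public class StartupRaceSketch {
    private volatile Object namesystem;                 // null until initialized
    private final CountDownLatch ready = new CountDownLatch(1);

    void initialize() {
        namesystem = new Object();                      // heavy FSNamesystem loading in real code
        ready.countDown();                              // publish only after init completes
    }

    Object getNamesystem() {
        try {
            ready.await();                              // handler blocks instead of seeing null
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return namesystem;
    }
}
```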
[jira] [Commented] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens
[ https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13777004#comment-13777004 ] Junping Du commented on HDFS-5208: -- Thanks, Colin, for the review. The v2 patch incorporates your comments. As for testing, TestNetworkTopology verifies that the cache is cleaned up only for the node (identified by its registration name) that registers with a faulty topology. Only clear network location cache on specific nodes if invalid NetworkTopology happens -- Key: HDFS-5208 URL: https://issues.apache.org/jira/browse/HDFS-5208 Project: Hadoop HDFS Issue Type: Improvement Reporter: Junping Du Assignee: Junping Du Attachments: HDFS-5208-v1.patch, HDFS-5208-v2.patch After HDFS-4521, once a DN is registered with an invalid network topology, all cached rack info in DNSToSwitchMapping will be cleared. We should only clear the cache entries for the specific nodes involved. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
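The behavior change under discussion, evicting only the offending hosts from the DNS-to-switch cache instead of clearing it wholesale, can be sketched as follows; the class and method names are illustrative, not the patch's actual code:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative host-to-rack cache, standing in for the cache inside a
// CachedDNSToSwitchMapping. On an invalid-topology event, only the named
// hosts are evicted; before this change the whole cache was cleared.
public class SwitchMappingCacheSketch {
    private final Map<String, String> hostToRack = new ConcurrentHashMap<>();

    public void put(String host, String rack) {
        hostToRack.put(host, rack);
    }

    public String resolve(String host) {
        return hostToRack.get(host);
    }

    // Before: hostToRack.clear() on any invalid registration.
    // After: drop only the hosts whose cached location proved invalid,
    // so every other DN keeps its resolved rack.
    public void evictInvalidHosts(List<String> invalidHosts) {
        for (String host : invalidHosts) {
            hostToRack.remove(host);
        }
    }
}
```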
[jira] [Commented] (HDFS-5217) Namenode log directory link is inaccessible in secure cluster
[ https://issues.apache.org/jira/browse/HDFS-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13777008#comment-13777008 ] Jing Zhao commented on HDFS-5217: - bq. Reading the patch, it appears that you used BASIC auth to authenticate the user. Yes, actually that's the original behavior. bq. http://hadoop.apache.org/docs/stable/HttpAuthentication.html Thanks for the link! I will follow the instructions and do the test. Namenode log directory link is inaccessible in secure cluster - Key: HDFS-5217 URL: https://issues.apache.org/jira/browse/HDFS-5217 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-5217.000.patch Currently in a secured HDFS cluster, 401 error is returned when clicking the NameNode Logs link. Looks like the cause of the issue is that the httpServer does not correctly set the security handler and the user realm currently, which causes the httpRequest.getRemoteUser (for the log URL) to return null and later be overwritten to the default web name (e.g., dr.who) by the filter. In the meanwhile, in a secured cluster the log URL requires the http user to be an administrator. That's why we see the 401 error. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5191) revisit zero-copy API in FSDataInputStream to make it more intuitive
[ https://issues.apache.org/jira/browse/HDFS-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13777026#comment-13777026 ] Colin Patrick McCabe commented on HDFS-5191: bq. can this be committed to trunk and 2.3 release? Sure. If you prepare a patch which adds zero-copy support to 2.3, I'll review it. (In addition to this patch, you will also have to backport HDFS-4953, either separately or as part of the same backport.) revisit zero-copy API in FSDataInputStream to make it more intuitive Key: HDFS-5191 URL: https://issues.apache.org/jira/browse/HDFS-5191 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, libhdfs Affects Versions: HDFS-4949 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: HDFS-4949 Attachments: HDFS-5191-caching.001.patch, HDFS-5191-caching.003.patch, HDFS-5191-caching.006.patch, HDFS-5191-caching.007.patch, HDFS-5191-caching.008.patch, HDFS-5191-caching.009.patch, HDFS-5191-caching.010.patch As per the discussion on HDFS-4953, we should revisit the zero-copy API to make it more intuitive for new users. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5241) Provide alternate queuing audit logger to reduce logging contention
[ https://issues.apache.org/jira/browse/HDFS-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13777054#comment-13777054 ] Hadoop QA commented on HDFS-5241: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12604899/HDFS-5241.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDFSShell org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5030//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5030//console This message is automatically generated. 
Provide alternate queuing audit logger to reduce logging contention --- Key: HDFS-5241 URL: https://issues.apache.org/jira/browse/HDFS-5241 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-5241.patch The default audit logger has extremely poor performance. The internal synchronization of log4j causes massive contention between the call handlers (100 by default) which drastically limits the throughput of the NN. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
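A queuing audit logger of the kind proposed here typically decouples the RPC handler threads from the internally synchronized appender with a bounded queue drained by a single background thread, so handlers no longer serialize on the log4j lock. A minimal sketch under that assumption (the names and overload policy are illustrative, not the attached patch):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Handler threads enqueue audit events (cheap, mostly uncontended); one
// drainer thread performs the actual synchronized logging, so the 100
// call handlers never block on the appender's lock.
public class QueuedAuditLoggerSketch {
    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);
    private final List<String> sink = new ArrayList<>();  // stands in for the real appender
    private final Thread drainer;

    public QueuedAuditLoggerSketch() {
        drainer = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    String event = queue.poll(100, TimeUnit.MILLISECONDS);
                    if (event != null) {
                        synchronized (sink) {             // the only contended section,
                            sink.add(event);              // touched by one thread only
                        }
                    }
                }
            } catch (InterruptedException ignored) {
                // drainer shutting down
            }
        });
        drainer.setDaemon(true);
        drainer.start();
    }

    public void logAuditEvent(String event) {
        queue.offer(event);  // whether to drop or block under overload is a design choice
    }

    public int logged() {
        synchronized (sink) {
            return sink.size();
        }
    }
}
```

The trade-off is durability: events queued but not yet drained are lost on a crash, which is why such loggers are usually offered as an alternate rather than the default.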
[jira] [Commented] (HDFS-5247) Namenode should close editlog and unlock storage when removing failed storage dir
[ https://issues.apache.org/jira/browse/HDFS-5247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13777073#comment-13777073 ] zhaoyunjiong commented on HDFS-5247: I'm referring to the failed directory. In our case the failure was caused by running out of space on that disk. In this situation the NameNode needs to, and should, close those two files, and I believe attempting to close them won't make things worse. Namenode should close editlog and unlock storage when removing failed storage dir - Key: HDFS-5247 URL: https://issues.apache.org/jira/browse/HDFS-5247 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 1.2.1 Reporter: zhaoyunjiong Assignee: zhaoyunjiong Fix For: 1.2.1 Attachments: HDFS-5247-branch-1.2.patch When one of the dfs.name.dir directories failed, the NameNode didn't close the edit log or unlock the storage:
java 24764 hadoop 78uW REG 252,32 0 393219 /volume1/nn/dfs/in_use.lock (deleted)
java 24764 hadoop 107u REG 252,32 1155072 393229 /volume1/nn/dfs/current/edits.new (deleted)
java 24764 hadoop 119u REG 252,32 0 393238 /volume1/nn/dfs/current/fstime.tmp
java 24764 hadoop 140u REG 252,32 1761805 393239 /volume1/nn/dfs/current/edits
If this directory failed because it ran out of space, a later attempt to restore the storage may fail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
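The "try to close won't make things worse" argument above is the standard best-effort cleanup idiom: close every handle associated with the failed directory, swallowing per-stream failures so one bad close doesn't leak the rest. A generic sketch (hypothetical names, not the branch-1 patch itself):

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.List;

// Best-effort cleanup when removing a failed storage directory: attempt to
// close every associated stream (edit log, lock file, ...), catching
// IOExceptions so one failure doesn't prevent releasing the remaining
// file handles or the in_use.lock.
public class StorageDirCleanupSketch {
    public static int closeQuietly(List<Closeable> streams) {
        int closed = 0;
        for (Closeable c : streams) {
            try {
                c.close();
                closed++;
            } catch (IOException e) {
                // log and continue; a failed close is no worse than leaking the handle
            }
        }
        return closed;
    }
}
```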
[jira] [Updated] (HDFS-5255) Distcp job fails with hsftp when https is enabled
[ https://issues.apache.org/jira/browse/HDFS-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-5255: Attachment: HDFS-5255.02.patch
Updated the test case; the fix caused the exception type to change.
Distcp job fails with hsftp when https is enabled - Key: HDFS-5255 URL: https://issues.apache.org/jira/browse/HDFS-5255 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Yesha Vora Assignee: Arpit Agarwal Attachments: HDFS-5255.01.patch, HDFS-5255.02.patch
Run a Distcp job using hsftp when SSL is enabled. The job fails with java.net.SocketException: Unexpected end of file from server.
Error running: hadoop distcp hsftp://localhost:50070/f1 hdfs://localhost:19000/f5
All the tasks fail with the error below:
13/09/23 15:52:38 INFO mapreduce.Job: Task Id : attempt_1379976241507_0004_m_00_0, Status : FAILED
Error: java.io.IOException: File copy failed: hsftp://localhost:50070/f1 -- hdfs://localhost:19000/f5
at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:262)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:229)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:166)
Caused by: java.io.IOException: Couldn't run retriable-command: Copying hsftp://127.0.0.1:50070/f1 to hdfs://localhost:19000/f5
at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:258)
... 10 more
Caused by: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.io.IOException: HTTP_OK expected, received 500
at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:233)
at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:198)
at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToTmpFile(RetriableFileCopyCommand.java:134)
at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:101)
at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:83)
at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
... 11 more
Caused by: java.io.IOException: HTTP_OK expected, received 500
at org.apache.hadoop.hdfs.HftpFileSystem$RangeHeaderUrlOpener.connect(HftpFileSystem.java:383)
at org.apache.hadoop.hdfs.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:119)
at org.apache.hadoop.hdfs.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:103)
at org.apache.hadoop.hdfs.ByteRangeInputStream.read(ByteRangeInputStream.java:187)
at java.io.DataInputStream.read(DataInputStream.java:149)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
at org.apache.hadoop.tools.util.ThrottledInputStream.read(ThrottledInputStream.java:75)
at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:230)
... 16 more
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens
[ https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13777113#comment-13777113 ] Hadoop QA commented on HDFS-5208: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12604930/HDFS-5208-v2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5032//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5032//console This message is automatically generated. 
Only clear network location cache on specific nodes if invalid NetworkTopology happens -- Key: HDFS-5208 URL: https://issues.apache.org/jira/browse/HDFS-5208 Project: Hadoop HDFS Issue Type: Improvement Reporter: Junping Du Assignee: Junping Du Attachments: HDFS-5208-v1.patch, HDFS-5208-v2.patch After HDFS-4521, once a DN is registered with invalid networktopology, all cached rack info in DNSToSwitchMapping will be cleared. We should only clear cache on specific nodes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira