[jira] [Commented] (HDFS-4830) Typo in config settings for AvailableSpaceVolumeChoosingPolicy in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660549#comment-13660549 ] Hudson commented on HDFS-4830: -- Integrated in Hadoop-Yarn-trunk #212 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/212/]) HDFS-4830. Typo in config settings for AvailableSpaceVolumeChoosingPolicy in hdfs-default.xml. Contributed by Aaron T. Myers. (Revision 1483603) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1483603 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/AvailableSpaceVolumeChoosingPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/TestAvailableSpaceVolumeChoosingPolicy.java Typo in config settings for AvailableSpaceVolumeChoosingPolicy in hdfs-default.xml -- Key: HDFS-4830 URL: https://issues.apache.org/jira/browse/HDFS-4830 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.5-beta Reporter: Aaron T. Myers Assignee: Aaron T. Myers Priority: Minor Fix For: 2.0.5-beta Attachments: HDFS-4830.patch, HDFS-4830.patch In hdfs-default.xml we have these two settings: {noformat} dfs.datanode.fsdataset.volume.choosing.balanced-space-threshold dfs.datanode.fsdataset.volume.choosing.balanced-space-preference-percent {noformat} But in fact they should be these, from DFSConfigKeys.java: {noformat} dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-percent {noformat} This won't actually affect any functionality, since default values are used in the code anyway, but it makes the documentation generated from hdfs-default.xml inaccurate. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
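With the fix, the corrected hdfs-default.xml entries would take roughly this shape (the property names come from DFSConfigKeys.java as quoted above; the values are placeholders, not the actual defaults):

```xml
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
  <value><!-- site-specific threshold in bytes --></value>
</property>
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-percent</name>
  <value><!-- site-specific preference value --></value>
</property>
```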
[jira] [Commented] (HDFS-4824) FileInputStreamCache.close leaves dangling reference to FileInputStreamCache.cacheCleaner
[ https://issues.apache.org/jira/browse/HDFS-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660551#comment-13660551 ] Hudson commented on HDFS-4824: -- Integrated in Hadoop-Yarn-trunk #212 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/212/]) HDFS-4824. FileInputStreamCache.close leaves dangling reference to FileInputStreamCache.cacheCleaner. Contributed by Colin Patrick McCabe. (Revision 1483641) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1483641 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/FileInputStreamCache.java FileInputStreamCache.close leaves dangling reference to FileInputStreamCache.cacheCleaner - Key: HDFS-4824 URL: https://issues.apache.org/jira/browse/HDFS-4824 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.0.4-alpha Reporter: Henry Robinson Assignee: Colin Patrick McCabe Fix For: 3.0.0, 2.0.5-beta Attachments: HDFS-4824.001.patch, HDFS-4824.002.patch {{FileInputStreamCache}} leaves around a reference to its {{cacheCleaner}} after {{close()}}. The {{cacheCleaner}} is created like this:

{code}
if (cacheCleaner == null) {
  cacheCleaner = new CacheCleaner();
  executor.scheduleAtFixedRate(cacheCleaner, expiryTimeMs, expiryTimeMs,
      TimeUnit.MILLISECONDS);
}
{code}

and supposedly removed like this:

{code}
if (cacheCleaner != null) {
  executor.remove(cacheCleaner);
}
{code}

However, {{ScheduledThreadPoolExecutor.remove}} returns a success boolean which should be checked. And I _think_ from a quick read of that class that the return value of {{scheduleAtFixedRate}} should be used as the argument to {{remove}}.
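As a sketch of the fix direction suggested in the last paragraph — keeping the handle returned by {{scheduleAtFixedRate}} and cancelling that, rather than passing the original Runnable to {{remove}} (the executor wraps the Runnable in its own task object, so {{remove(runnable)}} does not find it) — consider the following. Class and method names here are hypothetical, not the actual FileInputStreamCache code:

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CacheCleanerSketch {
    // Schedules a periodic cleaner, then cancels it via the returned
    // ScheduledFuture and checks the result, as the comment above suggests.
    static boolean scheduleAndCancel() {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        Runnable cacheCleaner = () -> {
            // hypothetical: evict expired streams here
        };
        // Keep the ScheduledFuture instead of only the Runnable...
        ScheduledFuture<?> cleanerTask = executor.scheduleAtFixedRate(
                cacheCleaner, 100, 100, TimeUnit.MILLISECONDS);
        // ...so close() can cancel future runs and verify success.
        boolean cancelled = cleanerTask.cancel(false);
        executor.shutdown();
        return cancelled;
    }

    public static void main(String[] args) {
        System.out.println("cancelled=" + scheduleAndCancel());
    }
}
```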
[jira] [Created] (HDFS-4834) Add -exclude path to fsck
Gerardo Vázquez created HDFS-4834: - Summary: Add -exclude path to fsck Key: HDFS-4834 URL: https://issues.apache.org/jira/browse/HDFS-4834 Project: Hadoop HDFS Issue Type: Improvement Reporter: Gerardo Vázquez Priority: Minor fsck fails if the file currently being checked is deleted. If you are loading and deleting files often, this can lead to many fsck attempts before a complete check succeeds.
[jira] [Commented] (HDFS-4477) Secondary namenode may retain old tokens
[ https://issues.apache.org/jira/browse/HDFS-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660679#comment-13660679 ] Hudson commented on HDFS-4477: -- Integrated in Hadoop-Hdfs-0.23-Build #610 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/610/]) HDFS-4477. Secondary namenode may retain old tokens. Contributed by Daryn Sharp. (Revision 1483513) Result = SUCCESS kihwal : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1483513 Files : * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/delegation/DelegationTokenSecretManager.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSecurityTokenEditLog.java Secondary namenode may retain old tokens Key: HDFS-4477 URL: https://issues.apache.org/jira/browse/HDFS-4477 Project: Hadoop HDFS Issue Type: Bug Components: security Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0 Reporter: Kihwal Lee Assignee: Daryn Sharp Priority: Critical Fix For: 3.0.0, 2.0.5-beta, 0.23.8 Attachments: HDFS-4477.branch-23.patch, HDFS-4477.patch, HDFS-4477.patch, HDFS-4477.patch, HDFS-4477.patch, HDFS-4477.patch Upon inspection of a fsimage created by a secondary namenode, we've discovered it contains very old tokens. These are probably the ones that were not explicitly canceled. It may be related to the optimization done to avoid loading the fsimage from scratch every time a checkpoint is performed.
[jira] [Commented] (HDFS-4830) Typo in config settings for AvailableSpaceVolumeChoosingPolicy in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660693#comment-13660693 ] Hudson commented on HDFS-4830: -- Integrated in Hadoop-Hdfs-trunk #1401 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1401/]) HDFS-4830. Typo in config settings for AvailableSpaceVolumeChoosingPolicy in hdfs-default.xml. Contributed by Aaron T. Myers. (Revision 1483603) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1483603
[jira] [Commented] (HDFS-4824) FileInputStreamCache.close leaves dangling reference to FileInputStreamCache.cacheCleaner
[ https://issues.apache.org/jira/browse/HDFS-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660695#comment-13660695 ] Hudson commented on HDFS-4824: -- Integrated in Hadoop-Hdfs-trunk #1401 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1401/]) HDFS-4824. FileInputStreamCache.close leaves dangling reference to FileInputStreamCache.cacheCleaner. Contributed by Colin Patrick McCabe. (Revision 1483641) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1483641
[jira] [Commented] (HDFS-4830) Typo in config settings for AvailableSpaceVolumeChoosingPolicy in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660713#comment-13660713 ] Hudson commented on HDFS-4830: -- Integrated in Hadoop-Mapreduce-trunk #1428 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1428/]) HDFS-4830. Typo in config settings for AvailableSpaceVolumeChoosingPolicy in hdfs-default.xml. Contributed by Aaron T. Myers. (Revision 1483603) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1483603
[jira] [Commented] (HDFS-4824) FileInputStreamCache.close leaves dangling reference to FileInputStreamCache.cacheCleaner
[ https://issues.apache.org/jira/browse/HDFS-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660715#comment-13660715 ] Hudson commented on HDFS-4824: -- Integrated in Hadoop-Mapreduce-trunk #1428 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1428/]) HDFS-4824. FileInputStreamCache.close leaves dangling reference to FileInputStreamCache.cacheCleaner. Contributed by Colin Patrick McCabe. (Revision 1483641) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1483641
[jira] [Commented] (HDFS-4817) make HDFS advisory caching configurable on a per-file basis
[ https://issues.apache.org/jira/browse/HDFS-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660724#comment-13660724 ] Hari Mankude commented on HDFS-4817: Colin, can this feature be extended to determine where data needs to be stored in the DN? For example, a DN might have SSDs and SATA/SAS drives, and depending on hints provided by the user about the access patterns (random reads vs. long sequential reads), it might be useful to put the data on SSDs vs. SATA. I understand that the NN has to be involved to make this information persistent during block relocation. The nice goal would be to make the DN smarter (or give it the ability to learn, with minimal involvement from the NN) than it is right now, given that nodes can have storage devices with vastly different characteristics. Another option is to use access patterns to move data across the various storages in a DN [sort of HSM]. It looks like the current patch is mainly to manage the OS page cache. make HDFS advisory caching configurable on a per-file basis --- Key: HDFS-4817 URL: https://issues.apache.org/jira/browse/HDFS-4817 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-4817.001.patch HADOOP-7753 and related JIRAs introduced some performance optimizations for the DataNode. One of them was readahead. When readahead is enabled, the DataNode starts reading the next bytes it thinks it will need in the block file, before the client requests them. This helps hide the latency of rotational media and send larger reads down to the device. Another optimization was drop-behind. Using this optimization, we could remove files from the Linux page cache after they were no longer needed. Using {{dfs.datanode.drop.cache.behind.writes}} and {{dfs.datanode.drop.cache.behind.reads}} can improve performance substantially on many MapReduce jobs. In our internal benchmarks, we have seen speedups of 40% on certain workloads. The reason is that if we know the block data will not be read again any time soon, keeping it out of memory allows more memory to be used by the other processes on the system. See HADOOP-7714 for more benchmarks. We would like to turn on these configurations on a per-file or per-client basis, rather than on the DataNode as a whole. This will allow more users to actually make use of them. It would also be good to add unit tests for the drop-cache code path, to ensure that it is functioning as we expect.
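For reference, the existing DataNode-wide switches named above are set in hdfs-site.xml; the boolean values shown here are illustrative, not recommendations:

```xml
<property>
  <name>dfs.datanode.drop.cache.behind.writes</name>
  <value>true</value>
</property>
<property>
  <name>dfs.datanode.drop.cache.behind.reads</name>
  <value>true</value>
</property>
```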
[jira] [Created] (HDFS-4835) Port trunk WebHDFS changes to branch-0.23
Robert Parker created HDFS-4835: --- Summary: Port trunk WebHDFS changes to branch-0.23 Key: HDFS-4835 URL: https://issues.apache.org/jira/browse/HDFS-4835 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 0.23.7 Reporter: Robert Parker Assignee: Robert Parker Priority: Critical HADOOP-9549 and HDFS-4805 made changes to WebHDFS and DelegationTokenRenewer to make them more robust for secure clusters.
[jira] [Updated] (HDFS-4835) Port trunk WebHDFS changes to branch-0.23
[ https://issues.apache.org/jira/browse/HDFS-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Parker updated HDFS-4835: Target Version/s: 0.23.8
[jira] [Commented] (HDFS-4823) Inode.toString () should return the full path
[ https://issues.apache.org/jira/browse/HDFS-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660834#comment-13660834 ] Benoy Antony commented on HDFS-4823: Thanks for looking into this, Suresh. The trunk already has the change to print the full path. I have ported this patch from trunk. #getFullPathName() is not public in trunk, so I maintained the same here. Inode.toString() should return the full path --- Key: HDFS-4823 URL: https://issues.apache.org/jira/browse/HDFS-4823 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 1.1.2 Reporter: Benoy Antony Assignee: Benoy Antony Priority: Minor Attachments: HDFS-4823.patch Inode.toString() is used in many error messages. This gives the name of the file / directory, but not the full path: org.apache.hadoop.security.AccessControlException: Permission denied: user=user1, access=WRITE, inode=warehouse:user2:supergroup:rwxrwxr-x The fix is to provide the full path, in line with Hadoop 2.
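As a rough illustration of the behavior being ported — building the full path by walking up the parent chain so that toString() reports it in error messages — here is a standalone sketch. The class and field names are hypothetical and do not match the real HDFS INode implementation:

```java
public class Inode {
    private final Inode parent;  // null for the root inode
    private final String name;   // empty string for the root inode

    Inode(Inode parent, String name) {
        this.parent = parent;
        this.name = name;
    }

    // Walk up to the root, joining names with "/" to build the full path.
    String getFullPathName() {
        if (parent == null) {
            return name.isEmpty() ? "/" : name;
        }
        String parentPath = parent.getFullPathName();
        return parentPath.endsWith("/") ? parentPath + name
                                        : parentPath + "/" + name;
    }

    @Override
    public String toString() {
        return getFullPathName();
    }

    public static void main(String[] args) {
        Inode root = new Inode(null, "");
        Inode user = new Inode(root, "user");
        Inode warehouse = new Inode(user, "warehouse");
        System.out.println(warehouse);
    }
}
```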
[jira] [Commented] (HDFS-4817) make HDFS advisory caching configurable on a per-file basis
[ https://issues.apache.org/jira/browse/HDFS-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660842#comment-13660842 ] Colin Patrick McCabe commented on HDFS-4817: [~harip] You might want to check out https://issues.apache.org/jira/browse/HDFS-4672, where there has been some discussion of tiered storage policies. I think these are somewhat separate issues. A cache is transitory and doesn't affect where the data is stored; a storage policy is something permanent. I also anticipate storage policies being set by the administrator or the creator of the file, whereas this API is useful to programs opening files for read.
[jira] [Commented] (HDFS-4820) Remove hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660852#comment-13660852 ] Chris Nauroth commented on HDFS-4820: - Removing *-default.xml seems to complicate resolution of some of our more dynamic configuration properties. A good example is our mapping of file system impl classes by URI scheme, used by {{AbstractFileSystem#createFileSystem}}:

{code}
<property>
  <name>fs.AbstractFileSystem.file.impl</name>
  <value>org.apache.hadoop.fs.local.LocalFs</value>
  <description>The AbstractFileSystem for file: uris.</description>
</property>

<property>
  <name>fs.AbstractFileSystem.hdfs.impl</name>
  <value>org.apache.hadoop.fs.Hdfs</value>
  <description>The FileSystem for hdfs: uris.</description>
</property>
{code}

{code}
public static AbstractFileSystem createFileSystem(URI uri, Configuration conf)
    throws UnsupportedFileSystemException {
  Class<?> clazz = conf.getClass(
      "fs.AbstractFileSystem." + uri.getScheme() + ".impl", null);
  if (clazz == null) {
    throw new UnsupportedFileSystemException(
        "No AbstractFileSystem for scheme: " + uri.getScheme());
  }
  return (AbstractFileSystem) newInstance(clazz, uri, conf);
}
{code}

Without defaults in the XML, this code will need to hard-code the mapping somewhere. We'll have to remember to cover all cases like this. {quote} ...it should not be part of the jar and should not be looked for and loaded in by default into the Configuration object. {quote} This may be a bigger concern for compatibility. {{Configuration}} is annotated public/stable, and I've seen a lot of tutorials with sample code that instantiates a new instance and expects it to be fully populated with the keys from *-default.xml. For full compatibility, I suppose we'd need to update not only our own {{Configuration#get}} calls to enforce the defaults, but also guarantee that if a client creates a new instance, they get the same values that used to be provided in the XML. Again, this probably would involve some kind of hard-coding during static initialization. Remove hdfs-default.xml --- Key: HDFS-4820 URL: https://issues.apache.org/jira/browse/HDFS-4820 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Siddharth Seth Similar to YARN-673, which contains additional details. There are separate JIRAs for YARN, MR and HDFS so enough people take a look. Looking for reasons for these files to exist, other than the ones mentioned in YARN-673, or a good reason to keep the files.
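One shape the hard-coding mentioned above could take is a static fallback table consulted when the user's configuration has no entry for a scheme. The sketch below is purely illustrative, using a plain Map in place of the real {{Configuration}} class, with the two scheme-to-class mappings taken from the XML quoted earlier:

```java
import java.util.HashMap;
import java.util.Map;

public class SchemeDefaults {
    // Hypothetical hard-coded fallback table standing in for the
    // fs.AbstractFileSystem.*.impl entries formerly in *-default.xml.
    private static final Map<String, String> DEFAULT_FS_IMPLS = new HashMap<>();
    static {
        DEFAULT_FS_IMPLS.put("file", "org.apache.hadoop.fs.local.LocalFs");
        DEFAULT_FS_IMPLS.put("hdfs", "org.apache.hadoop.fs.Hdfs");
    }

    // Explicit user configuration wins; otherwise fall back to the table.
    static String resolveImpl(Map<String, String> conf, String scheme) {
        String configured = conf.get("fs.AbstractFileSystem." + scheme + ".impl");
        return configured != null ? configured : DEFAULT_FS_IMPLS.get(scheme);
    }

    public static void main(String[] args) {
        // With an empty user configuration, the hard-coded default is used.
        System.out.println(resolveImpl(new HashMap<>(), "hdfs"));
    }
}
```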
[jira] [Commented] (HDFS-4817) make HDFS advisory caching configurable on a per-file basis
[ https://issues.apache.org/jira/browse/HDFS-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660880#comment-13660880 ] Hari Mankude commented on HDFS-4817: I would look at the patch as an ability for the user to provide hints to the DN regarding the access patterns (random reads / sequential reads / write-once-only / multiple access, etc.). It is incidental that these hints are currently used to manage the page cache. The same or similar hints could be used for moving blocks to different storage tiers at the DN. Another suggestion I had is to provide an fadvise()-like interface on the iostream that a user can use to send hints. I am aware of HDFS-4672. It is a complicated and correct way of managing storage pools.
[jira] [Commented] (HDFS-4817) make HDFS advisory caching configurable on a per-file basis
[ https://issues.apache.org/jira/browse/HDFS-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660921#comment-13660921 ] Colin Patrick McCabe commented on HDFS-4817: That's a good idea. {{CachingPolicy}} could be extended in the future to have a lot of those features. It is sent over the wire using protobufs, so we could easily add more fields in the future. In order to make it more similar to the {{fadvise}} interface, maybe I should rename dropBehind to {{dontNeed}} (similar to {{FADV_DONTNEED}})? make HDFS advisory caching configurable on a per-file basis --- Key: HDFS-4817 URL: https://issues.apache.org/jira/browse/HDFS-4817 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-4817.001.patch HADOOP-7753 and related JIRAs introduced some performance optimizations for the DataNode. One of them was readahead. When readahead is enabled, the DataNode starts reading the next bytes it thinks it will need in the block file, before the client requests them. This helps hide the latency of rotational media and send larger reads down to the device. Another optimization was drop-behind. Using this optimization, we could remove files from the Linux page cache after they were no longer needed. Using {{dfs.datanode.drop.cache.behind.writes}} and {{dfs.datanode.drop.cache.behind.reads}} can improve performance substantially on many MapReduce jobs. In our internal benchmarks, we have seen speedups of 40% on certain workloads. The reason is because if we know the block data will not be read again any time soon, keeping it out of memory allows more memory to be used by the other processes on the system. See HADOOP-7714 for more benchmarks. We would like to turn on these configurations on a per-file or per-client basis, rather than on the DataNode as a whole. 
This will allow more users to actually make use of them. It would also be good to add unit tests for the drop-cache code path, to ensure that it is functioning as we expect. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
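The comment above discusses exposing the drop-behind and readahead hints per stream rather than per DataNode; at the time of this thread the API shape was still under discussion. A minimal self-contained sketch of what such a per-stream hint interface could look like (the names `CacheHints`, `setDropBehind`, and `setReadahead` are illustrative assumptions, not the API this JIRA ultimately added):

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;

// Hypothetical per-stream caching hints, sketched for illustration only.
interface CacheHints {
    void setDropBehind(boolean dropBehind); // purge page cache after use
    void setReadahead(long readaheadBytes); // 0 disables readahead
}

// A stream that carries its own caching policy instead of relying on a
// DataNode-wide dfs.datanode.drop.cache.behind.* setting.
class HintedStream extends FilterInputStream implements CacheHints {
    private boolean dropBehind = false;
    private long readahead = 4 * 1024 * 1024; // assumed 4 MB default

    HintedStream(InputStream in) { super(in); }

    public void setDropBehind(boolean d) { this.dropBehind = d; }
    public void setReadahead(long r) { this.readahead = r; }

    boolean isDropBehind() { return dropBehind; }
    long getReadahead() { return readahead; }
}

public class Sketch {
    public static void main(String[] args) throws Exception {
        HintedStream s = new HintedStream(new ByteArrayInputStream(new byte[16]));
        s.setDropBehind(true); // per-file decision, e.g. a one-pass MR scan
        s.setReadahead(0L);    // e.g. a random-access workload
        System.out.println(s.isDropBehind() + " " + s.getReadahead());
    }
}
```

The point of the sketch is that the hint travels with the open stream, so two clients of the same DataNode can make opposite choices.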
[jira] [Commented] (HDFS-4817) make HDFS advisory caching configurable on a per-file basis
[ https://issues.apache.org/jira/browse/HDFS-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660938#comment-13660938 ] Todd Lipcon commented on HDFS-4817: --- I think it's a good idea to make sure whatever API we come up with here can be extended later to provide other hints. But I wouldn't let the scope creep much on this JIRA which is fairly simple on its own (just allowing advanced clients to tune their IO a bit better on spinning disks). bq. In order to make it more similar to the fadvise interface, maybe I should rename dropBehind to dontNeed (similar to FADV_DONTNEED)? I think that's just confusing, since FADV_DONTNEED takes a file range, whereas what we're doing here is telling the DN to enact a more complicated policy (automatically DONTNEED everything after it gets read off disk). Maybe the best name would be DONT_KEEP_CACHE, since that's really what we're doing from the user perspective. make HDFS advisory caching configurable on a per-file basis --- Key: HDFS-4817 URL: https://issues.apache.org/jira/browse/HDFS-4817
[jira] [Updated] (HDFS-4829) Strange loss of data displayed in hadoop fs -tail command
[ https://issues.apache.org/jira/browse/HDFS-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Grayson updated HDFS-4829: --- Summary: Strange loss of data displayed in hadoop fs -tail command (was: Strange loss of data displayed in hadoop fs -tail command when data is separated by periods?) Strange loss of data displayed in hadoop fs -tail command - Key: HDFS-4829 URL: https://issues.apache.org/jira/browse/HDFS-4829 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.0.0-alpha Environment: OS Centos 6.3 (on Intel Core2 Duo, VMware Player VM running under Windows 7) Testing on both 2.0.0-cdh4.1.1 and 2.0.0-cdh4.1.2 Reporter: Todd Grayson Priority: Minor Strange behavior of the hadoop fs -tail command - its default seems to be 9 lines of output vs 10 lines of output in the OS version of the command (minor issue). The strange thing (bug behavior?) is that it appears to drop the initial octet from an IP address when examining a file over HDFS. 
[training@localhost hands-on]$ hadoop fs -tail weblog/access_log .190.174.142 - - [03/Dec/2011:13:28:08 -0800] GET /assets/js/javascript_combined.js HTTP/1.1 200 20404 10.190.174.142 - - [03/Dec/2011:13:28:09 -0800] GET /assets/img/home-logo.png HTTP/1.1 200 3892 10.190.174.142 - - [03/Dec/2011:13:28:09 -0800] GET /images/filmmediablock/360/019.jpg HTTP/1.1 200 74446 10.190.174.142 - - [03/Dec/2011:13:28:10 -0800] GET /images/filmmediablock/360/g_still_04.jpg HTTP/1.1 200 761555 10.190.174.142 - - [03/Dec/2011:13:28:09 -0800] GET /images/filmmediablock/360/07082218.jpg HTTP/1.1 200 154609 10.190.174.142 - - [03/Dec/2011:13:28:10 -0800] GET /images/filmpics//2229/GOEMON-NUKI-000163.jpg HTTP/1.1 200 184976 10.190.174.142 - - [03/Dec/2011:13:28:11 -0800] GET /images/filmmediablock/360/GOEMON-NUKI-000163.jpg HTTP/1.1 200 60117 10.190.174.142 - - [03/Dec/2011:13:28:10 -0800] GET /images/filmmediablock/360/Chacha.jpg HTTP/1.1 200 109379 10.190.174.142 - - [03/Dec/2011:13:28:11 -0800] GET /images/filmmediablock/360/GOEMON-NUKI-000159.jpg HTTP/1.1 200 161657 *When looking at the original log data outside of HDFS with the os version of the tail command we see the following* [training@localhost hands-on]$ hadoop fs -get weblog/access_log ./ [training@localhost hands-on]$ tail access_log 10.190.174.142 - - [03/Dec/2011:13:28:06 -0800] GET /images/filmpics//2229/GOEMON-NUKI-000163.jpg HTTP/1.1 200 184976 10.190.174.142 - - [03/Dec/2011:13:28:08 -0800] GET /assets/js/javascript_combined.js HTTP/1.1 200 20404 10.190.174.142 - - [03/Dec/2011:13:28:09 -0800] GET /assets/img/home-logo.png HTTP/1.1 200 3892 10.190.174.142 - - [03/Dec/2011:13:28:09 -0800] GET /images/filmmediablock/360/019.jpg HTTP/1.1 200 74446 10.190.174.142 - - [03/Dec/2011:13:28:10 -0800] GET /images/filmmediablock/360/g_still_04.jpg HTTP/1.1 200 761555 10.190.174.142 - - [03/Dec/2011:13:28:09 -0800] GET /images/filmmediablock/360/07082218.jpg HTTP/1.1 200 154609 10.190.174.142 - - [03/Dec/2011:13:28:10 -0800] 
GET /images/filmpics//2229/GOEMON-NUKI-000163.jpg HTTP/1.1 200 184976 10.190.174.142 - - [03/Dec/2011:13:28:11 -0800] GET /images/filmmediablock/360/GOEMON-NUKI-000163.jpg HTTP/1.1 200 60117 10.190.174.142 - - [03/Dec/2011:13:28:10 -0800] GET /images/filmmediablock/360/Chacha.jpg HTTP/1.1 200 109379 10.190.174.142 - - [03/Dec/2011:13:28:11 -0800] GET /images/filmmediablock/360/GOEMON-NUKI-000159.jpg HTTP/1.1 200 161657 When using non-IP data separated by periods, it gets even worse and even more data is masked (same data, substituting names for IP octets). Note we lose the first line well into the URI string. * [training@localhost hands-on]$ hadoop fs -tail weblog/test_log s/javascript_combined.js HTTP/1.1 200 20404 larry.billy.will.amy - - [03/Dec/2011:13:28:09 -0800] GET /assets/img/home-logo.png HTTP/1.1 200 3892 larry.billy.will.amy - - [03/Dec/2011:13:28:09 -0800] GET /images/filmmediablock/360/019.jpg HTTP/1.1 200 74446 larry.billy.will.amy - - [03/Dec/2011:13:28:larry.-0800] GET /images/filmmediablock/360/g_still_04.jpg HTTP/1.1 200 761555 larry.billy.will.amy - - [03/Dec/2011:13:28:09 -0800] GET /images/filmmediablock/360/07082218.jpg HTTP/1.1 200 154609 larry.billy.will.amy - - [03/Dec/2011:13:28:larry.-0800] GET /images/filmpics//2229/GOEMON-NUKI-000163.jpg HTTP/1.1 200 184976 larry.billy.will.amy - - [03/Dec/2011:13:28:11 -0800] GET /images/filmmediablock/360/GOEMON-NUKI-000163.jpg HTTP/1.1 200 60117 larry.billy.will.amy - - [03/Dec/2011:13:28:larry.-0800] GET /images/filmmediablock/360/Chacha.jpg HTTP/1.1 200 larry.379 larry.billy.will.amy - - [03/Dec/2011:13:28:11 -0800] GET
[jira] [Commented] (HDFS-4805) Webhdfs client is fragile to token renewal errors
[ https://issues.apache.org/jira/browse/HDFS-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660967#comment-13660967 ] Kihwal Lee commented on HDFS-4805: -- The 0.23 patch will depend on HDFS-4835. Webhdfs client is fragile to token renewal errors - Key: HDFS-4805 URL: https://issues.apache.org/jira/browse/HDFS-4805 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical Attachments: HDFS-4805.patch Webhdfs internally acquires a token that will be used for DN-based operations. The token renewer in common will try to renew that token. If a renewal fails for any reason, it will try to get another token. If that fails, it gives up and the token webhdfs holds will soon expire. A transient network outage or a restart of the NN may cause webhdfs to be left holding an expired token, effectively rendering webhdfs useless. This is fatal for daemons.
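The failure mode described here is a renewer that gives up after one failed renewal plus one failed re-fetch, leaving the token to expire. For contrast, a renewal loop that rides out transient outages with bounded retries can be sketched in plain Java (the `renewWithRetry` helper and its retry policy are illustrative assumptions, not webhdfs code):

```java
import java.util.concurrent.Callable;

// Hypothetical sketch: retry transient renewal failures with exponential
// backoff instead of abandoning the token after the first failure.
public class RetryingRenewer {
    static boolean renewWithRetry(Callable<Boolean> renew, int maxAttempts,
                                  long initialDelayMs) {
        long delay = initialDelayMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (renew.call()) {
                    return true; // renewal succeeded; token stays valid
                }
            } catch (Exception e) {
                // transient failure (network outage, NN restart): fall
                // through and retry rather than giving up immediately
            }
            try { Thread.sleep(delay); } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                return false;
            }
            delay *= 2; // back off before the next attempt
        }
        return false; // only now must the caller fetch a fresh token
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        // Simulate an NN that is unreachable for two attempts, then recovers.
        boolean ok = renewWithRetry(() -> ++calls[0] >= 3, 5, 1);
        System.out.println(ok + " " + calls[0]);
    }
}
```

With this shape, a transient NN restart costs a few retries instead of permanently invalidating the client's token, which is the property daemons need.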
[jira] [Commented] (HDFS-4829) Strange loss of data displayed in hadoop fs -tail command
[ https://issues.apache.org/jira/browse/HDFS-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660968#comment-13660968 ] Todd Grayson commented on HDFS-4829: In further testing, this is being seen in any data set being looked at with tail. It looks to be handling of escaping character sequences within the data being returned? Strange loss of data displayed in hadoop fs -tail command - Key: HDFS-4829 URL: https://issues.apache.org/jira/browse/HDFS-4829
[jira] [Created] (HDFS-4836) Update Tomcat version for httpfs to 6.0.37
Jonathan Eagles created HDFS-4836: - Summary: Update Tomcat version for httpfs to 6.0.37 Key: HDFS-4836 URL: https://issues.apache.org/jira/browse/HDFS-4836 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jonathan Eagles Tomcat has released a new version with security fixes: http://tomcat.apache.org/security-6.html#Fixed_in_Apache_Tomcat_6.0.37
[jira] [Updated] (HDFS-4836) Update Tomcat version for httpfs to 6.0.37
[ https://issues.apache.org/jira/browse/HDFS-4836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated HDFS-4836: -- Attachment: HDFS-4836.patch
[jira] [Updated] (HDFS-4836) Update Tomcat version for httpfs to 6.0.37
[ https://issues.apache.org/jira/browse/HDFS-4836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated HDFS-4836: -- Assignee: Jonathan Eagles
[jira] [Updated] (HDFS-4836) Update Tomcat version for httpfs to 6.0.37
[ https://issues.apache.org/jira/browse/HDFS-4836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated HDFS-4836: -- Status: Patch Available (was: Open)
[jira] [Updated] (HDFS-4836) Update Tomcat version for httpfs to 6.0.37
[ https://issues.apache.org/jira/browse/HDFS-4836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated HDFS-4836: -- Priority: Trivial (was: Major)
[jira] [Commented] (HDFS-4832) Namenode doesn't change the number of missing blocks in safemode when DNs rejoin
[ https://issues.apache.org/jira/browse/HDFS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660996#comment-13660996 ] Kihwal Lee commented on HDFS-4832: -- The SBN also skips processing of over/under-replicated blocks. The new condition in your patch will change the SBN's behavior. There is another aspect of this issue. Since {{neededReplications}} is not scanned in safe mode and on the SBN, orphaned blocks in there cause problems during {{metaSave()}}. They normally go away when the ReplicationMonitor generates DN work, but since that doesn't happen in these modes, those blocks can linger. When {{metaSave()}} hits one of these blocks, it dies with an NPE because there is no corresponding {{INodeFile}}. Namenode doesn't change the number of missing blocks in safemode when DNs rejoin Key: HDFS-4832 URL: https://issues.apache.org/jira/browse/HDFS-4832 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 0.23.7, 2.0.5-beta Reporter: Ravi Prakash Assignee: Ravi Prakash Priority: Critical Attachments: HDFS-4832.patch Courtesy Karri VRK Reddy! {quote} 1. Namenode lost datanodes causing missing blocks 2. Namenode was put in safe mode 3. Datanode restarted on dead nodes 4. Waited for lots of time for the NN UI to reflect the recovered blocks. 5. Forced NN out of safe mode and suddenly, no more missing blocks anymore. {quote} I was able to replicate this on 0.23 and trunk. I set dfs.namenode.heartbeat.recheck-interval to 1 and killed the DN to simulate a lost datanode. Without the NN updating this list of missing blocks, the grid admins will not know when to take the cluster out of safemode.
[jira] [Commented] (HDFS-4836) Update Tomcat version for httpfs to 6.0.37
[ https://issues.apache.org/jira/browse/HDFS-4836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660998#comment-13660998 ] Hadoop QA commented on HDFS-4836: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12583679/HDFS-4836.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs-httpfs: org.apache.hadoop.fs.http.client.TestHttpFSWithHttpFSFileSystem org.apache.hadoop.fs.http.client.TestHttpFSFWithWebhdfsFileSystem {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4412//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4412//console This message is automatically generated. 
[jira] [Commented] (HDFS-4832) Namenode doesn't change the number of missing blocks in safemode when DNs rejoin
[ https://issues.apache.org/jira/browse/HDFS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661004#comment-13661004 ] Kihwal Lee commented on HDFS-4832: -- bq. Since neededReplications is not scanned in safe mode and on SBN ... This is true, but it is not a problem on the SBN. The SBN can have blocks from the future, so it is natural for it to get reports on blocks that look orphaned. Also, it does not serve normal requests. The problem is when orphaned blocks are in {{neededReplications}} on an active node in safe mode. Based on what we have seen in clusters, a combination of forcing safe mode, deletions, and a DN restart can make it happen. Namenode doesn't change the number of missing blocks in safemode when DNs rejoin Key: HDFS-4832 URL: https://issues.apache.org/jira/browse/HDFS-4832
[jira] [Created] (HDFS-4837) Allow DFSAdmin to run when HDFS is not the default file system
Mostafa Elhemali created HDFS-4837: -- Summary: Allow DFSAdmin to run when HDFS is not the default file system Key: HDFS-4837 URL: https://issues.apache.org/jira/browse/HDFS-4837 Project: Hadoop HDFS Issue Type: New Feature Reporter: Mostafa Elhemali Assignee: Mostafa Elhemali When Hadoop is running with a default file system other than HDFS, but still has an HDFS namenode running, we are unable to run dfsadmin commands. I suggest that DFSAdmin use the same mechanism as the NameNode does today to get its address: look at dfs.namenode.rpc-address, and if it is not set, fall back on getting it from the default file system.
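The proposed lookup order — prefer an explicit dfs.namenode.rpc-address, else derive the address from the default file system URI — can be sketched with a plain map standing in for a Hadoop Configuration (the `resolve` helper below is illustrative, not DFSAdmin code):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposed namenode-address resolution order.
public class NameNodeAddress {
    static String resolve(Map<String, String> conf) {
        String rpc = conf.get("dfs.namenode.rpc-address");
        if (rpc != null && !rpc.isEmpty()) {
            return rpc; // explicit address wins even when HDFS is not default
        }
        String defaultFs = conf.get("fs.defaultFS");
        if (defaultFs != null && defaultFs.startsWith("hdfs://")) {
            return defaultFs.substring("hdfs://".length()); // today's behavior
        }
        throw new IllegalStateException("no namenode address configured");
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Non-HDFS default file system, but an HDFS namenode still running:
        conf.put("fs.defaultFS", "someother://host");
        conf.put("dfs.namenode.rpc-address", "nn.example.com:8020");
        System.out.println(resolve(conf));
    }
}
```

With only the fallback (today's DFSAdmin behavior), the non-HDFS default URI would make the command fail; the explicit key makes it work regardless of the default file system.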
[jira] [Commented] (HDFS-4836) Update Tomcat version for httpfs to 6.0.37
[ https://issues.apache.org/jira/browse/HDFS-4836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661040#comment-13661040 ] Jonathan Eagles commented on HDFS-4836: --- Test failures are due to the ongoing issue described in HDFS-4825. The current tests are adequate to exercise the new version of Tomcat. Update Tomcat version for httpfs to 6.0.37 -- Key: HDFS-4836 URL: https://issues.apache.org/jira/browse/HDFS-4836
[jira] [Commented] (HDFS-3875) Issue handling checksum errors in write pipeline
[ https://issues.apache.org/jira/browse/HDFS-3875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661044#comment-13661044 ] Thomas Graves commented on HDFS-3875: - Suresh, Todd, Any comments on the latest patch? I am hoping to get this committed soon for 23.8 Issue handling checksum errors in write pipeline Key: HDFS-3875 URL: https://issues.apache.org/jira/browse/HDFS-3875 Project: Hadoop HDFS Issue Type: Bug Components: datanode, hdfs-client Affects Versions: 2.0.2-alpha Reporter: Todd Lipcon Assignee: Kihwal Lee Priority: Critical Attachments: hdfs-3875.branch-0.23.no.test.patch.txt, hdfs-3875.branch-0.23.patch.txt, hdfs-3875.branch-0.23.with.test.patch.txt, hdfs-3875.patch.txt, hdfs-3875.trunk.no.test.patch.txt, hdfs-3875.trunk.no.test.patch.txt, hdfs-3875.trunk.patch.txt, hdfs-3875.trunk.patch.txt, hdfs-3875.trunk.with.test.patch.txt, hdfs-3875.trunk.with.test.patch.txt, hdfs-3875-wip.patch We saw this issue with one block in a large test cluster. The client is storing the data with replication level 2, and we saw the following: - the second node in the pipeline detects a checksum error on the data it received from the first node. We don't know if the client sent a bad checksum, or if it got corrupted between node 1 and node 2 in the pipeline. - this caused the second node to get kicked out of the pipeline, since it threw an exception. The pipeline started up again with only one replica (the first node in the pipeline) - this replica was later determined to be corrupt by the block scanner, and unrecoverable since it is the only replica
[jira] [Commented] (HDFS-3875) Issue handling checksum errors in write pipeline
[ https://issues.apache.org/jira/browse/HDFS-3875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661050#comment-13661050 ] Suresh Srinivas commented on HDFS-3875: --- Sorry, I have been meaning to look at this. But have not been able to spend time. Will review before the end of the day. Issue handling checksum errors in write pipeline Key: HDFS-3875 URL: https://issues.apache.org/jira/browse/HDFS-3875
[jira] [Commented] (HDFS-4835) Port trunk WebHDFS changes to branch-0.23
[ https://issues.apache.org/jira/browse/HDFS-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661058#comment-13661058 ] Chris Nauroth commented on HDFS-4835: - Hi, Robert. While you're porting, are you also interested in HDFS-3180? That one added connect timeouts and read timeouts to the sockets opened by {{WebHdfsFileSystem}}. Port trunk WebHDFS changes to branch-0.23 -- Key: HDFS-4835 URL: https://issues.apache.org/jira/browse/HDFS-4835 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 0.23.7 Reporter: Robert Parker Assignee: Robert Parker Priority: Critical HADOOP-9549 and HDFS-4805 changed WebHDFS and the DelegationTokenRenewer to make them more robust on secure clusters.
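The HDFS-3180 change mentioned above amounts to putting deadlines on the HTTP connections the WebHDFS client opens, so a hung NN or DN fails fast instead of blocking the client forever. With the stock JDK client that looks roughly like this (a generic sketch, not the actual Hadoop patch; the port and path are only examples):

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class TimeoutSketch {
    // Opens a connection that fails fast instead of hanging indefinitely
    // when the remote side is unreachable or stops responding mid-read.
    static HttpURLConnection open(URL url, int connectMs, int readMs)
            throws Exception {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(connectMs); // cap time to establish the TCP link
        conn.setReadTimeout(readMs);       // cap time waiting for more bytes
        return conn;
    }

    public static void main(String[] args) throws Exception {
        // openConnection() does not contact the server, so this is safe to run.
        HttpURLConnection conn =
            open(new URL("http://127.0.0.1:50070/webhdfs/v1/"), 60_000, 60_000);
        System.out.println(conn.getConnectTimeout() + " " + conn.getReadTimeout());
    }
}
```

Without these two calls the JDK default is 0, meaning wait forever, which is exactly the fragility a port to branch-0.23 would inherit.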
[jira] [Commented] (HDFS-4829) Strange loss of data displayed in hadoop fs -tail command
[ https://issues.apache.org/jira/browse/HDFS-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661072#comment-13661072 ] Jing Zhao commented on HDFS-4829: - I think the reason of the behavior is that hadoop fs -tail only shows the last 1K data. Its description says Show the last 1KB of the file, and the shown content in the above two examples are both of exact 1K size. Strange loss of data displayed in hadoop fs -tail command - Key: HDFS-4829 URL: https://issues.apache.org/jira/browse/HDFS-4829 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.0.0-alpha Environment: OS Centos 6.3 (on Intel Core2 Duo, VMware Player VM running under windows 7) Testing on both 2.0.0-cdh4.1.1 and 2.0.0-cdh4.1.2 Reporter: Todd Grayson Priority: Minor Strange behavior of the hadoop fs -tail command - its default for output seems to be 9 lines of output vs 10 lines of output in the OS version of the command (minor issue). The strange thing (bug behavior?) appears to drop the initial octect from an IP address when examining a file over HDFS. 
[training@localhost hands-on]$ hadoop fs -tail weblog/access_log .190.174.142 - - [03/Dec/2011:13:28:08 -0800] GET /assets/js/javascript_combined.js HTTP/1.1 200 20404 10.190.174.142 - - [03/Dec/2011:13:28:09 -0800] GET /assets/img/home-logo.png HTTP/1.1 200 3892 10.190.174.142 - - [03/Dec/2011:13:28:09 -0800] GET /images/filmmediablock/360/019.jpg HTTP/1.1 200 74446 10.190.174.142 - - [03/Dec/2011:13:28:10 -0800] GET /images/filmmediablock/360/g_still_04.jpg HTTP/1.1 200 761555 10.190.174.142 - - [03/Dec/2011:13:28:09 -0800] GET /images/filmmediablock/360/07082218.jpg HTTP/1.1 200 154609 10.190.174.142 - - [03/Dec/2011:13:28:10 -0800] GET /images/filmpics//2229/GOEMON-NUKI-000163.jpg HTTP/1.1 200 184976 10.190.174.142 - - [03/Dec/2011:13:28:11 -0800] GET /images/filmmediablock/360/GOEMON-NUKI-000163.jpg HTTP/1.1 200 60117 10.190.174.142 - - [03/Dec/2011:13:28:10 -0800] GET /images/filmmediablock/360/Chacha.jpg HTTP/1.1 200 109379 10.190.174.142 - - [03/Dec/2011:13:28:11 -0800] GET /images/filmmediablock/360/GOEMON-NUKI-000159.jpg HTTP/1.1 200 161657 *When looking at the original log data outside of HDFS with the os version of the tail command we see the following* [training@localhost hands-on]$ hadoop fs -get weblog/access_log ./ [training@localhost hands-on]$ tail access_log 10.190.174.142 - - [03/Dec/2011:13:28:06 -0800] GET /images/filmpics//2229/GOEMON-NUKI-000163.jpg HTTP/1.1 200 184976 10.190.174.142 - - [03/Dec/2011:13:28:08 -0800] GET /assets/js/javascript_combined.js HTTP/1.1 200 20404 10.190.174.142 - - [03/Dec/2011:13:28:09 -0800] GET /assets/img/home-logo.png HTTP/1.1 200 3892 10.190.174.142 - - [03/Dec/2011:13:28:09 -0800] GET /images/filmmediablock/360/019.jpg HTTP/1.1 200 74446 10.190.174.142 - - [03/Dec/2011:13:28:10 -0800] GET /images/filmmediablock/360/g_still_04.jpg HTTP/1.1 200 761555 10.190.174.142 - - [03/Dec/2011:13:28:09 -0800] GET /images/filmmediablock/360/07082218.jpg HTTP/1.1 200 154609 10.190.174.142 - - [03/Dec/2011:13:28:10 -0800] 
GET /images/filmpics//2229/GOEMON-NUKI-000163.jpg HTTP/1.1 200 184976 10.190.174.142 - - [03/Dec/2011:13:28:11 -0800] GET /images/filmmediablock/360/GOEMON-NUKI-000163.jpg HTTP/1.1 200 60117 10.190.174.142 - - [03/Dec/2011:13:28:10 -0800] GET /images/filmmediablock/360/Chacha.jpg HTTP/1.1 200 109379 10.190.174.142 - - [03/Dec/2011:13:28:11 -0800] GET /images/filmmediablock/360/GOEMON-NUKI-000159.jpg HTTP/1.1 200 161657 When using non-IP data separated by periods, it gets even worse and even more data is masked (same data, substituting names for IP octets). Note we lose the first line well into the URI string: * [training@localhost hands-on]$ hadoop fs -tail weblog/test_log s/javascript_combined.js HTTP/1.1 200 20404 larry.billy.will.amy - - [03/Dec/2011:13:28:09 -0800] GET /assets/img/home-logo.png HTTP/1.1 200 3892 larry.billy.will.amy - - [03/Dec/2011:13:28:09 -0800] GET /images/filmmediablock/360/019.jpg HTTP/1.1 200 74446 larry.billy.will.amy - - [03/Dec/2011:13:28:larry.-0800] GET /images/filmmediablock/360/g_still_04.jpg HTTP/1.1 200 761555 larry.billy.will.amy - - [03/Dec/2011:13:28:09 -0800] GET /images/filmmediablock/360/07082218.jpg HTTP/1.1 200 154609 larry.billy.will.amy - - [03/Dec/2011:13:28:larry.-0800] GET /images/filmpics//2229/GOEMON-NUKI-000163.jpg HTTP/1.1 200 184976 larry.billy.will.amy - - [03/Dec/2011:13:28:11 -0800] GET /images/filmmediablock/360/GOEMON-NUKI-000163.jpg HTTP/1.1 200 60117 larry.billy.will.amy - - [03/Dec/2011:13:28:larry.-0800] GET /images/filmmediablock/360/Chacha.jpg HTTP/1.1 200
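Jing Zhao's explanation accounts for the clipped octet: a byte-oriented tail seeks to (length - 1KB) and prints from there, with no attempt to align the cut on a line boundary, so the first surviving line is usually cut mid-field. A small sketch under that assumption (hypothetical code, not the actual FsShell implementation):

```java
import java.nio.charset.StandardCharsets;

public class TailDemo {
    // A byte-oriented tail: keep only the last maxBytes bytes, with no
    // attempt to align the cut on a line boundary.
    static String lastBytes(byte[] data, int maxBytes) {
        int start = Math.max(0, data.length - maxBytes);
        return new String(data, start, data.length - start, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] log = ("10.190.174.142 - - line one\n"
                    + "10.190.174.142 - - line two\n").getBytes(StandardCharsets.UTF_8);
        // The log is 56 bytes; keeping the last 54 cuts two bytes into
        // the first record, so the output starts with ".190.174.142",
        // just like the clipped octet in the report above.
        System.out.println(lastBytes(log, 54));
    }
}
```

No data is actually lost in HDFS; the truncation is purely an artifact of where the fixed 1KB window happens to start.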
[jira] [Moved] (HDFS-4838) Move addPersistedDelegationToken to AbstractDelegationTokenSecretManager
[ https://issues.apache.org/jira/browse/HDFS-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He moved YARN-697 to HDFS-4838: Issue Type: Improvement (was: Bug) Key: HDFS-4838 (was: YARN-697) Project: Hadoop HDFS (was: Hadoop YARN) Move addPersistedDelegationToken to AbstractDelegationTokenSecretManager Key: HDFS-4838 URL: https://issues.apache.org/jira/browse/HDFS-4838 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jian He Is it possible to move addPersistedDelegationToken in DelegationTokenSecretManager to AbstractDelegationTokenSecretManager? Also, is it possible to rename logUpdateMasterKey to storeNewMasterKey and logExpireToken to removeStoredToken for persisting and recovering keys/tokens? These methods are likely to be common and used by an overriding secret manager.
[jira] [Updated] (HDFS-4838) HDFS should use the new methods added in HADOOP-9574
[ https://issues.apache.org/jira/browse/HDFS-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated HDFS-4838: -- Summary: HDFS should use the new methods added in HADOOP-9574 (was: Move addPersistedDelegationToken to AbstractDelegationTokenSecretManager) HDFS should use the new methods added in HADOOP-9574 Key: HDFS-4838 URL: https://issues.apache.org/jira/browse/HDFS-4838 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jian He Is it possible to move addPersistedDelegationToken in DelegationTokenSecretManager to AbstractDelegationTokenSecretManager? Also, is it possible to rename logUpdateMasterKey to storeNewMasterKey and logExpireToken to removeStoredToken for persisting and recovering keys/tokens? These methods are likely to be common and used by an overriding secret manager.
[jira] [Updated] (HDFS-4838) HDFS should use the new methods added in HADOOP-9574
[ https://issues.apache.org/jira/browse/HDFS-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated HDFS-4838: -- Description: HADOOP-9574 copies addPersistedDelegationToken in hdfs.DelegationTokenSecretManager to common.AbstractDelegationTokenSecretManager. The HDFS code should be removed and should instead use the code in common. Also, is it possible to rename logUpdateMasterKey to storeNewMasterKey and logExpireToken to removeStoredToken for persisting and recovering keys/tokens? These methods are likely to be common and used by an overriding secret manager. was: Is it possible to move addPersistedDelegationToken in DelegationTokenSecretManager to AbstractDelegationTokenSecretManager? Also, Is it possible to rename logUpdateMasterKey to storeNewMasterKey AND logExpireToken to removeStoredToken for persisting and recovering keys/tokens? These methods are likely to be common methods and be used by overridden secretManager HDFS should use the new methods added in HADOOP-9574 Key: HDFS-4838 URL: https://issues.apache.org/jira/browse/HDFS-4838 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jian He HADOOP-9574 copies addPersistedDelegationToken in hdfs.DelegationTokenSecretManager to common.AbstractDelegationTokenSecretManager. The HDFS code should be removed and should instead use the code in common. Also, is it possible to rename logUpdateMasterKey to storeNewMasterKey and logExpireToken to removeStoredToken for persisting and recovering keys/tokens? These methods are likely to be common and used by an overriding secret manager.
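The proposed renames amount to turning the log* methods into storage hooks on the abstract base class that a persisting subclass overrides. A hypothetical sketch of that shape (the method names follow the proposal in this issue; the classes and the in-memory store are invented for illustration, not the actual Hadoop API):

```java
import java.util.HashSet;
import java.util.Set;

public class SecretManagerSketch {
    // Hypothetical stand-in for AbstractDelegationTokenSecretManager.
    abstract static class Base {
        // Storage hooks: no-ops here, overridden by managers that
        // persist keys/tokens (e.g. to an edit log or state store).
        protected void storeNewMasterKey(String key) {}
        protected void removeStoredToken(String token) {}

        final void rollMasterKey(String key) {
            storeNewMasterKey(key);   // persist before handing out
        }
        final void cancelToken(String token) {
            removeStoredToken(token); // drop the persisted copy
        }
    }

    // Hypothetical persisting subclass, using an in-memory set as its
    // "store" purely for illustration.
    static class Persisting extends Base {
        final Set<String> store = new HashSet<>();
        @Override protected void storeNewMasterKey(String key) { store.add(key); }
        @Override protected void removeStoredToken(String token) { store.remove(token); }
    }
}
```

With this shape, subclasses that do not persist state inherit the no-op hooks, while a recoverable secret manager only has to override the two storage methods.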
[jira] [Updated] (HDFS-4837) Allow DFSAdmin to run when HDFS is not the default file system
[ https://issues.apache.org/jira/browse/HDFS-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mostafa Elhemali updated HDFS-4837: --- Attachment: HDFS-4837.patch Attached a simple patch for trunk (to be honest, I haven't tested it out yet). Allow DFSAdmin to run when HDFS is not the default file system -- Key: HDFS-4837 URL: https://issues.apache.org/jira/browse/HDFS-4837 Project: Hadoop HDFS Issue Type: New Feature Reporter: Mostafa Elhemali Assignee: Mostafa Elhemali Attachments: HDFS-4837.patch When Hadoop is configured with a default file system other than HDFS but still has an HDFS NameNode running, we are unable to run dfsadmin commands. I suggest that DFSAdmin use the same mechanism the NameNode uses today to get its address: look at dfs.namenode.rpc-address, and if it is not set, fall back to getting it from the default file system.
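The proposed lookup order can be sketched as follows, with a plain Map standing in for Hadoop's Configuration object; the config key names are the ones cited in the issue, but the helper itself is hypothetical:

```java
import java.net.URI;
import java.util.Map;

public class NameNodeAddressSketch {
    // Resolve the NameNode RPC address the way the issue proposes:
    // prefer the explicit dfs.namenode.rpc-address key, and only fall
    // back to the default file system when it is an hdfs:// URI.
    static String resolveRpcAddress(Map<String, String> conf) {
        String explicit = conf.get("dfs.namenode.rpc-address");
        if (explicit != null && !explicit.isEmpty()) {
            return explicit;
        }
        URI defaultFs = URI.create(conf.getOrDefault("fs.defaultFS", "file:///"));
        if ("hdfs".equals(defaultFs.getScheme())) {
            return defaultFs.getAuthority(); // host:port of the NameNode
        }
        throw new IllegalStateException(
            "dfs.namenode.rpc-address is unset and the default FS is not HDFS");
    }
}
```

For example, a cluster whose default FS is some other store would still resolve correctly as long as dfs.namenode.rpc-address is set, which is exactly the case dfsadmin fails on today.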
[jira] [Commented] (HDFS-3875) Issue handling checksum errors in write pipeline
[ https://issues.apache.org/jira/browse/HDFS-3875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661161#comment-13661161 ] Suresh Srinivas commented on HDFS-3875: --- [~kihwal] the new solution looks much better. Nice work! Some minor comments. +1 with those addressed: # DFSOutputStream.java #* Initialize lastAckedSeqnoBeforeFailure to an appropriate value; lastAckedSeqNo is initialized to -1. #* Change the info log to a warn? Also, instead of Already tried 5 times, say Already retried 5 times, given that total attempts are 6 and retries are 5. # DFSClientFaultInjector#uncorruptPacket() - does it need to throw IOException? Issue handling checksum errors in write pipeline Key: HDFS-3875 URL: https://issues.apache.org/jira/browse/HDFS-3875 Project: Hadoop HDFS Issue Type: Bug Components: datanode, hdfs-client Affects Versions: 2.0.2-alpha Reporter: Todd Lipcon Assignee: Kihwal Lee Priority: Critical Attachments: hdfs-3875.branch-0.23.no.test.patch.txt, hdfs-3875.branch-0.23.patch.txt, hdfs-3875.branch-0.23.with.test.patch.txt, hdfs-3875.patch.txt, hdfs-3875.trunk.no.test.patch.txt, hdfs-3875.trunk.no.test.patch.txt, hdfs-3875.trunk.patch.txt, hdfs-3875.trunk.patch.txt, hdfs-3875.trunk.with.test.patch.txt, hdfs-3875.trunk.with.test.patch.txt, hdfs-3875-wip.patch We saw this issue with one block in a large test cluster. The client is storing the data with replication level 2, and we saw the following: - the second node in the pipeline detects a checksum error on the data it received from the first node. We don't know if the client sent a bad checksum, or if it got corrupted between node 1 and node 2 in the pipeline. - this caused the second node to get kicked out of the pipeline, since it threw an exception.
The pipeline started up again with only one replica (the first node in the pipeline) - this replica was later determined to be corrupt by the block scanner, and was unrecoverable since it was the only replica.
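Suresh's wording point - 6 total attempts means 5 retries, so the message should say "retried", not "tried" - can be illustrated with a small bounded-retry sketch (hypothetical code, not DFSOutputStream's actual pipeline-recovery logic):

```java
import java.util.function.BooleanSupplier;

public class RetryCountSketch {
    static final int MAX_RETRIES = 5; // 6 total attempts = 1 try + 5 retries

    // Run attempt until it succeeds or the retry budget is used up,
    // and return the message we would log.
    static String runWithRetries(BooleanSupplier attempt) {
        for (int retries = 0; retries <= MAX_RETRIES; retries++) {
            if (attempt.getAsBoolean()) {
                return "succeeded after " + retries + " retries";
            }
        }
        // "retried", not "tried": the first attempt is not a retry.
        return "Already retried " + MAX_RETRIES + " times";
    }
}
```

The loop body executes at most MAX_RETRIES + 1 times, which is where the attempts-versus-retries off-by-one in the log message comes from.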
[jira] [Updated] (HDFS-4837) Allow DFSAdmin to run when HDFS is not the default file system
[ https://issues.apache.org/jira/browse/HDFS-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mostafa Elhemali updated HDFS-4837: --- Status: Patch Available (was: Open) Allow DFSAdmin to run when HDFS is not the default file system -- Key: HDFS-4837 URL: https://issues.apache.org/jira/browse/HDFS-4837 Project: Hadoop HDFS Issue Type: New Feature Reporter: Mostafa Elhemali Assignee: Mostafa Elhemali Attachments: HDFS-4837.patch When Hadoop is configured with a default file system other than HDFS but still has an HDFS NameNode running, we are unable to run dfsadmin commands. I suggest that DFSAdmin use the same mechanism the NameNode uses today to get its address: look at dfs.namenode.rpc-address, and if it is not set, fall back to getting it from the default file system.
[jira] [Commented] (HDFS-4837) Allow DFSAdmin to run when HDFS is not the default file system
[ https://issues.apache.org/jira/browse/HDFS-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661227#comment-13661227 ] Hadoop QA commented on HDFS-4837: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12583706/HDFS-4837.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4413//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4413//console This message is automatically generated. Allow DFSAdmin to run when HDFS is not the default file system -- Key: HDFS-4837 URL: https://issues.apache.org/jira/browse/HDFS-4837 Project: Hadoop HDFS Issue Type: New Feature Reporter: Mostafa Elhemali Assignee: Mostafa Elhemali Attachments: HDFS-4837.patch When Hadoop is configured with a default file system other than HDFS but still has an HDFS NameNode running, we are unable to run dfsadmin commands. I suggest that DFSAdmin use the same mechanism the NameNode uses today to get its address: look at dfs.namenode.rpc-address, and if it is not set, fall back to getting it from the default file system.