[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351449#comment-14351449 ] Chris Douglas commented on HDFS-7411: - Looked through the patch; it addresses the feedback. [~szetszwo], do you want to review the patch before commit? > Refactor and improve decommissioning logic into DecommissionManager > --- > > Key: HDFS-7411 > URL: https://issues.apache.org/jira/browse/HDFS-7411 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.5.1 >Reporter: Andrew Wang >Assignee: Andrew Wang > Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, > hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, > hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch, > hdfs-7411.009.patch, hdfs-7411.010.patch, hdfs-7411.011.patch > > > Would be nice to split out decommission logic from DatanodeManager to > DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7838) Expose truncate API for libhdfs
[ https://issues.apache.org/jira/browse/HDFS-7838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-7838: - Attachment: HDFS-7838.002.patch Thanks Colin for the review; I've updated the patch to address your comments. {quote} Also, can you add a stub hdfsTruncateFile function to libwebhdfs that returns ENOTSUP, and file a jira to add truncate support to libwebhdfs? or just implement it in this patch, your choice. {quote} I filed HDFS-7902 for it, thanks. > Expose truncate API for libhdfs > --- > > Key: HDFS-7838 > URL: https://issues.apache.org/jira/browse/HDFS-7838 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode >Affects Versions: 2.7.0 >Reporter: Yi Liu >Assignee: Yi Liu > Fix For: 2.7.0 > > Attachments: HDFS-7838.001.patch, HDFS-7838.002.patch > > > It's good to expose truncate in libhdfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7902) Expose truncate API for libwebhdfs
Yi Liu created HDFS-7902: Summary: Expose truncate API for libwebhdfs Key: HDFS-7902 URL: https://issues.apache.org/jira/browse/HDFS-7902 Project: Hadoop HDFS Issue Type: Improvement Components: native, webhdfs Affects Versions: 2.7.0 Reporter: Yi Liu Assignee: Yi Liu As Colin suggested in HDFS-7838, we will add truncate support for libwebhdfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7312) Update DistCp v1 to optionally not use tmp location (branch-1 only)
[ https://issues.apache.org/jira/browse/HDFS-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-7312: Labels: (was: reviewed) > Update DistCp v1 to optionally not use tmp location (branch-1 only) > --- > > Key: HDFS-7312 > URL: https://issues.apache.org/jira/browse/HDFS-7312 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Affects Versions: 1.2.1 >Reporter: Joseph Prosser >Assignee: Joseph Prosser >Priority: Minor > Fix For: 1.3.0 > > Attachments: HDFS-7312.001.patch, HDFS-7312.002.patch, > HDFS-7312.003.patch, HDFS-7312.004.patch, HDFS-7312.005.patch, > HDFS-7312.006.patch, HDFS-7312.007.patch, HDFS-7312.008.patch, HDFS-7312.patch > > Original Estimate: 72h > Remaining Estimate: 72h > > DistCp v1 currently copies files to a tmp location and then renames that to > the specified destination. This can cause performance issues on filesystems > such as S3. A -skiptmp flag will be added to bypass this step and copy > directly to the destination. This feature mirrors a similar one added to > HBase ExportSnapshot > [HBASE-9|https://issues.apache.org/jira/browse/HBASE-9] > NOTE: This is a branch-1 change only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7312) Update DistCp v1 to optionally not use tmp location (branch-1 only)
[ https://issues.apache.org/jira/browse/HDFS-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-7312: Hadoop Flags: Reviewed > Update DistCp v1 to optionally not use tmp location (branch-1 only) > --- > > Key: HDFS-7312 > URL: https://issues.apache.org/jira/browse/HDFS-7312 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Affects Versions: 1.2.1 >Reporter: Joseph Prosser >Assignee: Joseph Prosser >Priority: Minor > Labels: reviewed > Fix For: 1.3.0 > > Attachments: HDFS-7312.001.patch, HDFS-7312.002.patch, > HDFS-7312.003.patch, HDFS-7312.004.patch, HDFS-7312.005.patch, > HDFS-7312.006.patch, HDFS-7312.007.patch, HDFS-7312.008.patch, HDFS-7312.patch > > Original Estimate: 72h > Remaining Estimate: 72h > > DistCp v1 currently copies files to a tmp location and then renames that to > the specified destination. This can cause performance issues on filesystems > such as S3. A -skiptmp flag will be added to bypass this step and copy > directly to the destination. This feature mirrors a similar one added to > HBase ExportSnapshot > [HBASE-9|https://issues.apache.org/jira/browse/HBASE-9] > NOTE: This is a branch-1 change only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7312) Update DistCp v1 to optionally not use tmp location (branch-1 only)
[ https://issues.apache.org/jira/browse/HDFS-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-7312: Labels: reviewed (was: ) > Update DistCp v1 to optionally not use tmp location (branch-1 only) > --- > > Key: HDFS-7312 > URL: https://issues.apache.org/jira/browse/HDFS-7312 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Affects Versions: 1.2.1 >Reporter: Joseph Prosser >Assignee: Joseph Prosser >Priority: Minor > Labels: reviewed > Fix For: 1.3.0 > > Attachments: HDFS-7312.001.patch, HDFS-7312.002.patch, > HDFS-7312.003.patch, HDFS-7312.004.patch, HDFS-7312.005.patch, > HDFS-7312.006.patch, HDFS-7312.007.patch, HDFS-7312.008.patch, HDFS-7312.patch > > Original Estimate: 72h > Remaining Estimate: 72h > > DistCp v1 currently copies files to a tmp location and then renames that to > the specified destination. This can cause performance issues on filesystems > such as S3. A -skiptmp flag will be added to bypass this step and copy > directly to the destination. This feature mirrors a similar one added to > HBase ExportSnapshot > [HBASE-9|https://issues.apache.org/jira/browse/HBASE-9] > NOTE: This is a branch-1 change only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351384#comment-14351384 ] stack commented on HDFS-7844: - Late to the game. Patch is great. Creative. All comments can be addressed later. High level, any notion of difference in perf when comparing native to offheap to the current implementation? If we fail to pick up the configured memory manager (or the default), it's worth a WARN log. Otherwise, folks may be confounded that they are getting the native memory manager though they asked for something else: 83 public static MemoryManager create(String name, Configuration conf) { 84 String memoryManagerKey = conf.get( 85 CommonConfigurationKeys.HADOOP_MEMORY_MANAGER_KEY, 86 CommonConfigurationKeys.HADOOP_MEMORY_MANAGER_DEFAULT); 87 if (memoryManagerKey == null) { 88 memoryManagerKey = NativeMemoryManager.class.getCanonicalName(); 89 } Is this an arbitrary max? private final static long MAX_ADDRESS = 0x3fffL; ByteArrayMemoryManager is just throwaway, for testing? Otherwise, protect Log.TRACE with LOG.isTraceEnabled... It's fun the way you did ByteArrayMemoryManager mapping address to a Map. Ok, I see, BAMM is just for testing. Ignore above. nit: make a method rather than dup the below...: 145 Entry entry = buffers.floorEntry(Long.valueOf(addr)); 146 if (entry == null) { 147 throw new RuntimeException("Wrote to unallocated address 0x" + 148 Long.toHexString(addr)); 149 } The method would return a byte array gotten from the TreeMap... etc. Does logging open at DEBUG but close at TRACE lead to confusion? Stumped debugger? The close has to let out an IOE? What is the caller going to do w/ this IOE? The ByteArrayMemoryManager close error string construction is the same as close on ProbingHashTable? Yeah man, put the Log.TRACE behind a test for TRACEyness. I buy the compactness invariant. I like the compromise put upon the Iterator (that resize is allowed while iterating...) Seems appropriate given where this is to be deployed. On TestMemoryManager, maybe parameterize so we go once through with ByteArrayMemoryManager and then do a run with the offheap implementation, rather than have a dedicated test for each: https://github.com/junit-team/junit/wiki/Parameterized-tests TestProbingHashTable is fun with its BlockInfo, etc., implementations. > Create an off-heap hash table implementation > > > Key: HDFS-7844 > URL: https://issues.apache.org/jira/browse/HDFS-7844 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7836 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-7844-scl.001.patch, HDFS-7844-scl.002.patch, > HDFS-7844-scl.003.patch > > > Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
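Regarding the parameterized-test suggestion above, a minimal sketch of the JUnit 4 {{Parameterized}} structure follows; the memory-manager class names and the {{MemoryManager.create()}} call are assumptions taken from the snippet quoted in the review, not from the actual patch.
{code}
import java.util.Arrays;
import java.util.Collection;

import org.junit.Assert;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;

// Hedged sketch: run the same test body once per MemoryManager implementation
// instead of keeping a dedicated test class for each.
@RunWith(Parameterized.class)
public class TestMemoryManagerParameterized {
  @Parameters(name = "{0}")
  public static Collection<Object[]> managers() {
    return Arrays.asList(new Object[][] {
        { "ByteArrayMemoryManager" },   // on-heap, test-only implementation
        { "NativeMemoryManager" }       // off-heap implementation
    });
  }

  private final String managerName;

  public TestMemoryManagerParameterized(String managerName) {
    this.managerName = managerName;
  }

  @Test
  public void testAllocateAndFree() {
    // The real body would call something like
    // MemoryManager.create(managerName, conf), allocate, write, read back,
    // and free; those APIs belong to the patch under review and are only
    // referenced here, not exercised.
    Assert.assertNotNull(managerName);
  }
}
{code}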
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351346#comment-14351346 ] Zhe Zhang commented on HDFS-6450: - Thanks for the suggestion, Colin. Hedged pread is already handling BlockReaderLocal (by wrapping it as a Future). I guess it's reasonable to do the same for non-positional read too? Is it correct to understand hedged vs. non-hedged and positional vs. non-positional as orthogonal dimensions? If so, the only new requirement in hedged non-positional read is to utilize and maintain the states (pos, blockReader). I'm still getting my head around this complex reader code, so please let me know if I missed something. > Support non-positional hedged reads in HDFS > --- > > Key: HDFS-6450 > URL: https://issues.apache.org/jira/browse/HDFS-6450 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.4.0 >Reporter: Colin Patrick McCabe >Assignee: Liang Xie > Attachments: HDFS-6450-like-pread.txt > > > HDFS-5776 added support for hedged positional reads. We should also support > hedged non-positional reads (aka regular reads). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
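For readers outside this thread, the sketch below shows the generic hedged-read pattern being discussed: wrap the read as a Future and race a second replica after a timeout. It is deliberately independent of the real DFSInputStream internals; the Callable stand-ins for replica reads are assumptions, not the project's API.
{code}
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Generic hedged-request pattern: submit the primary read as a Future; if it
// has not finished within the hedge timeout, submit a read against another
// replica and return whichever completes first.
public class HedgedReadSketch {
  private final ExecutorService pool = Executors.newCachedThreadPool();

  public byte[] hedgedRead(Callable<byte[]> primaryReplicaRead,
                           Callable<byte[]> alternateReplicaRead,
                           long hedgeTimeoutMs) throws Exception {
    CompletionService<byte[]> cs = new ExecutorCompletionService<>(pool);
    cs.submit(primaryReplicaRead);
    Future<byte[]> done = cs.poll(hedgeTimeoutMs, TimeUnit.MILLISECONDS);
    if (done == null) {
      cs.submit(alternateReplicaRead);   // hedge: race a second replica
      done = cs.take();                  // first of the two to finish wins
    }
    // A real implementation would also cancel the losing read and maintain
    // the stream state (pos, blockReader) mentioned in the comment above.
    return done.get();
  }
}
{code}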
[jira] [Commented] (HDFS-5796) The file system browser in the namenode UI requires SPNEGO.
[ https://issues.apache.org/jira/browse/HDFS-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351319#comment-14351319 ] Allen Wittenauer commented on HDFS-5796: (And, just to make clear the impact, this issue prevents us from upgrading Hadoop.) > The file system browser in the namenode UI requires SPNEGO. > --- > > Key: HDFS-5796 > URL: https://issues.apache.org/jira/browse/HDFS-5796 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.5.0 >Reporter: Kihwal Lee >Assignee: Arun Suresh >Priority: Blocker > Attachments: HDFS-5796.1.patch, HDFS-5796.1.patch, HDFS-5796.2.patch, > HDFS-5796.3.patch, HDFS-5796.3.patch > > > After HDFS-5382, the browser makes webhdfs REST calls directly, requiring > SPNEGO to work between user's browser and namenode. This won't work if the > cluster's security infrastructure is isolated from the regular network. > Moreover, SPNEGO is not supposed to be required for user-facing web pages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7877) Support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-7877: -- Attachment: HDFS-7877.patch Supportmaintenancestatefordatanodes.pdf Here are the initial design document and draft patch. Appreciate any input others might have. To support maintenance state, we need to provide an admin interface, manage the datanode state transitions, and handle block-related operations. After we agree on the design, we can break the feature into subtasks. > Support maintenance state for datanodes > --- > > Key: HDFS-7877 > URL: https://issues.apache.org/jira/browse/HDFS-7877 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Ming Ma > Attachments: HDFS-7877.patch, Supportmaintenancestatefordatanodes.pdf > > > This requirement came up during the design for HDFS-7541. Given this feature > is mostly independent of the upgrade domain feature, it is better to track it > under a separate jira. The design and draft patch will be available soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7886) TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes
[ https://issues.apache.org/jira/browse/HDFS-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351313#comment-14351313 ] Konstantin Shvachko commented on HDFS-7886: --- More things that we've been looking at with Plamen. 5. The race condition is in {{FsDatasetImpl.getBlockReports()}}, which collects the references to replicas under the {{synchronized}} section, but then constructs {{BlockListAsLongs}} outside of it. So if the recovery is triggered between them, then a replica can change its state. Here it changes from RUR to FINALIZED. 6. {{testTruncateWithDataNodesRestartImmediately()}} occasionally fails because the block is recovered only on two DNs. This happens because the NN does not know that two DNs were restarted and can schedule block recovery with a mixture of old (before the restart) and new (after the restart) locations. If the old location is used then recovery fails, because the DNs have been restarted under new addresses. {{waitActive()}} doesn't help here. We should somehow check that all new DNs have been registered and sent block reports. > TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes > > > Key: HDFS-7886 > URL: https://issues.apache.org/jira/browse/HDFS-7886 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.0 >Reporter: Yi Liu >Assignee: Plamen Jeliazkov >Priority: Minor > Attachments: HDFS-7886.patch > > > https://builds.apache.org/job/PreCommit-HDFS-Build/9730//testReport/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
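To make item 5 concrete for readers outside this thread, the sketch below contrasts the racy shape (copy the replica references under the lock, serialize outside it) with a safe shape (serialize the reported fields while still holding the lock). It is an illustration only; {{ReplicaInfo}} here is a stand-in, not the actual FsDatasetImpl code.
{code}
import java.util.ArrayList;
import java.util.List;

// Illustration of the race described in item 5; ReplicaInfo is a stand-in.
class BlockReportRaceSketch {
  private final Object datasetLock = new Object();
  private final List<ReplicaInfo> replicas = new ArrayList<ReplicaInfo>();

  // Racy shape: references are copied under the lock, but the report is
  // built outside it, so a replica can change state (e.g. RUR -> FINALIZED)
  // between the two steps.
  List<String> getBlockReportRacy() {
    List<ReplicaInfo> refs;
    synchronized (datasetLock) {
      refs = new ArrayList<ReplicaInfo>(replicas);
    }
    List<String> report = new ArrayList<String>();
    for (ReplicaInfo r : refs) {
      report.add(r.blockId + ":" + r.state);   // state may already be stale
    }
    return report;
  }

  // Safe shape: snapshot the fields being reported while holding the lock.
  List<String> getBlockReportSafe() {
    List<String> report = new ArrayList<String>();
    synchronized (datasetLock) {
      for (ReplicaInfo r : replicas) {
        report.add(r.blockId + ":" + r.state);
      }
    }
    return report;
  }

  static class ReplicaInfo {
    volatile long blockId = 1L;
    volatile String state = "FINALIZED";
  }
}
{code}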
[jira] [Comment Edited] (HDFS-5796) The file system browser in the namenode UI requires SPNEGO.
[ https://issues.apache.org/jira/browse/HDFS-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351307#comment-14351307 ] Allen Wittenauer edited comment on HDFS-5796 at 3/7/15 1:53 AM: bq. when security is enabled, WebHDFS by default picks up SPNEGO + KerberosAuthFilter. So the UI works, but only when the browser is launched after a kinit. If I don't do a kinit, I cannot browse files through the UI - this is the loss of functionality that is being discussed here? No. The key point in that summary is "by default". If you need something that isn't the default, the whole system falls apart. The fundamental problem is that if you use something like the AltKerberos filter, it flat out doesn't work. There are two key problems we've noticed: a) filter parameters don't get passed down to either AltK's SPNEGO filter or a user's custom one b) after we did some custom hacking, we noticed that cookie secret handling is broken. Thus, using a browser to peruse HDFS with custom auth is completely broken in 2.6 and up due to the removal of the old UI. bq. with HDFS-5716, you can turn the KerberosAuthFilter off and replace it with PseudoAuthFilter, but then the UI as well as applications always thinks you are dr.who. So, I guess this is not acceptable? No. HDFS-5716 just flat doesn't work in practice due to the above issues. It isn't reflective of real world usage at all. (.. and, believe me, we've tried to make it work without completely rewriting the built-in AltKerberos filter.) There's a very high chance that HADOOP-10709 might actually fix our issues, but the person who was testing for me today went home ill. :( So hopefully we'll try to verify on Monday. bq. Dr. Who I think Arun was thinking we needed to provide a 'default alternative', but I think we've cleared up that isn't actually necessary. The 'default alternative' really is the AltKerberos filter that already ships with Hadoop. was (Author: aw): bq. when security is enabled, WebHDFS by default picks up SPNEGO + KerberosAuthFilter. So the UI works, but only when the browser is launched after a kinit. If I don't do a kinit, I cannot browse files through the UI - this is the loss of functionality that is being discussed here? No. The key point in that summary is "by default". If you need something that isn't the default, the whole system falls apart. The fundamental problem is that if you use something like the AltKerberos filter, it flat out doesn't work. There are two key problems we've noticed: a) filter parameters don't get passed down to either AltK's SPNEGO filter or a user's custom one b) after we did some custom hacking, we noticed that cookie secret handling is broken. Thus, using a browser to peruse HDFS is completely broken in 2.6 and up due to the removal of the old UI. bq. with HDFS-5716, you can turn the KerberosAuthFilter off and replace it with PseudoAuthFilter, but then the UI as well as applications always thinks you are dr.who. So, I guess this is not acceptable? No. HDFS-5716 just flat doesn't work in practice due to the above issues. It isn't reflective of real world usage at all. (.. and, believe me, we've tried to make it work without completely rewriting the built-in AltKerberos filter.) There's a very high chance that HADOOP-10709 might actually fix our issues, but the person who was testing for me today went home ill. :( So hopefully we'll try to verify on Monday. > The file system browser in the namenode UI requires SPNEGO.
> --- > > Key: HDFS-5796 > URL: https://issues.apache.org/jira/browse/HDFS-5796 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.5.0 >Reporter: Kihwal Lee >Assignee: Arun Suresh >Priority: Blocker > Attachments: HDFS-5796.1.patch, HDFS-5796.1.patch, HDFS-5796.2.patch, > HDFS-5796.3.patch, HDFS-5796.3.patch > > > After HDFS-5382, the browser makes webhdfs REST calls directly, requiring > SPNEGO to work between user's browser and namenode. This won't work if the > cluster's security infrastructure is isolated from the regular network. > Moreover, SPNEGO is not supposed to be required for user-facing web pages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5796) The file system browser in the namenode UI requires SPNEGO.
[ https://issues.apache.org/jira/browse/HDFS-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351307#comment-14351307 ] Allen Wittenauer commented on HDFS-5796: bq. when security is enabled, WebHDFS by default picks up SPNEGO + KerberosAuthFilter. So the UI works, but only when the browser is launched after a kinit. If I don't do a kinit, I cannot browse files through the UI - this is the loss of functionality that is being discussed here? No. The key point in that summary is "by default". If you need something that isn't the default, the whole system falls apart. The fundamental problem is that if you use something like the AltKerberos filter, it flat out doesn't work. There are two key problems we've noticed: a) filter parameters don't get passed down to either AltK's SPNEGO filter or a user's custom one b) after we did some custom hacking, we noticed that cookie secret handling is broken. Thus, using a browser to peruse HDFS is completely broken in 2.6 and up due to the removal of the old UI. bq. with HDFS-5716, you can turn the KerberosAuthFilter off and replace it with PseudoAuthFilter, but then the UI as well as applications always thinks you are dr.who. So, I guess this is not acceptable? No. HDFS-5716 just flat doesn't work in practice due to the above issues. It isn't reflective of real world usage at all. (.. and, believe me, we've tried to make it work without completely rewriting the built-in AltKerberos filter.) There's a very high chance that HADOOP-10709 might actually fix our issues, but the person who was testing for me today went home ill. :( So hopefully we'll try to verify on Monday. > The file system browser in the namenode UI requires SPNEGO. > --- > > Key: HDFS-5796 > URL: https://issues.apache.org/jira/browse/HDFS-5796 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.5.0 >Reporter: Kihwal Lee >Assignee: Arun Suresh >Priority: Blocker > Attachments: HDFS-5796.1.patch, HDFS-5796.1.patch, HDFS-5796.2.patch, > HDFS-5796.3.patch, HDFS-5796.3.patch > > > After HDFS-5382, the browser makes webhdfs REST calls directly, requiring > SPNEGO to work between user's browser and namenode. This won't work if the > cluster's security infrastructure is isolated from the regular network. > Moreover, SPNEGO is not supposed to be required for user-facing web pages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7854) Separate class DataStreamer out of DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351298#comment-14351298 ] Hadoop QA commented on HDFS-7854: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703032/HDFS-7854-002.patch against trunk revision 21101c0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1156 javac compiler warnings (more than the trunk's current 1155 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9785//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9785//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9785//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9785//console This message is automatically generated. > Separate class DataStreamer out of DFSOutputStream > -- > > Key: HDFS-7854 > URL: https://issues.apache.org/jira/browse/HDFS-7854 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Li Bo >Assignee: Li Bo > Attachments: HDFS-7854-001.patch, HDFS-7854-002.patch > > > This sub task separate DataStreamer from DFSOutputStream. New DataStreamer > will accept packets and write them to remote datanodes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351290#comment-14351290 ] Colin Patrick McCabe commented on HDFS-7844: Thanks, Charles. I think we can cover most of that stuff (maybe not the curAddress > MAX_ADDRESS part, but the others...) in a follow-on. Good reviews by you and Yi. > Create an off-heap hash table implementation > > > Key: HDFS-7844 > URL: https://issues.apache.org/jira/browse/HDFS-7844 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7836 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-7844-scl.001.patch, HDFS-7844-scl.002.patch, > HDFS-7844-scl.003.patch > > > Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351282#comment-14351282 ] Charles Lamb commented on HDFS-7844: Thanks Colin, +1, I'll file a follow up jira for the coverage. > Create an off-heap hash table implementation > > > Key: HDFS-7844 > URL: https://issues.apache.org/jira/browse/HDFS-7844 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7836 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-7844-scl.001.patch, HDFS-7844-scl.002.patch, > HDFS-7844-scl.003.patch > > > Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7758) Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead
[ https://issues.apache.org/jira/browse/HDFS-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351259#comment-14351259 ] Lei (Eddy) Xu commented on HDFS-7758: - [~jpallas] What I am going to do is something like this: {code} public static interface ReferredFsVolumeList extends Iterable, Closeable { } /** * Returns a list of volume references. * * The caller must release the reference of each volume by calling * {@link FsVolumeReference#close}. */ public ReferredFsVolumeList getReferredVolumes(); {code} In this way, findbugs should be able to catch it? > Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead > - > > Key: HDFS-7758 > URL: https://issues.apache.org/jira/browse/HDFS-7758 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.6.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7758.000.patch, HDFS-7758.001.patch > > > HDFS-7496 introduced reference-counting of the volume instances being used, to > prevent a race condition when hot swapping a volume. > However, {{FsDatasetSpi#getVolumes()}} can still leak a volume instance > without increasing its reference count. In this JIRA, we retire > {{FsDatasetSpi#getVolumes()}} and propose {{FsDatasetSpi#getVolumeRefs()}} > and similar methods to access {{FsVolume}}. This makes sure that the consumer > of {{FsVolume}} always holds a correct reference count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
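As a usage illustration of the shape proposed above, the self-contained sketch below shows an {{Iterable}} that is also {{Closeable}}, released with try-with-resources; the type and method names are stand-ins, not the actual HDFS classes.
{code}
import java.io.Closeable;
import java.io.IOException;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Stand-in for the proposed ReferredFsVolumeList: iterable and closeable.
interface RefCountedVolumes<T> extends Iterable<T>, Closeable {
}

public class VolumeRefSketch {
  public static void main(String[] args) throws IOException {
    final List<String> names = Arrays.asList("/data/1", "/data/2");
    RefCountedVolumes<String> volumes = new RefCountedVolumes<String>() {
      @Override
      public Iterator<String> iterator() {
        return names.iterator();
      }
      @Override
      public void close() {
        // The real implementation would decrement the reference count of
        // every volume handed out by getReferredVolumes().
        System.out.println("released all volume references");
      }
    };
    // try-with-resources guarantees the references are released, and gives
    // static analyzers such as findbugs a Closeable to track.
    try (RefCountedVolumes<String> v = volumes) {
      for (String name : v) {
        System.out.println("inspecting " + name);
      }
    }
  }
}
{code}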
[jira] [Commented] (HDFS-7758) Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead
[ https://issues.apache.org/jira/browse/HDFS-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351249#comment-14351249 ] Joe Pallas commented on HDFS-7758: -- [~cmccabe], is findbugs actually smart enough to figure out that an iterator is {{Closeable}} at call sites, which only see the {{Iterator}} interface? Or would you need to define a new interface that extends both {{Iterator}} and {{Closeable}}? > Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead > - > > Key: HDFS-7758 > URL: https://issues.apache.org/jira/browse/HDFS-7758 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.6.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7758.000.patch, HDFS-7758.001.patch > > > HDFS-7496 introduced reference-counting of the volume instances being used, to > prevent a race condition when hot swapping a volume. > However, {{FsDatasetSpi#getVolumes()}} can still leak a volume instance > without increasing its reference count. In this JIRA, we retire > {{FsDatasetSpi#getVolumes()}} and propose {{FsDatasetSpi#getVolumeRefs()}} > and similar methods to access {{FsVolume}}. This makes sure that the consumer > of {{FsVolume}} always holds a correct reference count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7782) Read a striping layout file from client side
[ https://issues.apache.org/jira/browse/HDFS-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351243#comment-14351243 ] Zhe Zhang commented on HDFS-7782: - I realized that the current {{DFSInputStream}} non-positional read gets at most 64K at a time. Given that our striping cell size is 1MB by default and is not likely to be configured below 64KB, it doesn't make much sense to maintain multiple blockReaders. > Read a striping layout file from client side > > > Key: HDFS-7782 > URL: https://issues.apache.org/jira/browse/HDFS-7782 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Li Bo >Assignee: Zhe Zhang > Attachments: HDFS-7782-000.patch, HDFS-7782-001.patch > > > A client reading a file should not need to know or handle what > layout the file uses. This sub task adds logic to DFSInputStream to support > reading striping layout files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7816) Unable to open webhdfs paths with "+"
[ https://issues.apache.org/jira/browse/HDFS-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351229#comment-14351229 ] Haohui Mai commented on HDFS-7816: -- {code} + throws IOException { +URI uri; +try { + uri = new URI(decoder.path()); +} catch (java.net.URISyntaxException e) { + throw new IOException("Invalid path:", e); +} {code} For GC reasons it might make more sense to (1) take {{QueryStringDecoder.decodeComponent()}} and make some tweaks based on it, and it is okay to throw {{IllegalArgumentException}} directly (which will be translated to 400). > Unable to open webhdfs paths with "+" > - > > Key: HDFS-7816 > URL: https://issues.apache.org/jira/browse/HDFS-7816 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Jason Lowe >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-7816.patch, HDFS-7816.patch > > > webhdfs requests to open files with % characters in the filename fail because > the filename is not being decoded properly. For example: > $ hadoop fs -cat 'webhdfs://nn/user/somebody/abc%def' > cat: File does not exist: /user/somebody/abc%25def -- This message was sent by Atlassian JIRA (v6.3.4#6332)
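For context on the decoding tweak being discussed, the sketch below percent-decodes a path while leaving '+' untouched (the form-encoding rule of turning '+' into a space is what breaks paths like {{/user/somebody/a+b}}). It throws {{IllegalArgumentException}} on malformed escapes, as suggested above; it is illustrative only and is not the code from the patch.
{code}
import java.nio.charset.StandardCharsets;

// Illustrative percent-decoder for request paths: decodes %XX escapes but
// does not treat '+' as a space, unlike form/query-string decoding.
public final class PathDecoderSketch {
  public static String decodePath(String raw) {
    byte[] out = new byte[raw.length()];
    int n = 0;
    for (int i = 0; i < raw.length(); i++) {
      char c = raw.charAt(i);
      if (c == '%') {
        if (i + 2 >= raw.length()) {
          throw new IllegalArgumentException("Truncated escape in " + raw);
        }
        int hi = Character.digit(raw.charAt(i + 1), 16);
        int lo = Character.digit(raw.charAt(i + 2), 16);
        if (hi < 0 || lo < 0) {
          throw new IllegalArgumentException("Bad escape in " + raw);
        }
        out[n++] = (byte) ((hi << 4) | lo);
        i += 2;
      } else {
        out[n++] = (byte) c;   // '+' and other characters pass through as-is
      }
    }
    return new String(out, 0, n, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    System.out.println(decodePath("/user/somebody/abc%25def")); // abc%def
    System.out.println(decodePath("/user/somebody/a+b"));       // a+b, not "a b"
  }
}
{code}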
[jira] [Updated] (HDFS-7853) Erasure coding: extend LocatedBlocks to support reading from striped files
[ https://issues.apache.org/jira/browse/HDFS-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-7853: Attachment: HDFS-7853.002.patch Update the patch: # simplify BlockInfoStripedUC by removing the index array # fix a bug in FSImageFormatPBINode when loading an empty block array Currently the patch still uses an index array to distinguish LocatedBlock and LocatedStripedBlock. My main concern about using sentinel entries in LocatedBlock is that, with only sentinel entries, we may have to check the file's state to identify the type of the block. It may be better to infer this from the LocatedBlock itself. (But it should also be fine to use sentinel blocks + an extra field to achieve this.) > Erasure coding: extend LocatedBlocks to support reading from striped files > -- > > Key: HDFS-7853 > URL: https://issues.apache.org/jira/browse/HDFS-7853 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Zhe Zhang >Assignee: Jing Zhao > Attachments: HDFS-7853.000.patch, HDFS-7853.001.patch, > HDFS-7853.002.patch > > > We should extend {{LocatedBlocks}} class so {{getBlockLocations}} can work > with striping layout (possibly an extra list specifying the index of each > location in the group) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7722) DataNode#checkDiskError should also remove Storage when error is found.
[ https://issues.apache.org/jira/browse/HDFS-7722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351218#comment-14351218 ] Lei (Eddy) Xu commented on HDFS-7722: - [~cmccabe] Thanks for the review. I will make a patch to address your comments. [~cnauroth] Yes, you are right on this one. Sure, I believe we can hold committing this. A review from you early next week would be much appreciated! To add some background, the rationale of this patch is to provide users a convenient way to fix bad disks without touching configuration files, while also preserving disk failure information for reporting purposes. bq. I would like us to have some means to take corrective action and clear the volume failure information "online". For this concern, I suggest a follow-up JIRA to let {{DataNode#parseChangedVolume}} detect volumes that * are not in {{FsVolumeList}} * are not in {{DFS_DATANODE_DATA_DIR_KEYS}} * and are in {{volumeFailureInfos}}, and report them as {{DataNode#ChangedVolumes#deactiveLocations}}, so that the following logic can clear this failure info if the user _intends_ to do so. > DataNode#checkDiskError should also remove Storage when error is found. > --- > > Key: HDFS-7722 > URL: https://issues.apache.org/jira/browse/HDFS-7722 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7722.000.patch, HDFS-7722.001.patch, > HDFS-7722.002.patch > > > When {{DataNode#checkDiskError}} finds disk errors, it removes all block > metadata from {{FsDatasetImpl}}. However, it does not remove the > corresponding {{DataStorage}} and {{BlockPoolSliceStorage}}. > The result is that we could not directly run {{reconfig}} to hot swap the > failed disks without changing the configuration file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
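A minimal sketch of the follow-up detection logic proposed above: given the set of live volume paths, the configured data directories, and the recorded failures, pick out failed volumes the admin has removed from both. The collection names mirror the comment but are stand-ins for the real DataNode fields.
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Stand-in sketch of the proposed DataNode#parseChangedVolume follow-up.
class ChangedVolumeSketch {
  static List<String> findClearableFailures(
      Set<String> activeVolumes,       // paths currently in FsVolumeList
      Set<String> configuredVolumes,   // paths in dfs.datanode.data.dir
      Set<String> failedVolumes) {     // paths recorded in volumeFailureInfos
    List<String> clearable = new ArrayList<String>();
    for (String path : failedVolumes) {
      // A failed volume that is no longer active and no longer configured is
      // one the admin intends to drop, so its failure info can be cleared.
      if (!activeVolumes.contains(path) && !configuredVolumes.contains(path)) {
        clearable.add(path);
      }
    }
    return clearable;
  }
}
{code}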
[jira] [Commented] (HDFS-7816) Unable to open webhdfs paths with "+"
[ https://issues.apache.org/jira/browse/HDFS-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351208#comment-14351208 ] Haohui Mai commented on HDFS-7816: -- Discussed with [~vinodkv] offline. We can go back to the old behavior in branch-2 and fix it in trunk. Thoughts? > Unable to open webhdfs paths with "+" > - > > Key: HDFS-7816 > URL: https://issues.apache.org/jira/browse/HDFS-7816 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Jason Lowe >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-7816.patch, HDFS-7816.patch > > > webhdfs requests to open files with % characters in the filename fail because > the filename is not being decoded properly. For example: > $ hadoop fs -cat 'webhdfs://nn/user/somebody/abc%def' > cat: File does not exist: /user/somebody/abc%25def -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7816) Unable to open webhdfs paths with "+"
[ https://issues.apache.org/jira/browse/HDFS-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351192#comment-14351192 ] Vinod Kumar Vavilapalli commented on HDFS-7816: --- I may be being completely naive here, but how about we add a parameter or something which specifies RFCCompatibility and turn that off by default for older clients? > Unable to open webhdfs paths with "+" > - > > Key: HDFS-7816 > URL: https://issues.apache.org/jira/browse/HDFS-7816 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Jason Lowe >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-7816.patch, HDFS-7816.patch > > > webhdfs requests to open files with % characters in the filename fail because > the filename is not being decoded properly. For example: > $ hadoop fs -cat 'webhdfs://nn/user/somebody/abc%def' > cat: File does not exist: /user/somebody/abc%25def -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7722) DataNode#checkDiskError should also remove Storage when error is found.
[ https://issues.apache.org/jira/browse/HDFS-7722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351167#comment-14351167 ] Chris Nauroth commented on HDFS-7722: - [~eddyxu], sorry I haven't had a chance to dig into this patch yet. If I understand correctly, you're saying that removing a path from configuration and running reconfig will not clear volume failure information, but keeping the path in configuration, fixing the disk at that mount point and running reconfig will clear it. Do I have it right? I would like us to have some means to take corrective action and clear the volume failure information "online". As long as that's still possible in some way, then it's probably sticking to the spirit of the code I wrote earlier. Would you mind holding off the commit until early next week so I can take a closer look? Thanks! > DataNode#checkDiskError should also remove Storage when error is found. > --- > > Key: HDFS-7722 > URL: https://issues.apache.org/jira/browse/HDFS-7722 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7722.000.patch, HDFS-7722.001.patch, > HDFS-7722.002.patch > > > When {{DataNode#checkDiskError}} finds disk errors, it removes all block > metadata from {{FsDatasetImpl}}. However, it does not remove the > corresponding {{DataStorage}} and {{BlockPoolSliceStorage}}. > The result is that we could not directly run {{reconfig}} to hot swap the > failed disks without changing the configuration file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351163#comment-14351163 ] Charles Lamb commented on HDFS-7844: I applied your latest patch and set breakpoints at all of the exceptional throws in ByteArrayMemoryManager.java. Then I ran the unit test. The following lines did not trigger: 91, 94, 117, 129, 135, 165, 171, 190, 203, 245, 251. I think those are the exceptions in allocate, free, one of the ones in putShort, and all of the throws in the getters. > Create an off-heap hash table implementation > > > Key: HDFS-7844 > URL: https://issues.apache.org/jira/browse/HDFS-7844 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7836 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-7844-scl.001.patch, HDFS-7844-scl.002.patch, > HDFS-7844-scl.003.patch > > > Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5796) The file system browser in the namenode UI requires SPNEGO.
[ https://issues.apache.org/jira/browse/HDFS-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351156#comment-14351156 ] Vinod Kumar Vavilapalli commented on HDFS-5796: --- Hey everyone, I've been trying to understand the problem here, but it is a big wall of text. It'll be great if someone can help me. It seems like # when security is enabled, WebHDFS by default picks up SPNEGO + KerberosAuthFilter. So the UI works, but only when the browser is launched after a kinit. If I don't do a kinit, I cannot browse files through the UI - this is the loss of functionality that is being discussed here? # with HDFS-5716, you can turn the KerberosAuthFilter off and replace it with PseudoAuthFilter, but then the UI as well as applications always thinks you are dr.who. So, I guess this is not acceptable? # Is the patch trying to add (back) in a way to use KerberosAuthFilter for regular applications but use Dr.Who for browsers? And that is a security concern, so we don't want to put it back? Going back to the title, "The file system browser in the namenode UI requires SPNEGO.". Seems like with HDFS-5716, you can set your own filter and so the discussion is really about the defaults? Trying to gauge its priority for 2.7. Thanks. > The file system browser in the namenode UI requires SPNEGO. > --- > > Key: HDFS-5796 > URL: https://issues.apache.org/jira/browse/HDFS-5796 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.5.0 >Reporter: Kihwal Lee >Assignee: Arun Suresh >Priority: Blocker > Attachments: HDFS-5796.1.patch, HDFS-5796.1.patch, HDFS-5796.2.patch, > HDFS-5796.3.patch, HDFS-5796.3.patch > > > After HDFS-5382, the browser makes webhdfs REST calls directly, requiring > SPNEGO to work between user's browser and namenode. This won't work if the > cluster's security infrastructure is isolated from the regular network. > Moreover, SPNEGO is not supposed to be required for user-facing web pages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7722) DataNode#checkDiskError should also remove Storage when error is found.
[ https://issues.apache.org/jira/browse/HDFS-7722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351152#comment-14351152 ] Colin Patrick McCabe commented on HDFS-7722: Eddy and I had an offline discussion about the use of {{Set}} here. It seems that there is a pervasive assumption elsewhere in the code that FsVolumeSpi instances are directories. For example, in these interface methods: {code} /** @return the base path to the volume */ public String getBasePath(); /** @return the path to the volume */ public String getPath(String bpid) throws IOException; /** @return the directory for the finalized blocks in the block pool. */ public File getFinalizedDir(String bpid) throws IOException; {code} So I think using {{Set}} is OK here for now, since it fits in with the rest of the code. We will probably have to revisit this later, but it seems outside the scope of this jira. One thing I really like about this patch is the fact that we no longer hold the {{FsDatasetImpl}} mutex while scanning every volume. This alone is a very important improvement. I think it makes sense to leave the failure information around when removing volumes due to the disk checker. {code} 685 LOG.info("Deactivating volumes: " + 686 Joiner.on(",").join(absoluteVolumePaths)); {code} We should print out the value of {{clearFailure}} here. +1 once that's addressed. > DataNode#checkDiskError should also remove Storage when error is found. > --- > > Key: HDFS-7722 > URL: https://issues.apache.org/jira/browse/HDFS-7722 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7722.000.patch, HDFS-7722.001.patch, > HDFS-7722.002.patch > > > When {{DataNode#checkDiskError}} finds disk errors, it removes all block > metadata from {{FsDatasetImpl}}. However, it does not remove the > corresponding {{DataStorage}} and {{BlockPoolSliceStorage}}. > The result is that we could not directly run {{reconfig}} to hot swap the > failed disks without changing the configuration file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6488) Support HDFS superuser in NFSv3 gateway
[ https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351138#comment-14351138 ] Hudson commented on HDFS-6488: -- SUCCESS: Integrated in Hadoop-trunk-Commit #7277 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7277/]) HDFS-6488. Support HDFS superuser in NFSv3 gateway. Contributed by Brandon Li (brandonli: rev 0f8ecb1d0ce6d3ee9a7caf5b15b299210c2b8875) * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/conf/NfsConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsNfsGateway.md > Support HDFS superuser in NFSv3 gateway > --- > > Key: HDFS-6488 > URL: https://issues.apache.org/jira/browse/HDFS-6488 > Project: Hadoop HDFS > Issue Type: New Feature > Components: nfs >Affects Versions: 2.3.0 >Reporter: Stephen Chu >Assignee: Brandon Li > Fix For: 2.7.0 > > Attachments: HDFS-6488.001.patch, HDFS-6488.002.patch, > HDFS-6488.003.patch > > > As hdfs superuseruser on the NFS mount, I cannot cd or ls the > /user/schu/.Trash directory: > {code} > bash-4.1$ cd .Trash/ > bash: cd: .Trash/: Permission denied > bash-4.1$ ls -la > total 2 > drwxr-xr-x 4 schu 2584148964 128 Jan 7 10:42 . > drwxr-xr-x 4 hdfs 2584148964 128 Jan 6 16:59 .. > drwx-- 2 schu 2584148964 64 Jan 7 10:45 .Trash > drwxr-xr-x 2 hdfs hdfs64 Jan 7 10:42 tt > bash-4.1$ ls .Trash > ls: cannot open directory .Trash: Permission denied > bash-4.1$ > {code} > When using FsShell as hdfs superuser, I have superuser permissions to schu's > .Trash contents: > {code} > bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu > -rw-r--r-- 1 schu supergroup 4 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu/tf1 > {code} > The NFSv3 logs don't produce any error when superuser tries to access schu > Trash contents. However, for other permission errors (e.g. schu tries to > delete a directory owned by hdfs), there will be a permission error in the > logs. > I think this is not specific to the .Trash directory perhaps. > I created a /user/schu/dir1 which has the same permissions as .Trash (700). > When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, > I get the same permission denied. > {code} > [schu@hdfs-nfs ~]$ hdfs dfs -ls > Found 4 items > drwx-- - schu supergroup 0 2014-01-07 10:57 .Trash > drwx-- - schu supergroup 0 2014-01-07 11:05 dir1 > -rw-r--r-- 1 schu supergroup 4 2014-01-07 11:05 tf1 > drwxr-xr-x - hdfs hdfs0 2014-01-07 10:42 tt > bash-4.1$ whoami > hdfs > bash-4.1$ pwd > /hdfs_nfs_mount/user/schu > bash-4.1$ cd dir1 > bash: cd: dir1: Permission denied > bash-4.1$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351122#comment-14351122 ] Colin Patrick McCabe commented on HDFS-7844: bq. Several lines bust the 80 char limit. Thanks, I fixed a few cases. bq. What happens if someone runs this with the -d32 to the jvm? Do we need to make that check and throw accordingly? I don't think this would be a problem. We don't care about how big Java references are, since we're not using them. Even 32-bit machines should support {{Unsafe#getLong}}-- if necessary, through two 32-bit operations. bq. A small enhancement might be: close(boolean force) which will close unconditionally. I realize this is maybe a bit confusing, but we don't ever want to close with entries remaining in the hash table. The reason is because the caller is responsible for managing that memory. We wouldn't know what to do with it, so there would be a memory leak. Another thing to note is that in general, {{ProbingHashTable#close}} is really only a unit test thing. In real life, we would never actually close the BlocksMap in the NameNode... we'd just shut down the whole process when someone control-Cs. So we don't have to worry about how long close takes :) bq. The line in #getSlot which is hash = -hash is in fact tested by your unit tests, but I don't think it's tested by design in the test. You might want to put in an explicit test for that particular line. Believe me, without that line, the unit tests don't work. Ask me how I know. :) bq. expandTable: using catch(Throwable) feels like a rather wide net to cast, but I guess it's the right thing. I debated whether all you needed was catch (Error), but I guess you can't be sure that the callers above you won't just "keep going" after some RuntimeException gets into their hands. It's a little bit of paranoia on my part. Really, there should be no exceptions at all coming from that code, but given that this is Java, we can never actually guarantee that. Even methods that aren't declared to throw a particular exception can throw it, through the magic of classloaders. I wanted to be able to guarantee that there were no memory leaks, and this was the only way. And no, we can't rethrow the {{Throwable}} itself, because then Java complains that the function isn't declared to throw {{Throwable}}. So we wrap it in a {{RuntimeException}}. bq. The comment for #capacity() "total number of slots" is either misleading or wrong. Thanks. Let's just replace this with an accessor for {{numSlots}}. Now that the load factor is configurable, "capacity" is kind of a confusing term. bq. any reason not to have get/putShort along with the existing byte/int/long? Good idea, let's add that bq. Should #toString() be declared as... If you mean the toString function in the interfaces, everything in an interface is always public. And putting @Override there is not needed. The other places are already public and have @Override. bq. [MemoryManager] comments say nothing about whether it's thread safe or not. Ditto for ByteArrayMemoryManager. Let me add a comment to the base class JavaDoc. bq. There is no test coverage for the failure case of {{BAMM#close}} added bq. Why does curAddress start at 1000? It can start at any address other than 0. bq. For all of the put/get/byte/int/long routines, it wouldn't be hard to move all of the if() { throw new RuntimeException } snippets into their own routine. Maybe that's not worth the trouble, but it feels like there's a lot of repeated code.
I think this is not worth the trouble. Maybe later. bq. The indentation of #testMemoryManagerCreate formals is messed up. ok bq. testCatchInvalidPuts: you test putByte against freed memory, but not int or long. yeah let's test them all bq. the Assert.fail messages should be different for each fail() call. It's fine. There is a line number. bq. I tried running TestMemoryManager.testNativeMemoryManagerCreate and it failed like this: This was bugged after I added the "name" argument to the MemoryManager. Should be fixed now. > Create an off-heap hash table implementation > > > Key: HDFS-7844 > URL: https://issues.apache.org/jira/browse/HDFS-7844 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7836 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-7844-scl.001.patch, HDFS-7844-scl.002.patch, > HDFS-7844-scl.003.patch > > > Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
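The leak-guard pattern described in the expandTable discussion above is roughly the following; the {{Allocator}} interface here is a stand-in for the patch's MemoryManager, so this is a sketch of the idea rather than the actual code.
{code}
// Sketch of the "never leak the new allocation" guard around expandTable.
public class ExpandGuardSketch {
  interface Allocator {
    long allocate(long bytes);
    void free(long addr);
  }

  static long expand(Allocator allocator, long newSizeBytes, Runnable rehash) {
    long newTable = allocator.allocate(newSizeBytes);
    boolean success = false;
    try {
      rehash.run();            // copy existing slots into the new region
      success = true;
      return newTable;
    } catch (Throwable t) {
      // Wrap so callers see a single unchecked type; the essential part is
      // that no code path leaks newTable, since the GC never reclaims
      // off-heap memory.
      throw new RuntimeException("failed to expand hash table", t);
    } finally {
      if (!success) {
        allocator.free(newTable);
      }
    }
  }
}
{code}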
[jira] [Updated] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-7844: --- Attachment: HDFS-7844-scl.003.patch > Create an off-heap hash table implementation > > > Key: HDFS-7844 > URL: https://issues.apache.org/jira/browse/HDFS-7844 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7836 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-7844-scl.001.patch, HDFS-7844-scl.002.patch, > HDFS-7844-scl.003.patch > > > Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-6488) Support HDFS superuser in NFSv3 gateway
[ https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351115#comment-14351115 ] Brandon Li edited comment on HDFS-6488 at 3/6/15 11:28 PM: --- Thank you, Stephen, Colin, Akira and Jing. I've updated the title and committed the patch. was (Author: brandonli): Thank you, Stephen, Colin and Jing. I've updated the title and committed the patch. > Support HDFS superuser in NFSv3 gateway > --- > > Key: HDFS-6488 > URL: https://issues.apache.org/jira/browse/HDFS-6488 > Project: Hadoop HDFS > Issue Type: New Feature > Components: nfs >Affects Versions: 2.3.0 >Reporter: Stephen Chu >Assignee: Brandon Li > Fix For: 2.7.0 > > Attachments: HDFS-6488.001.patch, HDFS-6488.002.patch, > HDFS-6488.003.patch > > > As hdfs superuseruser on the NFS mount, I cannot cd or ls the > /user/schu/.Trash directory: > {code} > bash-4.1$ cd .Trash/ > bash: cd: .Trash/: Permission denied > bash-4.1$ ls -la > total 2 > drwxr-xr-x 4 schu 2584148964 128 Jan 7 10:42 . > drwxr-xr-x 4 hdfs 2584148964 128 Jan 6 16:59 .. > drwx-- 2 schu 2584148964 64 Jan 7 10:45 .Trash > drwxr-xr-x 2 hdfs hdfs64 Jan 7 10:42 tt > bash-4.1$ ls .Trash > ls: cannot open directory .Trash: Permission denied > bash-4.1$ > {code} > When using FsShell as hdfs superuser, I have superuser permissions to schu's > .Trash contents: > {code} > bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu > -rw-r--r-- 1 schu supergroup 4 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu/tf1 > {code} > The NFSv3 logs don't produce any error when superuser tries to access schu > Trash contents. However, for other permission errors (e.g. schu tries to > delete a directory owned by hdfs), there will be a permission error in the > logs. > I think this is not specific to the .Trash directory perhaps. > I created a /user/schu/dir1 which has the same permissions as .Trash (700). > When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, > I get the same permission denied. > {code} > [schu@hdfs-nfs ~]$ hdfs dfs -ls > Found 4 items > drwx-- - schu supergroup 0 2014-01-07 10:57 .Trash > drwx-- - schu supergroup 0 2014-01-07 11:05 dir1 > -rw-r--r-- 1 schu supergroup 4 2014-01-07 11:05 tf1 > drwxr-xr-x - hdfs hdfs0 2014-01-07 10:42 tt > bash-4.1$ whoami > hdfs > bash-4.1$ pwd > /hdfs_nfs_mount/user/schu > bash-4.1$ cd dir1 > bash: cd: dir1: Permission denied > bash-4.1$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7818) OffsetParam should return the default value instead of throwing NPE when the value is unspecified
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351118#comment-14351118 ] Eric Payne commented on HDFS-7818: -- Thank you [~wheat9] > OffsetParam should return the default value instead of throwing NPE when the > value is unspecified > - > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Fix For: 2.7.0 > > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt, HDFS-7818.v5.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6488) Support HDFS superuser in NFSv3 gateway
[ https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6488: - Fix Version/s: 2.7.0 > Support HDFS superuser in NFSv3 gateway > --- > > Key: HDFS-6488 > URL: https://issues.apache.org/jira/browse/HDFS-6488 > Project: Hadoop HDFS > Issue Type: New Feature > Components: nfs >Affects Versions: 2.3.0 >Reporter: Stephen Chu >Assignee: Brandon Li > Fix For: 2.7.0 > > Attachments: HDFS-6488.001.patch, HDFS-6488.002.patch, > HDFS-6488.003.patch > > > As hdfs superuseruser on the NFS mount, I cannot cd or ls the > /user/schu/.Trash directory: > {code} > bash-4.1$ cd .Trash/ > bash: cd: .Trash/: Permission denied > bash-4.1$ ls -la > total 2 > drwxr-xr-x 4 schu 2584148964 128 Jan 7 10:42 . > drwxr-xr-x 4 hdfs 2584148964 128 Jan 6 16:59 .. > drwx-- 2 schu 2584148964 64 Jan 7 10:45 .Trash > drwxr-xr-x 2 hdfs hdfs64 Jan 7 10:42 tt > bash-4.1$ ls .Trash > ls: cannot open directory .Trash: Permission denied > bash-4.1$ > {code} > When using FsShell as hdfs superuser, I have superuser permissions to schu's > .Trash contents: > {code} > bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu > -rw-r--r-- 1 schu supergroup 4 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu/tf1 > {code} > The NFSv3 logs don't produce any error when superuser tries to access schu > Trash contents. However, for other permission errors (e.g. schu tries to > delete a directory owned by hdfs), there will be a permission error in the > logs. > I think this is not specific to the .Trash directory perhaps. > I created a /user/schu/dir1 which has the same permissions as .Trash (700). > When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, > I get the same permission denied. > {code} > [schu@hdfs-nfs ~]$ hdfs dfs -ls > Found 4 items > drwx-- - schu supergroup 0 2014-01-07 10:57 .Trash > drwx-- - schu supergroup 0 2014-01-07 11:05 dir1 > -rw-r--r-- 1 schu supergroup 4 2014-01-07 11:05 tf1 > drwxr-xr-x - hdfs hdfs0 2014-01-07 10:42 tt > bash-4.1$ whoami > hdfs > bash-4.1$ pwd > /hdfs_nfs_mount/user/schu > bash-4.1$ cd dir1 > bash: cd: dir1: Permission denied > bash-4.1$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6488) Support HDFS superuser in NFSv3 gateway
[ https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6488: - Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) > Support HDFS superuser in NFSv3 gateway > --- > > Key: HDFS-6488 > URL: https://issues.apache.org/jira/browse/HDFS-6488 > Project: Hadoop HDFS > Issue Type: New Feature > Components: nfs >Affects Versions: 2.3.0 >Reporter: Stephen Chu >Assignee: Brandon Li > Attachments: HDFS-6488.001.patch, HDFS-6488.002.patch, > HDFS-6488.003.patch > > > As hdfs superuseruser on the NFS mount, I cannot cd or ls the > /user/schu/.Trash directory: > {code} > bash-4.1$ cd .Trash/ > bash: cd: .Trash/: Permission denied > bash-4.1$ ls -la > total 2 > drwxr-xr-x 4 schu 2584148964 128 Jan 7 10:42 . > drwxr-xr-x 4 hdfs 2584148964 128 Jan 6 16:59 .. > drwx-- 2 schu 2584148964 64 Jan 7 10:45 .Trash > drwxr-xr-x 2 hdfs hdfs64 Jan 7 10:42 tt > bash-4.1$ ls .Trash > ls: cannot open directory .Trash: Permission denied > bash-4.1$ > {code} > When using FsShell as hdfs superuser, I have superuser permissions to schu's > .Trash contents: > {code} > bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu > -rw-r--r-- 1 schu supergroup 4 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu/tf1 > {code} > The NFSv3 logs don't produce any error when superuser tries to access schu > Trash contents. However, for other permission errors (e.g. schu tries to > delete a directory owned by hdfs), there will be a permission error in the > logs. > I think this is not specific to the .Trash directory perhaps. > I created a /user/schu/dir1 which has the same permissions as .Trash (700). > When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, > I get the same permission denied. > {code} > [schu@hdfs-nfs ~]$ hdfs dfs -ls > Found 4 items > drwx-- - schu supergroup 0 2014-01-07 10:57 .Trash > drwx-- - schu supergroup 0 2014-01-07 11:05 dir1 > -rw-r--r-- 1 schu supergroup 4 2014-01-07 11:05 tf1 > drwxr-xr-x - hdfs hdfs0 2014-01-07 10:42 tt > bash-4.1$ whoami > hdfs > bash-4.1$ pwd > /hdfs_nfs_mount/user/schu > bash-4.1$ cd dir1 > bash: cd: dir1: Permission denied > bash-4.1$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6488) Support HDFS superuser in NFSv3 gateway
[ https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351115#comment-14351115 ] Brandon Li commented on HDFS-6488: -- Thank you, Stephen, Colin and Jing. I've updated the title and committed the patch. > Support HDFS superuser in NFSv3 gateway > --- > > Key: HDFS-6488 > URL: https://issues.apache.org/jira/browse/HDFS-6488 > Project: Hadoop HDFS > Issue Type: New Feature > Components: nfs >Affects Versions: 2.3.0 >Reporter: Stephen Chu >Assignee: Brandon Li > Attachments: HDFS-6488.001.patch, HDFS-6488.002.patch, > HDFS-6488.003.patch > > > As hdfs superuseruser on the NFS mount, I cannot cd or ls the > /user/schu/.Trash directory: > {code} > bash-4.1$ cd .Trash/ > bash: cd: .Trash/: Permission denied > bash-4.1$ ls -la > total 2 > drwxr-xr-x 4 schu 2584148964 128 Jan 7 10:42 . > drwxr-xr-x 4 hdfs 2584148964 128 Jan 6 16:59 .. > drwx-- 2 schu 2584148964 64 Jan 7 10:45 .Trash > drwxr-xr-x 2 hdfs hdfs64 Jan 7 10:42 tt > bash-4.1$ ls .Trash > ls: cannot open directory .Trash: Permission denied > bash-4.1$ > {code} > When using FsShell as hdfs superuser, I have superuser permissions to schu's > .Trash contents: > {code} > bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu > -rw-r--r-- 1 schu supergroup 4 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu/tf1 > {code} > The NFSv3 logs don't produce any error when superuser tries to access schu > Trash contents. However, for other permission errors (e.g. schu tries to > delete a directory owned by hdfs), there will be a permission error in the > logs. > I think this is not specific to the .Trash directory perhaps. > I created a /user/schu/dir1 which has the same permissions as .Trash (700). > When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, > I get the same permission denied. > {code} > [schu@hdfs-nfs ~]$ hdfs dfs -ls > Found 4 items > drwx-- - schu supergroup 0 2014-01-07 10:57 .Trash > drwx-- - schu supergroup 0 2014-01-07 11:05 dir1 > -rw-r--r-- 1 schu supergroup 4 2014-01-07 11:05 tf1 > drwxr-xr-x - hdfs hdfs0 2014-01-07 10:42 tt > bash-4.1$ whoami > hdfs > bash-4.1$ pwd > /hdfs_nfs_mount/user/schu > bash-4.1$ cd dir1 > bash: cd: dir1: Permission denied > bash-4.1$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6488) HDFS superuser unable to access user's Trash files using NFSv3 mount
[ https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6488: - Issue Type: New Feature (was: Bug) > HDFS superuser unable to access user's Trash files using NFSv3 mount > > > Key: HDFS-6488 > URL: https://issues.apache.org/jira/browse/HDFS-6488 > Project: Hadoop HDFS > Issue Type: New Feature > Components: nfs >Affects Versions: 2.3.0 >Reporter: Stephen Chu >Assignee: Brandon Li > Attachments: HDFS-6488.001.patch, HDFS-6488.002.patch, > HDFS-6488.003.patch > > > As hdfs superuseruser on the NFS mount, I cannot cd or ls the > /user/schu/.Trash directory: > {code} > bash-4.1$ cd .Trash/ > bash: cd: .Trash/: Permission denied > bash-4.1$ ls -la > total 2 > drwxr-xr-x 4 schu 2584148964 128 Jan 7 10:42 . > drwxr-xr-x 4 hdfs 2584148964 128 Jan 6 16:59 .. > drwx-- 2 schu 2584148964 64 Jan 7 10:45 .Trash > drwxr-xr-x 2 hdfs hdfs64 Jan 7 10:42 tt > bash-4.1$ ls .Trash > ls: cannot open directory .Trash: Permission denied > bash-4.1$ > {code} > When using FsShell as hdfs superuser, I have superuser permissions to schu's > .Trash contents: > {code} > bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu > -rw-r--r-- 1 schu supergroup 4 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu/tf1 > {code} > The NFSv3 logs don't produce any error when superuser tries to access schu > Trash contents. However, for other permission errors (e.g. schu tries to > delete a directory owned by hdfs), there will be a permission error in the > logs. > I think this is not specific to the .Trash directory perhaps. > I created a /user/schu/dir1 which has the same permissions as .Trash (700). > When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, > I get the same permission denied. > {code} > [schu@hdfs-nfs ~]$ hdfs dfs -ls > Found 4 items > drwx-- - schu supergroup 0 2014-01-07 10:57 .Trash > drwx-- - schu supergroup 0 2014-01-07 11:05 dir1 > -rw-r--r-- 1 schu supergroup 4 2014-01-07 11:05 tf1 > drwxr-xr-x - hdfs hdfs0 2014-01-07 10:42 tt > bash-4.1$ whoami > hdfs > bash-4.1$ pwd > /hdfs_nfs_mount/user/schu > bash-4.1$ cd dir1 > bash: cd: dir1: Permission denied > bash-4.1$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6488) Support HDFS superuser in NFSv3 gateway
[ https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6488: - Summary: Support HDFS superuser in NFSv3 gateway (was: HDFS superuser unable to access user's Trash files using NFSv3 mount) > Support HDFS superuser in NFSv3 gateway > --- > > Key: HDFS-6488 > URL: https://issues.apache.org/jira/browse/HDFS-6488 > Project: Hadoop HDFS > Issue Type: New Feature > Components: nfs >Affects Versions: 2.3.0 >Reporter: Stephen Chu >Assignee: Brandon Li > Attachments: HDFS-6488.001.patch, HDFS-6488.002.patch, > HDFS-6488.003.patch > > > As hdfs superuseruser on the NFS mount, I cannot cd or ls the > /user/schu/.Trash directory: > {code} > bash-4.1$ cd .Trash/ > bash: cd: .Trash/: Permission denied > bash-4.1$ ls -la > total 2 > drwxr-xr-x 4 schu 2584148964 128 Jan 7 10:42 . > drwxr-xr-x 4 hdfs 2584148964 128 Jan 6 16:59 .. > drwx-- 2 schu 2584148964 64 Jan 7 10:45 .Trash > drwxr-xr-x 2 hdfs hdfs64 Jan 7 10:42 tt > bash-4.1$ ls .Trash > ls: cannot open directory .Trash: Permission denied > bash-4.1$ > {code} > When using FsShell as hdfs superuser, I have superuser permissions to schu's > .Trash contents: > {code} > bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu > -rw-r--r-- 1 schu supergroup 4 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu/tf1 > {code} > The NFSv3 logs don't produce any error when superuser tries to access schu > Trash contents. However, for other permission errors (e.g. schu tries to > delete a directory owned by hdfs), there will be a permission error in the > logs. > I think this is not specific to the .Trash directory perhaps. > I created a /user/schu/dir1 which has the same permissions as .Trash (700). > When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, > I get the same permission denied. > {code} > [schu@hdfs-nfs ~]$ hdfs dfs -ls > Found 4 items > drwx-- - schu supergroup 0 2014-01-07 10:57 .Trash > drwx-- - schu supergroup 0 2014-01-07 11:05 dir1 > -rw-r--r-- 1 schu supergroup 4 2014-01-07 11:05 tf1 > drwxr-xr-x - hdfs hdfs0 2014-01-07 10:42 tt > bash-4.1$ whoami > hdfs > bash-4.1$ pwd > /hdfs_nfs_mount/user/schu > bash-4.1$ cd dir1 > bash: cd: dir1: Permission denied > bash-4.1$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6488) HDFS superuser unable to access user's Trash files using NFSv3 mount
[ https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351099#comment-14351099 ] Jing Zhao commented on HDFS-6488: - I think it should be ok to use a configuration prop to specify the nfs super user. The latest patch looks good to me. +1. > HDFS superuser unable to access user's Trash files using NFSv3 mount > > > Key: HDFS-6488 > URL: https://issues.apache.org/jira/browse/HDFS-6488 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Affects Versions: 2.3.0 >Reporter: Stephen Chu >Assignee: Brandon Li > Attachments: HDFS-6488.001.patch, HDFS-6488.002.patch, > HDFS-6488.003.patch > > > As hdfs superuseruser on the NFS mount, I cannot cd or ls the > /user/schu/.Trash directory: > {code} > bash-4.1$ cd .Trash/ > bash: cd: .Trash/: Permission denied > bash-4.1$ ls -la > total 2 > drwxr-xr-x 4 schu 2584148964 128 Jan 7 10:42 . > drwxr-xr-x 4 hdfs 2584148964 128 Jan 6 16:59 .. > drwx-- 2 schu 2584148964 64 Jan 7 10:45 .Trash > drwxr-xr-x 2 hdfs hdfs64 Jan 7 10:42 tt > bash-4.1$ ls .Trash > ls: cannot open directory .Trash: Permission denied > bash-4.1$ > {code} > When using FsShell as hdfs superuser, I have superuser permissions to schu's > .Trash contents: > {code} > bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu > -rw-r--r-- 1 schu supergroup 4 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu/tf1 > {code} > The NFSv3 logs don't produce any error when superuser tries to access schu > Trash contents. However, for other permission errors (e.g. schu tries to > delete a directory owned by hdfs), there will be a permission error in the > logs. > I think this is not specific to the .Trash directory perhaps. > I created a /user/schu/dir1 which has the same permissions as .Trash (700). > When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, > I get the same permission denied. > {code} > [schu@hdfs-nfs ~]$ hdfs dfs -ls > Found 4 items > drwx-- - schu supergroup 0 2014-01-07 10:57 .Trash > drwx-- - schu supergroup 0 2014-01-07 11:05 dir1 > -rw-r--r-- 1 schu supergroup 4 2014-01-07 11:05 tf1 > drwxr-xr-x - hdfs hdfs0 2014-01-07 10:42 tt > bash-4.1$ whoami > hdfs > bash-4.1$ pwd > /hdfs_nfs_mount/user/schu > bash-4.1$ cd dir1 > bash: cd: dir1: Permission denied > bash-4.1$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7893) Update the POM to create a separate hdfs-client jar
[ https://issues.apache.org/jira/browse/HDFS-7893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351094#comment-14351094 ] Hadoop QA commented on HDFS-7893: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703160/HDFS-7893.001.patch against trunk revision 27e8ea8. {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9786//console This message is automatically generated. > Update the POM to create a separate hdfs-client jar > --- > > Key: HDFS-7893 > URL: https://issues.apache.org/jira/browse/HDFS-7893 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-7893.000.patch, HDFS-7893.001.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7893) Update the POM to create a separate hdfs-client jar
[ https://issues.apache.org/jira/browse/HDFS-7893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-7893: - Attachment: HDFS-7893.001.patch > Update the POM to create a separate hdfs-client jar > --- > > Key: HDFS-7893 > URL: https://issues.apache.org/jira/browse/HDFS-7893 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-7893.000.patch, HDFS-7893.001.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351073#comment-14351073 ] Charles Lamb commented on HDFS-7844: [~cmccabe], This is a nice piece of work! Here are some comments: General: Several lines bust the 80 char limit. Many unused imports throughout. I guess Yi got this already. What happens if someone runs this with the -d32 to the jvm? Do we need to make that check and throw accordingly? ProbingHashSet.java: A small enhancement might be: {code}close(boolean force){code} which will close unconditionally. The line in #getSlot which is {code}hash = -hash{code} is in fact tested by your unit tests, but I don't think it's tested by design in the test. You might want to put in an explicit test for that particular line. #expandTable: using {code}catch(Throwable){code} feels like a rather wide net to cast, but I guess it's the right thing. I debated whether all you needed was catch (Error), but I guess you can't be sure that the callers above you won't just "keep going" after some RuntimeException gets into their hands. The comment for #capacity() "total number of slots" is either misleading or wrong. MemoryManager.java any reason not to have get/putShort along with the existing byte/int/long? Should #toString() be declared as {code}@Override public String toString(){code} NativeMemoryManager.java The comments say nothing about whether it's thread safe or not. Ditto for ByteArrayMemoryManager. ByteArrayMemoryManager There is no test coverage for the failure case of {code}BAMM.close(){code} s/valiation/validation/ (Yi caught this) Why does curAddress start at 1000? s/2^^31/2^31/ For all of the put/get/byte/int/long routines, it wouldn't be hard to move all of the {code}if() { throw new RuntimeException }{code} snippits into their own routine. Maybe that's not worth the trouble, but if feels like there's a lot of repeated code. TestMemoryManager.java The indentation of #testMemoryManagerCreate formals is messed up. #testCatchInvalidPuts: you test putByte against freed memory, but not int or long. the Assert.fail messages should be different for each fail() call. The exception checks in getByte/Int/Long are not tested. None of the entry==null exceptions are tested in putByte/Long/Int I tried running TestMemoryManager.testNativeMemoryManagerCreate and it failed like this: {code} 2015-03-06 17:10:22,430 ERROR offheap.MemoryManager$Factory (MemoryManager.java:create(91)) - Unable to create org.apache.hadoop.util.offheap.NativeMemoryManager. 
Falling back on org.apache.hadoop.util.offheap.ByteArrayMemoryManager java.lang.IllegalArgumentException: wrong number of arguments at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.util.offheap.MemoryManager$Factory.create(MemoryManager.java:89) at org.apache.hadoop.util.offheap.TestMemoryManager.testMemoryManagerCreate(TestMemoryManager.java:135) at org.apache.hadoop.util.offheap.TestMemoryManager.testNativeMemoryManagerCreate(TestMemoryManager.java:151) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) org.junit.ComparisonFailure: Expected :org.apache.hadoop.util.offheap.NativeMemoryManager Actual :org.apache.hadoop.util.offheap.ByteArrayMemoryManager at org.junit.Assert.assertEquals(Assert.java:115) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.util.offheap.TestMemoryManager.testMemoryManagerCreate(TestMemoryManager.java:137) at org.apache.hadoop.util.offheap.TestMemoryManager.testNativeMemoryManagerCreate(TestMemoryManager.java:151) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.De
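One of the smaller suggestions in the review above is to consolidate the repeated validation in the put/get byte/int/long methods into a single routine; a hypothetical sketch of that refactor (the field names base and size are illustrative, not ByteArrayMemoryManager's actual internals):
{code}
// Hypothetical sketch of the shared-validation idea from the review above.
class AddressRangeCheck {
  private final long base;
  private final long size;

  AddressRangeCheck(long base, long size) {
    this.base = base;
    this.size = size;
  }

  /** One place for the checks repeated across put/get byte, int and long. */
  void checkAccess(long addr, int len) {
    if (addr < base || addr + len > base + size) {
      throw new RuntimeException("invalid " + len + "-byte access at 0x"
          + Long.toHexString(addr));
    }
  }
}
{code}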
[jira] [Commented] (HDFS-7857) Incomplete information in WARN message caused user confusion
[ https://issues.apache.org/jira/browse/HDFS-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351061#comment-14351061 ] Jing Zhao commented on HDFS-7857: - +1. Thanks for the improvement, [~yzhangal]! > Incomplete information in WARN message caused user confusion > > > Key: HDFS-7857 > URL: https://issues.apache.org/jira/browse/HDFS-7857 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Labels: supportability > Attachments: HDFS-7857.001.patch > > > Lots of the following messages appeared in NN log: > {quote} > 2014-12-10 12:18:15,728 WARN SecurityLogger.org.apache.hadoop.ipc.Server: > Auth failed for :39838:null (DIGEST-MD5: IO error acquiring > password) > 2014-12-10 12:18:15,728 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > .. > SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for > :39843:null (DIGEST-MD5: IO error acquiring password) > 2014-12-10 12:18:15,790 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > {quote} > The real reason of failure is the second message about StandbyException, > However, the first message is confusing because it talks about "DIGEST-MD5: > IO error acquiring password". > Filing this jira to modify the first message to have more comprehensive > information that can be obtained from {{getCauseForInvalidToken(e)}}. > {code} >try { > saslResponse = processSaslMessage(saslMessage); > } catch (IOException e) { > rpcMetrics.incrAuthenticationFailures(); > // attempting user could be null > AUDITLOG.warn(AUTH_FAILED_FOR + this.toString() + ":" > + attemptingUser + " (" + e.getLocalizedMessage() + ")"); > throw (IOException) getCauseForInvalidToken(e); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
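For reference, one possible shape of that change, building directly on the snippet quoted in the description (illustrative only, not necessarily what the attached patch does):
{code}
// Illustrative only: surface the underlying cause (e.g. a StandbyException) in the
// audit log instead of only the DIGEST-MD5 wrapper message.
try {
  saslResponse = processSaslMessage(saslMessage);
} catch (IOException e) {
  rpcMetrics.incrAuthenticationFailures();
  IOException tokenCause = (IOException) getCauseForInvalidToken(e);
  // attempting user could be null
  AUDITLOG.warn(AUTH_FAILED_FOR + this.toString() + ":" + attemptingUser
      + " (" + e.getLocalizedMessage() + ") with true cause: ("
      + tokenCause.getLocalizedMessage() + ")");
  throw tokenCause;
}
{code}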
[jira] [Commented] (HDFS-7758) Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead
[ https://issues.apache.org/jira/browse/HDFS-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351058#comment-14351058 ] Colin Patrick McCabe commented on HDFS-7758: {code} /** * Returns a list of volume references. * * The caller must release the reference of each volume by calling * {@link FsVolumeReference#close}. */ public List getVolumeRefs(); /** Returns a reference of a given volume, specified by the index. */ public FsVolumeReference getVolumeRef(int idx) throws IOException; {code} This is still the wrong interface. {{getVolumeRef(int)}} encourages people to assume that the number of volumes is never going to change. What happens if it does? Instead of doing this, let's have an {{Iterator}} that we can use. Something like this: {code} public Iterator getVolumeRefIterator(); private static class FsVolumeRefIterator implements Iterator, Closeable { private final List list; private int idx = 0; FsVolumeRefIterator(List spiList) { this.list = new ArrayList(); for (FsVolumeSpi volume : spiList) { try { this.list.add(volume.obtainReference()); } catch (ClosedChannelException e) { LOG.info("Can't obtain a reference to {} because it is closed.", volume.getBasePath()); } } } @Override public boolean hasNext() { return (idx < list.size()); } @Override public FsVolumeRef next() { int i = idx++; return list.get(i); } @Override public void remove() { throw UnsupportedOperationException(); } @Override public void close() throws IOException { for (FsVolumeRef ref : list) { ref.close(); } list.clear(); } } {code} Then we can get rid of {{getVolumeRefs}} and {{getVolumeRef}}. Since the {{Iterator}} implements {{java.io.Closeable}}, findbugs will remind us that we need to close it (and free the refs) in any function we use it in. > Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead > - > > Key: HDFS-7758 > URL: https://issues.apache.org/jira/browse/HDFS-7758 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.6.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7758.000.patch, HDFS-7758.001.patch > > > HDFS-7496 introduced reference-counting the volume instances being used to > prevent race condition when hot swapping a volume. > However, {{FsDatasetSpi#getVolumes()}} can still leak the volume instance > without increasing its reference count. In this JIRA, we retire the > {{FsDatasetSpi#getVolumes()}} and propose {{FsDatasetSpi#getVolumeRefs()}} > and etc. method to access {{FsVolume}}. Thus it makes sure that the consumer > of {{FsVolume}} always has correct reference count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
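To make the proposal above concrete, here is a self-contained, typed rendering of the closeable-iterator sketch plus a caller-side usage example; all names and generics are illustrative, and the final FsDatasetSpi API may differ:
{code}
// Sketch only; VolumeRef stands in for FsVolumeReference.
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

interface VolumeRef extends Closeable {
  String basePath();
}

class ClosingVolumeRefIterator implements Iterator<VolumeRef>, Closeable {
  private final List<VolumeRef> refs;
  private int idx = 0;

  ClosingVolumeRefIterator(List<VolumeRef> acquiredRefs) {
    this.refs = new ArrayList<>(acquiredRefs);   // references already obtained
  }

  @Override
  public boolean hasNext() {
    return idx < refs.size();
  }

  @Override
  public VolumeRef next() {
    if (idx >= refs.size()) {
      throw new NoSuchElementException();
    }
    return refs.get(idx++);
  }

  @Override
  public void remove() {
    throw new UnsupportedOperationException();
  }

  @Override
  public void close() throws IOException {
    for (VolumeRef ref : refs) {                 // release every reference obtained
      ref.close();
    }
    refs.clear();
  }
}

// Caller side: try-with-resources guarantees the references are released even on an
// early return or exception, which is what findbugs can then verify.
//
//   try (ClosingVolumeRefIterator it = dataset.getVolumeRefIterator()) {
//     while (it.hasNext()) {
//       VolumeRef ref = it.next();
//       // ... use the volume behind ref ...
//     }
//   }
{code}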
[jira] [Commented] (HDFS-7857) Incomplete information in WARN message caused user confusion
[ https://issues.apache.org/jira/browse/HDFS-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351047#comment-14351047 ] Hadoop QA commented on HDFS-7857: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703135/HDFS-7857.001.patch against trunk revision d1abc5d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9784//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9784//console This message is automatically generated. > Incomplete information in WARN message caused user confusion > > > Key: HDFS-7857 > URL: https://issues.apache.org/jira/browse/HDFS-7857 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Labels: supportability > Attachments: HDFS-7857.001.patch > > > Lots of the following messages appeared in NN log: > {quote} > 2014-12-10 12:18:15,728 WARN SecurityLogger.org.apache.hadoop.ipc.Server: > Auth failed for :39838:null (DIGEST-MD5: IO error acquiring > password) > 2014-12-10 12:18:15,728 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > .. > SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for > :39843:null (DIGEST-MD5: IO error acquiring password) > 2014-12-10 12:18:15,790 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > {quote} > The real reason of failure is the second message about StandbyException, > However, the first message is confusing because it talks about "DIGEST-MD5: > IO error acquiring password". > Filing this jira to modify the first message to have more comprehensive > information that can be obtained from {{getCauseForInvalidToken(e)}}. > {code} >try { > saslResponse = processSaslMessage(saslMessage); > } catch (IOException e) { > rpcMetrics.incrAuthenticationFailures(); > // attempting user could be null > AUDITLOG.warn(AUTH_FAILED_FOR + this.toString() + ":" > + attemptingUser + " (" + e.getLocalizedMessage() + ")"); > throw (IOException) getCauseForInvalidToken(e); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7818) OffsetParam should return the default value instead of throwing NPE when the value is unspecified
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351045#comment-14351045 ] Hudson commented on HDFS-7818: -- FAILURE: Integrated in Hadoop-trunk-Commit #7275 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7275/]) HDFS-7818. OffsetParam should return the default value instead of throwing NPE when the value is unspecified. Contributed by Eric Payne. (wheat9: rev c79710302ee51e1a9ee17dadb161c69bb3aba5c9) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/TestParameterParser.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/OffsetParam.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/ParameterParser.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > OffsetParam should return the default value instead of throwing NPE when the > value is unspecified > - > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Fix For: 2.7.0 > > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt, HDFS-7818.v5.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7818) OffsetParam should return the default value instead of throwing NPE when the value is unspecified
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-7818: - Resolution: Fixed Fix Version/s: 2.7.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed the patch to trunk and branch-2. Thanks [~eepayne] for reporting and fixing the issue. > OffsetParam should return the default value instead of throwing NPE when the > value is unspecified > - > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Fix For: 2.7.0 > > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt, HDFS-7818.v5.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7818) OffsetParam should return the default value instead of throwing NPE when the value is unspecified
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-7818: - Summary: OffsetParam should return the default value instead of throwing NPE when the value is unspecified (was: DataNode throws NPE if the WebHdfs URL does not contain the offset parameter) > OffsetParam should return the default value instead of throwing NPE when the > value is unspecified > - > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt, HDFS-7818.v5.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7818) DataNode throws NPE if the WebHdfs URL does not contain the offset parameter
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351024#comment-14351024 ] Haohui Mai commented on HDFS-7818: -- +1. I'm committing this. > DataNode throws NPE if the WebHdfs URL does not contain the offset parameter > > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt, HDFS-7818.v5.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7853) Erasure coding: extend LocatedBlocks to support reading from striped files
[ https://issues.apache.org/jira/browse/HDFS-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351020#comment-14351020 ] Zhe Zhang commented on HDFS-7853: - Thanks for the fix Jing! The PoC test now works stably. > Erasure coding: extend LocatedBlocks to support reading from striped files > -- > > Key: HDFS-7853 > URL: https://issues.apache.org/jira/browse/HDFS-7853 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Zhe Zhang >Assignee: Jing Zhao > Attachments: HDFS-7853.000.patch, HDFS-7853.001.patch > > > We should extend {{LocatedBlocks}} class so {{getBlockLocations}} can work > with striping layout (possibly an extra list specifying the index of each > location in the group) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-7285: Attachment: HDFS-7285-initial-PoC.patch This is the patch from trunk that was used in the PoC test. It demonstrates the changes we have made to support basic I/O in striping layout. > Erasure Coding Support inside HDFS > -- > > Key: HDFS-7285 > URL: https://issues.apache.org/jira/browse/HDFS-7285 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Weihua Jiang >Assignee: Zhe Zhang > Attachments: ECAnalyzer.py, ECParser.py, HDFS-7285-initial-PoC.patch, > HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, > HDFSErasureCodingDesign-20150204.pdf, HDFSErasureCodingDesign-20150206.pdf, > fsimage-analysis-20150105.pdf > > > Erasure Coding (EC) can greatly reduce the storage overhead without sacrificing > data reliability, compared to the existing HDFS 3-replica approach. For > example, if we use a 10+4 Reed-Solomon coding, we can allow loss of 4 blocks, > with a storage overhead of only 40%. This makes EC a quite attractive > alternative for big data storage, particularly for cold data. > Facebook had a related open source project called HDFS-RAID. It used to be > one of the contrib packages in HDFS but has been removed since Hadoop 2.0 > for maintenance reasons. The drawbacks are: 1) it is on top of HDFS and depends > on MapReduce to do encoding and decoding tasks; 2) it can only be used for > cold files that are not intended to be appended anymore; 3) the pure Java EC > coding implementation is extremely slow in practical use. Due to these, it > might not be a good idea to just bring HDFS-RAID back. > We (Intel and Cloudera) are working on a design to build EC into HDFS that > gets rid of any external dependencies, makes it self-contained and > independently maintained. This design lays the EC feature on top of the storage type > support and keeps it compatible with existing HDFS features like caching, > snapshots, encryption, and high availability. This design will also > support different EC coding schemes, implementations and policies for > different deployment scenarios. By utilizing advanced libraries (e.g. the Intel > ISA-L library), an implementation can greatly improve the performance of EC > encoding/decoding and make the EC solution even more attractive. We will > post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7818) DataNode throws NPE if the WebHdfs URL does not contain the offset parameter
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351008#comment-14351008 ] Hadoop QA commented on HDFS-7818: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703065/HDFS-7818.v5.txt against trunk revision 95bfd08. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestFileTruncate The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9781//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9781//console This message is automatically generated. > DataNode throws NPE if the WebHdfs URL does not contain the offset parameter > > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt, HDFS-7818.v5.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350995#comment-14350995 ] Haohui Mai commented on HDFS-6200: -- Here is the list of dependency when I run {{mvn dependency:tree}} in {{hadoop-hdfs}}: {noformat} $ mvn dependency:tree|grep -v ":test" ... [INFO] --- maven-dependency-plugin:2.2:tree (default-cli) @ hadoop-hdfs --- [INFO] org.apache.hadoop:hadoop-hdfs:jar:3.0.0-SNAPSHOT [INFO] +- org.apache.hadoop:hadoop-annotations:jar:3.0.0-SNAPSHOT:provided [INFO] | \- jdk.tools:jdk.tools:jar:1.8:system [INFO] +- org.apache.hadoop:hadoop-auth:jar:3.0.0-SNAPSHOT:provided [INFO] | +- org.slf4j:slf4j-api:jar:1.7.10:provided [INFO] | +- org.apache.httpcomponents:httpclient:jar:4.2.5:provided [INFO] | | \- org.apache.httpcomponents:httpcore:jar:4.2.5:provided (version managed from 4.2.4) [INFO] | +- org.apache.directory.server:apacheds-kerberos-codec:jar:2.0.0-M15:provided [INFO] | | +- org.apache.directory.server:apacheds-i18n:jar:2.0.0-M15:provided [INFO] | | +- org.apache.directory.api:api-asn1-api:jar:1.0.0-M20:provided [INFO] | | \- org.apache.directory.api:api-util:jar:1.0.0-M20:provided [INFO] | +- org.apache.zookeeper:zookeeper:jar:3.4.6:provided [INFO] | \- org.apache.curator:curator-framework:jar:2.7.1:provided [INFO] +- org.apache.hadoop:hadoop-common:jar:3.0.0-SNAPSHOT:provided [INFO] | +- org.apache.commons:commons-math3:jar:3.1.1:provided [INFO] | +- commons-httpclient:commons-httpclient:jar:3.1:provided [INFO] | +- commons-net:commons-net:jar:3.1:provided [INFO] | +- commons-collections:commons-collections:jar:3.2.1:provided [INFO] | +- javax.servlet.jsp:jsp-api:jar:2.1:provided [INFO] | +- com.sun.jersey:jersey-json:jar:1.9:provided [INFO] | | +- org.codehaus.jettison:jettison:jar:1.1:provided [INFO] | | +- com.sun.xml.bind:jaxb-impl:jar:2.2.3-1:provided [INFO] | | | \- javax.xml.bind:jaxb-api:jar:2.2.2:provided [INFO] | | | +- javax.xml.stream:stax-api:jar:1.0-2:provided [INFO] | | | \- javax.activation:activation:jar:1.1:provided [INFO] | | +- org.codehaus.jackson:jackson-jaxrs:jar:1.9.13:provided (version managed from 1.8.3) [INFO] | | \- org.codehaus.jackson:jackson-xc:jar:1.9.13:provided (version managed from 1.8.3) [INFO] | +- net.java.dev.jets3t:jets3t:jar:0.9.0:provided [INFO] | | \- com.jamesmurty.utils:java-xmlbuilder:jar:0.4:provided [INFO] | +- commons-configuration:commons-configuration:jar:1.6:provided [INFO] | | +- commons-digester:commons-digester:jar:1.8:provided [INFO] | | | \- commons-beanutils:commons-beanutils:jar:1.7.0:provided [INFO] | | \- commons-beanutils:commons-beanutils-core:jar:1.8.0:provided [INFO] | +- org.apache.avro:avro:jar:1.7.4:provided [INFO] | | +- com.thoughtworks.paranamer:paranamer:jar:2.3:provided [INFO] | | \- org.xerial.snappy:snappy-java:jar:1.0.4.1:provided [INFO] | +- com.google.code.gson:gson:jar:2.2.4:provided [INFO] | +- com.jcraft:jsch:jar:0.1.42:provided [INFO] | +- org.apache.curator:curator-client:jar:2.7.1:provided [INFO] | +- org.apache.curator:curator-recipes:jar:2.7.1:provided [INFO] | \- org.apache.commons:commons-compress:jar:1.4.1:provided [INFO] | \- org.tukaani:xz:jar:1.0:provided [INFO] +- com.google.guava:guava:jar:11.0.2:compile [INFO] | \- com.google.code.findbugs:jsr305:jar:3.0.0:compile [INFO] +- org.mortbay.jetty:jetty:jar:6.1.26:compile [INFO] +- org.mortbay.jetty:jetty-util:jar:6.1.26:compile [INFO] +- com.sun.jersey:jersey-core:jar:1.9:compile [INFO] +- com.sun.jersey:jersey-server:jar:1.9:compile [INFO] | \- 
asm:asm:jar:3.2:compile (version managed from 3.1) [INFO] +- commons-cli:commons-cli:jar:1.2:compile [INFO] +- commons-codec:commons-codec:jar:1.4:compile [INFO] +- commons-io:commons-io:jar:2.4:compile [INFO] +- commons-lang:commons-lang:jar:2.6:compile [INFO] +- commons-logging:commons-logging:jar:1.1.3:compile [INFO] +- commons-daemon:commons-daemon:jar:1.0.13:compile [INFO] +- log4j:log4j:jar:1.2.17:compile [INFO] +- com.google.protobuf:protobuf-java:jar:2.5.0:compile [INFO] +- javax.servlet:servlet-api:jar:2.5:compile [INFO] +- org.slf4j:slf4j-log4j12:jar:1.7.10:provided [INFO] +- org.codehaus.jackson:jackson-core-asl:jar:1.9.13:compile [INFO] +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:compile [INFO] +- xmlenc:xmlenc:jar:0.52:compile [INFO] +- io.netty:netty-all:jar:4.0.23.Final:compile [INFO] +- xerces:xercesImpl:jar:2.9.1:compile [INFO] | \- xml-apis:xml-apis:jar:1.3.04:compile [INFO] +- org.apache.htrace:htrace-core:jar:3.1.0-incubating:compile [INFO] +- org.fusesource.leveldbjni:leveldbjni-all:jar:1.8:compile {noformat} As I mentioned earlier I plan to keep the dependency of {{hadoop-common}} / {{hadoop-auth}} for the first phase, which would allow us to get rid of the following dependency in the client jar: {noformat} [INFO] +- com.google.guava:guava:jar:11.0.2:compile [INFO] | \- com.googl
[jira] [Updated] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-7844: --- Attachment: HDFS-7844-scl.002.patch > Create an off-heap hash table implementation > > > Key: HDFS-7844 > URL: https://issues.apache.org/jira/browse/HDFS-7844 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7836 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-7844-scl.001.patch, HDFS-7844-scl.002.patch > > > Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6695) Investigate using Java 7's nonblocking file I/O in BlockReaderLocal to implement read timeouts
[ https://issues.apache.org/jira/browse/HDFS-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350939#comment-14350939 ] Colin Patrick McCabe commented on HDFS-6695: And sending an {{Interrupt}} to a thread that is reading using blocking I/O is "faked" by closing the FD. > Investigate using Java 7's nonblocking file I/O in BlockReaderLocal to > implement read timeouts > -- > > Key: HDFS-6695 > URL: https://issues.apache.org/jira/browse/HDFS-6695 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Colin Patrick McCabe > > In BlockReaderLocal, the "read" system call could block for a long time if > the disk drive is having problems, or there is a huge amount of I/O > contention. This might cause poor latency performance. > In the remote block readers, we have implemented a read timeout, but we don't > have one for the local block reader, since {{FileChannel#read}} doesn't > support this. > Once we move to JDK 7, we should investigate the {{java.nio.file}} > nonblocking file I/O package to see if it could be used to implement read > timeouts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6658) Namenode memory optimization - Block replicas list
[ https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350936#comment-14350936 ] Colin Patrick McCabe commented on HDFS-6658: [~clamb] and I have been discussing how to do block reports without backreferences. If you have a 64-bit epoch number per datanode, you can bump that on each FBR. Then, you can simply ignore block entries that are too old when you are accessing them. In that case, you don't need to remove all stale blocks during an FBR. The downside of this approach is that the memory for the old entries will linger for a while longer than it would have otherwise. But if the memory consumption per entry is lower, it's probably still a win. It's pretty rare for a large number of blocks to go away without being mentioned in incremental block reports (IBRs). In the case where all IBRs are being received normally, of course, you have no additional memory overhead at all since you delete entries as soon as you get the incremental block removal notification. And of course with the epoch-based approach, you avoid updating the three linked list entries each time you touch a block in the FBR. This should give much better cache locality (the linked list has basically no cache locality at all... we're hammering main memory pretty much all the time right now). This would probably be coupled with some kind of background scanner thread that removed stale blockinfo instances from the hash table. > Namenode memory optimization - Block replicas list > --- > > Key: HDFS-6658 > URL: https://issues.apache.org/jira/browse/HDFS-6658 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.4.1 >Reporter: Amir Langer >Assignee: Daryn Sharp > Attachments: BlockListOptimizationComparison.xlsx, BlocksMap > redesign.pdf, HDFS-6658.patch, Namenode Memory Optimizations - Block replicas > list.docx > > > Part of the memory consumed by every BlockInfo object in the Namenode is a > linked list of block references for every DatanodeStorageInfo (called > "triplets"). > We propose to change the way we store the list in memory. > Using primitive integer indexes instead of object references will reduce the > memory needed for every block replica (when compressed oops is disabled) and > in our new design the list overhead will be per DatanodeStorageInfo and not > per block replica. > see attached design doc. for details and evaluation results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
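A minimal sketch of the epoch idea under discussion, with illustrative names rather than actual BlockManager APIs (it also glosses over reports that are still in flight):
{code}
// Illustrative sketch of per-storage report epochs: the epoch advances when a full
// block report (FBR) finishes, each replica records the epoch at which it was last
// mentioned, and anything older than the last completed FBR is treated as stale
// instead of being unlinked eagerly during the report.
import java.util.HashMap;
import java.util.Map;

class StorageReportEpochSketch {
  private long completedReportEpoch;                              // last finished FBR
  private final Map<Long, Long> lastSeenEpoch = new HashMap<>();  // blockId -> epoch

  /** Called for every replica mentioned by the in-flight FBR or by an IBR. */
  void markReported(long blockId) {
    lastSeenEpoch.put(blockId, completedReportEpoch + 1);
  }

  /** Called once the FBR has been fully processed. */
  void finishFullBlockReport() {
    completedReportEpoch++;
  }

  /** Readers skip entries that the last completed FBR (or a later IBR) did not mention. */
  boolean isStale(long blockId) {
    Long epoch = lastSeenEpoch.get(blockId);
    return epoch == null || epoch < completedReportEpoch;
  }

  /** A background scanner can reclaim the stale entries lazily. */
  void pruneStale() {
    lastSeenEpoch.values().removeIf(e -> e < completedReportEpoch);
  }
}
{code}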
[jira] [Commented] (HDFS-7857) Incomplete information in WARN message caused user confusion
[ https://issues.apache.org/jira/browse/HDFS-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350928#comment-14350928 ] Yongjun Zhang commented on HDFS-7857: - Hi [~jingzhao], I submitted patch 001, would you please help taking a look when convenient? thanks a lot. > Incomplete information in WARN message caused user confusion > > > Key: HDFS-7857 > URL: https://issues.apache.org/jira/browse/HDFS-7857 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Labels: supportability > Attachments: HDFS-7857.001.patch > > > Lots of the following messages appeared in NN log: > {quote} > 2014-12-10 12:18:15,728 WARN SecurityLogger.org.apache.hadoop.ipc.Server: > Auth failed for :39838:null (DIGEST-MD5: IO error acquiring > password) > 2014-12-10 12:18:15,728 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > .. > SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for > :39843:null (DIGEST-MD5: IO error acquiring password) > 2014-12-10 12:18:15,790 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > {quote} > The real reason of failure is the second message about StandbyException, > However, the first message is confusing because it talks about "DIGEST-MD5: > IO error acquiring password". > Filing this jira to modify the first message to have more comprehensive > information that can be obtained from {{getCauseForInvalidToken(e)}}. > {code} >try { > saslResponse = processSaslMessage(saslMessage); > } catch (IOException e) { > rpcMetrics.incrAuthenticationFailures(); > // attempting user could be null > AUDITLOG.warn(AUTH_FAILED_FOR + this.toString() + ":" > + attemptingUser + " (" + e.getLocalizedMessage() + ")"); > throw (IOException) getCauseForInvalidToken(e); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7857) Incomplete information in WARN message caused user confusion
[ https://issues.apache.org/jira/browse/HDFS-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-7857: Status: Patch Available (was: Open) > Incomplete information in WARN message caused user confusion > > > Key: HDFS-7857 > URL: https://issues.apache.org/jira/browse/HDFS-7857 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Labels: supportability > Attachments: HDFS-7857.001.patch > > > Lots of the following messages appeared in NN log: > {quote} > 2014-12-10 12:18:15,728 WARN SecurityLogger.org.apache.hadoop.ipc.Server: > Auth failed for :39838:null (DIGEST-MD5: IO error acquiring > password) > 2014-12-10 12:18:15,728 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > .. > SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for > :39843:null (DIGEST-MD5: IO error acquiring password) > 2014-12-10 12:18:15,790 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > {quote} > The real reason of failure is the second message about StandbyException, > However, the first message is confusing because it talks about "DIGEST-MD5: > IO error acquiring password". > Filing this jira to modify the first message to have more comprehensive > information that can be obtained from {{getCauseForInvalidToken(e)}}. > {code} >try { > saslResponse = processSaslMessage(saslMessage); > } catch (IOException e) { > rpcMetrics.incrAuthenticationFailures(); > // attempting user could be null > AUDITLOG.warn(AUTH_FAILED_FOR + this.toString() + ":" > + attemptingUser + " (" + e.getLocalizedMessage() + ")"); > throw (IOException) getCauseForInvalidToken(e); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7857) Incomplete information in WARN message caused user confusion
[ https://issues.apache.org/jira/browse/HDFS-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-7857: Attachment: HDFS-7857.001.patch > Incomplete information in WARN message caused user confusion > > > Key: HDFS-7857 > URL: https://issues.apache.org/jira/browse/HDFS-7857 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Labels: supportability > Attachments: HDFS-7857.001.patch > > > Lots of the following messages appeared in NN log: > {quote} > 2014-12-10 12:18:15,728 WARN SecurityLogger.org.apache.hadoop.ipc.Server: > Auth failed for :39838:null (DIGEST-MD5: IO error acquiring > password) > 2014-12-10 12:18:15,728 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > .. > SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for > :39843:null (DIGEST-MD5: IO error acquiring password) > 2014-12-10 12:18:15,790 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > {quote} > The real reason of failure is the second message about StandbyException, > However, the first message is confusing because it talks about "DIGEST-MD5: > IO error acquiring password". > Filing this jira to modify the first message to have more comprehensive > information that can be obtained from {{getCauseForInvalidToken(e)}}. > {code} >try { > saslResponse = processSaslMessage(saslMessage); > } catch (IOException e) { > rpcMetrics.incrAuthenticationFailures(); > // attempting user could be null > AUDITLOG.warn(AUTH_FAILED_FOR + this.toString() + ":" > + attemptingUser + " (" + e.getLocalizedMessage() + ")"); > throw (IOException) getCauseForInvalidToken(e); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7836) BlockManager Scalability Improvements
[ https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350878#comment-14350878 ] Colin Patrick McCabe commented on HDFS-7836: I don't think the block map size is really that easy to get at right now. Taking a heap dump on a big NN can take minutes... it's not something most sysadmins will let you do. And the analysis is difficult... a lot of common heap analysis tools require tons of memory. Anyway, we should probably add a JMX counter for the size(s) of the block map hash tables, and the number of entries, for tracking purposes. > BlockManager Scalability Improvements > - > > Key: HDFS-7836 > URL: https://issues.apache.org/jira/browse/HDFS-7836 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Charles Lamb >Assignee: Charles Lamb > Attachments: BlockManagerScalabilityImprovementsDesign.pdf > > > Improvements to BlockManager scalability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
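For readers who want the shape of the suggested counter, here is a plain-JMX sketch; the interface, class, and ObjectName are all invented for illustration (a real patch would presumably plug into the NameNode's existing metrics system, and the two public types would live in separate files):
{code:java}
import java.lang.management.ManagementFactory;
import javax.management.ObjectName;

public interface BlockMapStatsMXBean {
  long getBlockMapCapacity();   // number of hash table slots
  long getBlockMapEntries();    // number of block entries currently stored
}

public class BlockMapStats implements BlockMapStatsMXBean {
  private volatile long capacity;
  private volatile long entries;

  @Override public long getBlockMapCapacity() { return capacity; }
  @Override public long getBlockMapEntries() { return entries; }

  /** Called by the block map whenever it resizes or its entry count changes. */
  public void update(long capacity, long entries) {
    this.capacity = capacity;
    this.entries = entries;
  }

  /** Register so the sizes are visible over JMX without taking a heap dump. */
  public void register() throws Exception {
    ManagementFactory.getPlatformMBeanServer().registerMBean(
        this, new ObjectName("Hadoop:service=NameNode,name=BlockMapStats"));
  }
}
{code}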
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350869#comment-14350869 ] Colin Patrick McCabe commented on HDFS-6450: I think it would be possible to support hedged non-positional reads in {{BlockReaderLocal}}, but difficult. First we would have to stop re-using the same FD for all instances of a BlockReaderLocal that were reading the same replica. Perhaps we could use dup to create a new FD per blockreader without doing multiple opens. Then we could close the blockreader FD if the local read were being slow. I think it's much easier to just implement hedged non-positional reads in the erasure coding-specific subclass of DFSInputStream. I also think we may want to create a base class for DFSInputStream that both the raid and the non-raid code path inherit from. Inheriting from the non-raid code path is weird because there is a lot of stuff that is not relevant. > Support non-positional hedged reads in HDFS > --- > > Key: HDFS-6450 > URL: https://issues.apache.org/jira/browse/HDFS-6450 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.4.0 >Reporter: Colin Patrick McCabe >Assignee: Liang Xie > Attachments: HDFS-6450-like-pread.txt > > > HDFS-5776 added support for hedged positional reads. We should also support > hedged non-position reads (aka regular reads). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350828#comment-14350828 ] Colin Patrick McCabe commented on HDFS-7844: bq. Open addressing (probing) is used for Hash table in the patch. Now the load factor is 0.5, not sure whether we can still get similar performance if we choose a little bigger value (say 0.7, certainly it should be less than 1). Bigger loadfactor will increase the collision, but will save lots of memory. I don't have the performance data, but I see other hash table implementations using open addressing choose 0.7 ~ 0.75 as loadfactor. According to wikipedia (see http://en.wikipedia.org/wiki/Hash_table ), "open addressing" is where you store the records in the hash table itself. This is different than probing, which is when you don't have a linked list per slot, but simply find another slot when collisions occur. Open addressing does require probing, but probing does not require open addressing. This hash table uses probing, but open addressing is optional. In the test code I wrote, open addressing is not used (entries are not stored in the hash table itself... only pointers to entries are stored). Partly this is because I can make the hash table larger that way. In the block manager, we should not use open addressing, because BlockInfo structures are going to have variable size due to the variable replication factors. I think you are correct, though, that we could use a higher load factor than 0.5. How well it will work will depend on a few things. One very important thing is the quality of the hash function. We need good dispersion to avoid clustering and non-constant behavior. bq. How about write it as a configuration and default value is 0.5? Users don't need to change the default value it if they have big memory, but if the memory is limit? Good idea bq. in ProbingHashSet#getInternal ... We should call return null \[when slot == originalSlot\]. (Actually we will never reach there since it's at most half full) Fixed bq. ProbingHashSet#maintainCompactness... Yeah. I looked at this again and it was broken. I think what we should do instead is just call {{putInternal}} on each element with {{overwrite = false}}. Then if we find a key equal to the current one, we know that the element is already in the right slot. bq. Currently even we use slf4j, but in someplace we still do something like Long.toHexString(addr) and it will affect little performance. Can we check the log level in those places? I hate to put all those if statements in, but I think you're probably right. I also fixed one or two cases where I wasn't calling {{Long.toHexString}} on an address. I wish slf4j supported formatting strings. bq. Unnessary import in ProbingHashSet, MemoryManager Fixed bq. 6. typos. Fixed > Create an off-heap hash table implementation > > > Key: HDFS-7844 > URL: https://issues.apache.org/jira/browse/HDFS-7844 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7836 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-7844-scl.001.patch > > > Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
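To make the probing-versus-open-addressing distinction above concrete, here is a tiny on-heap model of the lookup path, with the wrap-around termination the review comment asked for; class and field names are illustrative, and the real patch operates on off-heap memory rather than an Object[]:
{code:java}
class ProbingSet<E> {
  private final Object[] slots;   // stores references to entries, not the entries
                                  // themselves, i.e. probing without open addressing

  ProbingSet(int capacity) {      // capacity is kept well above size / loadFactor
    this.slots = new Object[capacity];
  }

  @SuppressWarnings("unchecked")
  E get(E key) {
    final int originalSlot = (key.hashCode() & 0x7fffffff) % slots.length;
    int slot = originalSlot;
    do {
      Object e = slots[slot];
      if (e == null) {
        return null;                      // empty slot: the key is not present
      }
      if (e.equals(key)) {
        return (E) e;
      }
      slot = (slot + 1) % slots.length;   // linear probe to the next slot
    } while (slot != originalSlot);       // wrapped all the way around: not present
    return null;
  }
}
{code}
The logging point in the review maps to guards of the form {{if (LOG.isTraceEnabled())}} around the {{Long.toHexString(addr)}} calls, so the hex formatting only runs when trace logging is actually enabled.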
[jira] [Commented] (HDFS-7875) Improve log message when wrong value configured for dfs.datanode.failed.volumes.tolerated
[ https://issues.apache.org/jira/browse/HDFS-7875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350793#comment-14350793 ] Allen Wittenauer commented on HDFS-7875: Let's put a space in between the . and Value. There is also extraneous space at the end of that line. > Improve log message when wrong value configured for > dfs.datanode.failed.volumes.tolerated > -- > > Key: HDFS-7875 > URL: https://issues.apache.org/jira/browse/HDFS-7875 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: nijel >Assignee: nijel >Priority: Trivial > Attachments: 0001-HDFS-7875.patch, 0002-HDFS-7875.patch > > > By mistake I configured dfs.datanode.failed.volumes.tolerated equal to the > number of volumes configured. Got stuck for some time in debugging since the > log message didn't give much detail. > The log message could be more detailed. Added a patch with a change in the message. > Please have a look. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
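As a rough illustration of the kind of check and message under discussion (the exact wording is whatever the attached patch settles on; the variable names mirror the values the DataNode computes when scanning its data directories):
{code:java}
if (volFailuresTolerated < 0 || volFailuresTolerated >= volsConfigured) {
  throw new DiskErrorException("Invalid value configured for "
      + "dfs.datanode.failed.volumes.tolerated - " + volFailuresTolerated
      + ". Value configured is either less than 0 or >= "
      + "to the number of configured volumes (" + volsConfigured + ").");
}
{code}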
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350786#comment-14350786 ] Alejandro Abdelnur commented on HDFS-6200: -- Haohui, Could you please list the actual set of dependencies the hdfs-client will carry? > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contain both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-cliient, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350783#comment-14350783 ] Haohui Mai edited comment on HDFS-6200 at 3/6/15 7:41 PM: -- Thanks tucu. Just to clarify -- I'm not trashing the classloader solution, I agree that it has its own values on yarn/mr side. I don't see them as competing solutions, they provide values in different use cases. I think we don't need to mix the two issues. was (Author: wheat9): Thanks touch. Just to clarify -- I'm not trashing the classloader solution, I agree that it has its own values on yarn/mr side. I don't see them as competing solutions, they provide values in different use cases. I think we don't need to mix the two issues. > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contain both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-cliient, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350783#comment-14350783 ] Haohui Mai commented on HDFS-6200: -- Thanks touch. Just to clarify -- I'm not trashing the classloader solution, I agree that it has its own values on yarn/mr side. I don't see them as competing solutions, they provide values in different use cases. I think we don't need to mix the two issues. > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contain both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-cliient, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350774#comment-14350774 ] Alejandro Abdelnur commented on HDFS-6200: -- Haohui, Doing what hadoop-client wont solve the problems you want to tackle, it will just remove the JARs used on the HDFS server side only. If you just care about those server side dependencies, hadoop-client should be enough and you could exclude YARN/MR artifacts in your dependency. If you want take care of guava, commons-*, etc, etc, you'll need to classloader magic for the filesystem impls, and this should be done in common where the Hadoop FileSystem API lives so all Hadoop FileSystem implementations get this kind of isolation. > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contain both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-cliient, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7893) Update the POM to create a separate hdfs-client jar
[ https://issues.apache.org/jira/browse/HDFS-7893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350759#comment-14350759 ] Jing Zhao commented on HDFS-7893: - The patch looks good to me. Some comments: # maybe we should call it "Apache Hadoop HDFS Client"? {code} + Apache Hadoop HDFS + Apache Hadoop HDFS {code} # we can also add this dependency to hadoop-hdfs-nfs. > Update the POM to create a separate hdfs-client jar > --- > > Key: HDFS-7893 > URL: https://issues.apache.org/jira/browse/HDFS-7893 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-7893.000.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7435) PB encoding of block reports is very inefficient
[ https://issues.apache.org/jira/browse/HDFS-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350769#comment-14350769 ] Hadoop QA commented on HDFS-7435: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12702905/HDFS-7435.patch against trunk revision 95bfd08. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 12 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.TestReadOnlySharedStorage org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer org.apache.hadoop.hdfs.TestSetrepIncreasing org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA org.apache.hadoop.hdfs.TestDecommission org.apache.hadoop.hdfs.server.balancer.TestBalancer org.apache.hadoop.hdfs.server.datanode.TestSimulatedFSDataset The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9776//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9776//console This message is automatically generated. > PB encoding of block reports is very inefficient > > > Key: HDFS-7435 > URL: https://issues.apache.org/jira/browse/HDFS-7435 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Critical > Attachments: HDFS-7435.000.patch, HDFS-7435.001.patch, > HDFS-7435.002.patch, HDFS-7435.patch, HDFS-7435.patch, HDFS-7435.patch, > HDFS-7435.patch, HDFS-7435.patch, HDFS-7435.patch > > > Block reports are encoded as a PB repeating long. Repeating fields use an > {{ArrayList}} with default capacity of 10. A block report containing tens or > hundreds of thousand of longs (3 for each replica) is extremely expensive > since the {{ArrayList}} must realloc many times. Also, decoding repeating > fields will box the primitive longs which must then be unboxed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
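A toy, self-contained illustration of the cost pattern described in the HDFS-7435 summary above: decoding a repeated PB long into an {{ArrayList<Long>}} boxes every value and regrows from the default capacity of 10, whereas a report of N replicas is naturally 3*N primitive longs that fit one pre-sized array (the numbers and class name are made up for the demo):
{code:java}
import java.util.ArrayList;
import java.util.List;

public class BlockReportEncodingDemo {
  public static void main(String[] args) {
    final int replicas = 100000;

    // What decoding a repeated long field effectively does today.
    List<Long> boxed = new ArrayList<Long>();     // starts at capacity 10
    for (long i = 0; i < 3L * replicas; i++) {
      boxed.add(i);                               // autoboxing + periodic realloc/copy
    }

    // The shape a primitive-friendly encoding can decode into.
    long[] primitive = new long[3 * replicas];    // one allocation, no boxing
    for (int i = 0; i < primitive.length; i++) {
      primitive[i] = i;
    }

    System.out.println(boxed.size() + " boxed vs "
        + primitive.length + " primitive longs");
  }
}
{code}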
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350765#comment-14350765 ] Haohui Mai commented on HDFS-6200: -- bq. Placing hadoop-hdfs-client as a dependency of hadoop-hdfs sets up a relationship that we'll have to adjust in the future if we e.g. decide that shading the third-party dependencies of hadoop-hdfs-client is the way to go. Don't you agree we need a client jar? I see you point. This jira, however, is about creating the client jar. Everything below the client jar is implementation detail. I don't think it need to be mixed with this jira. bq. Personally, I think having things stay where they are and using maven to build the client artifact will be the easiest to maintain I don't agree. We did that for {{hadoop-client}}, which is available today. You're more than welcome to contribute and to clean things up. We've been hit really hard on resolving dependency conflicts in Oozie (which uses tomcat's classloader), Ranger (depends on different version of jersey-server), Spark (has a conflicting version of asm). We need a client jar whose dependency can be carefully and explicitly managed. > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contain both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-cliient, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350765#comment-14350765 ] Haohui Mai edited comment on HDFS-6200 at 3/6/15 7:27 PM: -- bq. Placing hadoop-hdfs-client as a dependency of hadoop-hdfs sets up a relationship that we'll have to adjust in the future if we e.g. decide that shading the third-party dependencies of hadoop-hdfs-client is the way to go. Don't you agree we need a client jar? I see you point. This jira, however, is about creating the client jar. Everything below the client jar is implementation detail. I don't think it need to be mixed with this jira. bq. Personally, I think having things stay where they are and using maven to build the client artifact will be the easiest to maintain I don't agree. We did that for {{hadoop-client}}, which is available today. You're more than welcome to contribute and to clean things up. We've been hit really hard on resolving dependency conflicts in Oozie (which uses tomcat's classloader), Ranger (depends on different version of jersey-server), Spark (has a conflicting version of asm). A clean solution to fix all the problems is appreciated. was (Author: wheat9): bq. Placing hadoop-hdfs-client as a dependency of hadoop-hdfs sets up a relationship that we'll have to adjust in the future if we e.g. decide that shading the third-party dependencies of hadoop-hdfs-client is the way to go. Don't you agree we need a client jar? I see you point. This jira, however, is about creating the client jar. Everything below the client jar is implementation detail. I don't think it need to be mixed with this jira. bq. Personally, I think having things stay where they are and using maven to build the client artifact will be the easiest to maintain I don't agree. We did that for {{hadoop-client}}, which is available today. You're more than welcome to contribute and to clean things up. We've been hit really hard on resolving dependency conflicts in Oozie (which uses tomcat's classloader), Ranger (depends on different version of jersey-server), Spark (has a conflicting version of asm). We need a client jar whose dependency can be carefully and explicitly managed. > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contain both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-cliient, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7854) Separate class DataStreamer out of DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350764#comment-14350764 ] Hadoop QA commented on HDFS-7854: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703032/HDFS-7854-002.patch against trunk revision 24db081. {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9783//console This message is automatically generated. > Separate class DataStreamer out of DFSOutputStream > -- > > Key: HDFS-7854 > URL: https://issues.apache.org/jira/browse/HDFS-7854 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Li Bo >Assignee: Li Bo > Attachments: HDFS-7854-001.patch, HDFS-7854-002.patch > > > This sub task separate DataStreamer from DFSOutputStream. New DataStreamer > will accept packets and write them to remote datanodes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7261) storageMap is accessed without synchronization in DatanodeDescriptor#updateHeartbeatState()
[ https://issues.apache.org/jira/browse/HDFS-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350760#comment-14350760 ] Hadoop QA commented on HDFS-7261: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703042/HDFS-7261-001.patch against trunk revision 24db081. {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9782//console This message is automatically generated. > storageMap is accessed without synchronization in > DatanodeDescriptor#updateHeartbeatState() > --- > > Key: HDFS-7261 > URL: https://issues.apache.org/jira/browse/HDFS-7261 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ted Yu >Assignee: Brahma Reddy Battula > Attachments: HDFS-7261-001.patch, HDFS-7261.patch > > > Here is the code: > {code} > failedStorageInfos = new HashSet( > storageMap.values()); > {code} > In other places, the lock on "DatanodeDescriptor.storageMap" is held: > {code} > synchronized (storageMap) { > final Collection storages = storageMap.values(); > return storages.toArray(new DatanodeStorageInfo[storages.size()]); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
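A minimal sketch of the fix the description implies for {{DatanodeDescriptor#updateHeartbeatState()}}: copy the values under the same lock the other accessors already take (types as quoted above):
{code:java}
Set<DatanodeStorageInfo> failedStorageInfos;
synchronized (storageMap) {
  failedStorageInfos = new HashSet<DatanodeStorageInfo>(storageMap.values());
}
{code}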
[jira] [Commented] (HDFS-7885) Datanode should not trust the generation stamp provided by client
[ https://issues.apache.org/jira/browse/HDFS-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350732#comment-14350732 ] Hudson commented on HDFS-7885: -- SUCCESS: Integrated in Hadoop-trunk-Commit #7271 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7271/]) HDFS-7885. Datanode should not trust the generation stamp provided by client. Contributed by Tsz Wo Nicholas Sze. (jing9: rev 24db0812be64e83a48ade01fc1eaaeaedad4dec0) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocalLegacy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > Datanode should not trust the generation stamp provided by client > - > > Key: HDFS-7885 > URL: https://issues.apache.org/jira/browse/HDFS-7885 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.2.0 >Reporter: vitthal (Suhas) Gogate >Assignee: Tsz Wo Nicholas Sze >Priority: Critical > Fix For: 2.7.0 > > Attachments: h7885_20150305.patch, h7885_20150306.patch > > > Datanode should not trust the generation stamp provided by client, since it > is prefetched and buffered in client, and concurrent append may increase it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7894) Rolling upgrade readiness is not updated in jmx until query command is issued.
[ https://issues.apache.org/jira/browse/HDFS-7894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350728#comment-14350728 ] Kihwal Lee commented on HDFS-7894: -- It won't work because of {{checkSuperuserPrivilege()}} and {{checkOperation()}}. What do you think about something like following? I didn't try to compile or test this code. Adding a test case would be nice, if possible. {code:java} if (!isRollingUpgrade()) { return null; // this is the common case. } readLock(); // check again after acquiring the read lock. RollingUpgradeInfo upgradeInfo = getRollingUpgradeInfo(); if (upgradeInfo == null) { return null; } try { boolean hasRollbackImage = this.getFSImage().hasRollbackFSImage(); upgradeInfo.setCreatedRollbackImages(hasRollbackImage); } finally { readUnlock(); } return new RollingUpgradeInfo.Bean(upgradeInfo); {code} > Rolling upgrade readiness is not updated in jmx until query command is issued. > -- > > Key: HDFS-7894 > URL: https://issues.apache.org/jira/browse/HDFS-7894 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Brahma Reddy Battula >Priority: Critical > Attachments: HDFS-7894.patch > > > When a hdfs rolling upgrade is started and a rollback image is > created/uploaded, the active NN does not update its {{rollingUpgradeInfo}} > until it receives a query command via RPC. This results in inconsistent info > being showing up in the web UI and its jmx page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
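One detail worth noting in the sketch above: if {{getRollingUpgradeInfo()}} returns null after {{readLock()}} has been taken, that early return skips {{readUnlock()}}. A slightly rearranged version of the same (equally untested) sketch keeps every exit inside the try/finally:
{code:java}
if (!isRollingUpgrade()) {
  return null; // this is the common case.
}
readLock();
try {
  // check again after acquiring the read lock.
  RollingUpgradeInfo upgradeInfo = getRollingUpgradeInfo();
  if (upgradeInfo == null) {
    return null;
  }
  boolean hasRollbackImage = getFSImage().hasRollbackFSImage();
  upgradeInfo.setCreatedRollbackImages(hasRollbackImage);
  return new RollingUpgradeInfo.Bean(upgradeInfo);
} finally {
  readUnlock();
}
{code}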
[jira] [Commented] (HDFS-7886) TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes
[ https://issues.apache.org/jira/browse/HDFS-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350718#comment-14350718 ] Konstantin Shvachko commented on HDFS-7886: --- Forgot to mention: 3. We should keep SEED = 100. 4. Add replica printout to the assert in {{BlockListAsLongs}}, which helps debugging and does not affect runtime {{"Must be under-construction replica: " + r;}} > TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes > > > Key: HDFS-7886 > URL: https://issues.apache.org/jira/browse/HDFS-7886 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.0 >Reporter: Yi Liu >Assignee: Plamen Jeliazkov >Priority: Minor > Attachments: HDFS-7886.patch > > > https://builds.apache.org/job/PreCommit-HDFS-Build/9730//testReport/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7885) Datanode should not trust the generation stamp provided by client
[ https://issues.apache.org/jira/browse/HDFS-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-7885: Resolution: Fixed Fix Version/s: 2.7.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks for the fix, Nicholas! I've committed this to trunk and branch-2. > Datanode should not trust the generation stamp provided by client > - > > Key: HDFS-7885 > URL: https://issues.apache.org/jira/browse/HDFS-7885 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.2.0 >Reporter: vitthal (Suhas) Gogate >Assignee: Tsz Wo Nicholas Sze >Priority: Critical > Fix For: 2.7.0 > > Attachments: h7885_20150305.patch, h7885_20150306.patch > > > Datanode should not trust the generation stamp provided by client, since it > is prefetched and buffered in client, and concurrent append may increase it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7885) Datanode should not trust the generation stamp provided by client
[ https://issues.apache.org/jira/browse/HDFS-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350712#comment-14350712 ] Jing Zhao commented on HDFS-7885: - The latest patch looks good to me. The failed tests should be unrelated. +1. I will commit it shortly. > Datanode should not trust the generation stamp provided by client > - > > Key: HDFS-7885 > URL: https://issues.apache.org/jira/browse/HDFS-7885 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.2.0 >Reporter: vitthal (Suhas) Gogate >Assignee: Tsz Wo Nicholas Sze >Priority: Critical > Attachments: h7885_20150305.patch, h7885_20150306.patch > > > Datanode should not trust the generation stamp provided by client, since it > is prefetched and buffered in client, and concurrent append may increase it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350711#comment-14350711 ] Sean Busbey commented on HDFS-6200: --- As I mentioned earlier, the dependencies your client artifact brings with it is a defining part of the interface you are exposing downstream applications to. That means we need the ability to manipulate those dependencies, even if we're only going to do so at a later date. Placing hadoop-hdfs-client as a dependency of hadoop-hdfs sets up a relationship that we'll have to adjust in the future if we e.g. decide that shading the third-party dependencies of hadoop-hdfs-client is the way to go. I only mention the internal artifact as an alternative if having DFSClient live in hadoop-hdfs is undesirable. Personally, I think having things stay where they are and using maven to build the client artifact will be the easiest to maintain. However, there might be other mitigating factors I'm not aware of that make breaking the code into a new module desirable. > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contain both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-cliient, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350684#comment-14350684 ] Haohui Mai edited comment on HDFS-6200 at 3/6/15 6:45 PM: -- bq. For one, we don't have to worry about what dependencies we bring with us in the internal case because by definition we're in control of both the client interface and the place it's being used. bq. In the approach I'm suggesting the original code for the client would still live in hadoop-hdfs, so the webhdfs server would be free to use on DFSClient. If that is unappealing for some reason, perhaps we should structure things with an internal client artifact. e.g. What about (1) hiding implementation in local package when possible? (2) marking it as private class as what we did today when the previous option is unavailable? I don't think it is the time to create yet another artifact right now. There are quite a bit of overheads associated with it. I'm yet to see this is justified. If it is indeed required we can do it after hdfs-client is separated out. was (Author: wheat9): bq. For one, we don't have to worry about what dependencies we bring with us in the internal case because by definition we're in control of both the client interface and the place it's being used. bq. In the approach I'm suggesting the original code for the client would still live in hadoop-hdfs, so the webhdfs server would be free to use on DFSClient. If that is unappealing for some reason, perhaps we should structure things with an internal client artifact. e.g. What is the point of creating yet another internal jar if you can simply hide {{DFSClient}} in local package? > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contain both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-cliient, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7896) HDFS Slow disk detection
[ https://issues.apache.org/jira/browse/HDFS-7896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350697#comment-14350697 ] Chris Nauroth commented on HDFS-7896: - bq. Chris Nauroth recently added failed volume reporting via HDFS-7604. Ideally we can extend that reporting infrastructure. Yes, I think that will work. We can add slow disk information to the {{VolumeFailureSummaryProto}} message. That will ride along in heartbeats, and we can add corresponding metrics and web UI fields. > HDFS Slow disk detection > > > Key: HDFS-7896 > URL: https://issues.apache.org/jira/browse/HDFS-7896 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Arpit Agarwal > > HDFS should detect slow disks. To start with we can flag this information via > the NameNode web UI. Alternatively DNs can avoid using slow disks for writes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
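As a purely illustrative sketch of local detection (independent of the heartbeat/JMX plumbing discussed above), a DataNode could keep an exponentially weighted moving average of per-volume I/O latency and flag a volume that is far slower than its peers; the class, threshold, and smoothing factor below are all invented for the example:
{code:java}
import java.util.HashMap;
import java.util.Map;

public class SlowVolumeTracker {
  private static final double ALPHA = 0.1;         // EWMA smoothing factor
  private static final double SLOW_FACTOR = 3.0;   // "slow" = 3x the peer average

  private final Map<String, Double> ewmaLatencyMs = new HashMap<String, Double>();

  /** Record one observed I/O latency (in ms) for a volume. */
  public synchronized void record(String volume, double latencyMs) {
    Double prev = ewmaLatencyMs.get(volume);
    double next = (prev == null) ? latencyMs : prev + ALPHA * (latencyMs - prev);
    ewmaLatencyMs.put(volume, next);
  }

  /** True if this volume's smoothed latency is far above the average of the others. */
  public synchronized boolean isSlow(String volume) {
    Double own = ewmaLatencyMs.get(volume);
    if (own == null || ewmaLatencyMs.size() < 2) {
      return false;
    }
    double peerSum = 0;
    for (Map.Entry<String, Double> e : ewmaLatencyMs.entrySet()) {
      if (!e.getKey().equals(volume)) {
        peerSum += e.getValue();
      }
    }
    double peerAvg = peerSum / (ewmaLatencyMs.size() - 1);
    return own > SLOW_FACTOR * peerAvg;
  }
}
{code}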
[jira] [Commented] (HDFS-7886) TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes
[ https://issues.apache.org/jira/browse/HDFS-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350686#comment-14350686 ] Konstantin Shvachko commented on HDFS-7886: --- I traced the latest failure with Plamen's fix. It actually points to the next test case {{testTruncateWithDataNodesShutdownImmediately()}}. 1. I think we need to add {{checkBlockRecovery()}} after restarting the DataNodes, and check the file length before deleting. Otherwise {{testTruncateWithDataNodesShutdownImmediately()}} seems incomplete without checking anything. I did that, but then {{testCopyOnTruncateWithDataNodesRestart()}} fails. The symptom is the same - the assert error, but I think there may be a race condition between block recovery, which starts after the first block report and the second block report, which is explicitly triggered in the test. Yi, I don't think we need to trigger block reports as restarting node will send one immediately after restarting. Triggering causes the second block report. 2. I think we can fix the test by removing {{triggerBlockReports()}} after restarting DNs. But we still need to investigate the potential race between block recovery and block reporting. In a different jira probably. So I think Plamen's fix is right it just didn't cover all test cases. I know it is time consuming, because you need to run it several times before it fails - the nature of randomized tests. By adding waits on expected conditions we make it more deterministic. > TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes > > > Key: HDFS-7886 > URL: https://issues.apache.org/jira/browse/HDFS-7886 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.0 >Reporter: Yi Liu >Assignee: Plamen Jeliazkov >Priority: Minor > Attachments: HDFS-7886.patch > > > https://builds.apache.org/job/PreCommit-HDFS-Build/9730//testReport/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
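Sketch of the adjustment described above for {{testTruncateWithDataNodesShutdownImmediately()}}: wait for block recovery instead of triggering a second block report, then verify the truncated length before deleting. {{checkBlockRecovery()}} is the existing helper mentioned in the comment; the other variable names are placeholders:
{code:java}
cluster.restartDataNodes();
cluster.waitActive();
checkBlockRecovery(srcPath);   // wait for recovery; no explicit triggerBlockReports()
assertEquals(newLength, fs.getFileStatus(srcPath).getLen());
fs.delete(srcPath, false);
{code}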
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350684#comment-14350684 ] Haohui Mai commented on HDFS-6200: -- bq. For one, we don't have to worry about what dependencies we bring with us in the internal case because by definition we're in control of both the client interface and the place it's being used. bq. In the approach I'm suggesting the original code for the client would still live in hadoop-hdfs, so the webhdfs server would be free to use on DFSClient. If that is unappealing for some reason, perhaps we should structure things with an internal client artifact. e.g. What is the point of creating yet another internal jar if you can simply hide {{DFSClient}} in local package? > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contain both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-cliient, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7885) Datanode should not trust the generation stamp provided by client
[ https://issues.apache.org/jira/browse/HDFS-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350651#comment-14350651 ] Hadoop QA commented on HDFS-7885: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12702968/h7885_20150306.patch against trunk revision 95bfd08. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestFileTruncate The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9775//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9775//console This message is automatically generated. > Datanode should not trust the generation stamp provided by client > - > > Key: HDFS-7885 > URL: https://issues.apache.org/jira/browse/HDFS-7885 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.2.0 >Reporter: vitthal (Suhas) Gogate >Assignee: Tsz Wo Nicholas Sze >Priority: Critical > Attachments: h7885_20150305.patch, h7885_20150306.patch > > > Datanode should not trust the generation stamp provided by client, since it > is prefetched and buffered in client, and concurrent append may increase it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-7901) Fix findbug warning in org.apache.hadoop.hdfs.web.resources.OffsetParam.getOffset()
[ https://issues.apache.org/jira/browse/HDFS-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula resolved HDFS-7901. Resolution: Duplicate > Fix findbug warning in > org.apache.hadoop.hdfs.web.resources.OffsetParam.getOffset() > --- > > Key: HDFS-7901 > URL: https://issues.apache.org/jira/browse/HDFS-7901 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > > {noformat} > > Bug type DM_NUMBER_CTOR (click for details) > In class org.apache.hadoop.hdfs.web.resources.OffsetParam > In method org.apache.hadoop.hdfs.web.resources.OffsetParam.getOffset() > Called method new Long(long) > Should call Long.valueOf(long) instead > At OffsetParam.java:[line 52] > {noformat} > https://builds.apache.org/job/PreCommit-HDFS-Build/9767//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7901) Fix findbug warning in org.apache.hadoop.hdfs.web.resources.OffsetParam.getOffset()
Brahma Reddy Battula created HDFS-7901: -- Summary: Fix findbug warning in org.apache.hadoop.hdfs.web.resources.OffsetParam.getOffset() Key: HDFS-7901 URL: https://issues.apache.org/jira/browse/HDFS-7901 Project: Hadoop HDFS Issue Type: Bug Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula {noformat} Bug type DM_NUMBER_CTOR (click for details) In class org.apache.hadoop.hdfs.web.resources.OffsetParam In method org.apache.hadoop.hdfs.web.resources.OffsetParam.getOffset() Called method new Long(long) Should call Long.valueOf(long) instead At OffsetParam.java:[line 52] {noformat} https://builds.apache.org/job/PreCommit-HDFS-Build/9767//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
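For reference, the change the DM_NUMBER_CTOR warning asks for is a one-liner: use the cached boxing path instead of allocating a new wrapper (the value name below is a placeholder for whatever OffsetParam holds at line 52):
{code:java}
Long boxed = Long.valueOf(offsetValue);    // preferred: may reuse a cached Long
Long wasteful = new Long(offsetValue);     // what findbugs flags
{code}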
[jira] [Commented] (HDFS-7818) DataNode throws NPE if the WebHdfs URL does not contain the offset parameter
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350491#comment-14350491 ] Brahma Reddy Battula commented on HDFS-7818: can I close HDFS-7901..? Please let me know,thanks.. > DataNode throws NPE if the WebHdfs URL does not contain the offset parameter > > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt, HDFS-7818.v5.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7855) Separate class Packet from DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350465#comment-14350465 ] Hudson commented on HDFS-7855: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2074 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2074/]) HDFS-7855. Separate class Packet from DFSOutputStream. Contributed by Li Bo. (jing9: rev 952640fa4cbdc23fe8781e5627c2e8eab565c535) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSPacket.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSPacket.java > Separate class Packet from DFSOutputStream > -- > > Key: HDFS-7855 > URL: https://issues.apache.org/jira/browse/HDFS-7855 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: dfsclient >Reporter: Li Bo >Assignee: Li Bo > Fix For: 2.7.0 > > Attachments: HDFS-7855-001.patch, HDFS-7855-002.patch, > HDFS-7855-003.patch, HDFS-7855-004.patch, HDFS-7855-005.patch, > HDFS-7855-006.patch, HDFS-7855-007.patch > > > Class Packet is an inner class in DFSOutputStream and also used by > DataStreamer. This sub task separates Packet out of DFSOutputStream to aid > the separation in HDFS-7854. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7818) DataNode throws NPE if the WebHdfs URL does not contain the offset parameter
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350453#comment-14350453 ] Hadoop QA commented on HDFS-7818: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703065/HDFS-7818.v5.txt against trunk revision 95bfd08. {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9780//console This message is automatically generated. > DataNode throws NPE if the WebHdfs URL does not contain the offset parameter > > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt, HDFS-7818.v5.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7433) Optimize performance of DatanodeManager's node map
[ https://issues.apache.org/jira/browse/HDFS-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350444#comment-14350444 ] Hadoop QA commented on HDFS-7433: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12692603/HDFS-7433.patch against trunk revision 95bfd08. {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9779//console This message is automatically generated. > Optimize performance of DatanodeManager's node map > -- > > Key: HDFS-7433 > URL: https://issues.apache.org/jira/browse/HDFS-7433 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Critical > Attachments: HDFS-7433.patch, HDFS-7433.patch, HDFS-7433.patch > > > The datanode map is currently a {{TreeMap}}. For many thousands of > datanodes, tree lookups are ~10X more expensive than a {{HashMap}}. > Insertions and removals are up to 100X more expensive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
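For context, the direction the HDFS-7433 summary above points at looks roughly like this inside DatanodeManager (field and key types are illustrative; the actual patch may differ):
{code:java}
// Before: sorted map keyed by the datanode identifier; O(log n) lookups with
// comparison-heavy inserts and removals.
//   private final NavigableMap<String, DatanodeDescriptor> datanodeMap =
//       new TreeMap<String, DatanodeDescriptor>();

// After (sketch): constant-time average lookups; build a sorted view only in
// the few places that actually need ordering.
private final Map<String, DatanodeDescriptor> datanodeMap =
    new HashMap<String, DatanodeDescriptor>();
{code}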
[jira] [Commented] (HDFS-7433) Optimize performance of DatanodeManager's node map
[ https://issues.apache.org/jira/browse/HDFS-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350430#comment-14350430 ] Hadoop QA commented on HDFS-7433: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12692603/HDFS-7433.patch against trunk revision 95bfd08. {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9778//console This message is automatically generated. > Optimize performance of DatanodeManager's node map > -- > > Key: HDFS-7433 > URL: https://issues.apache.org/jira/browse/HDFS-7433 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Critical > Attachments: HDFS-7433.patch, HDFS-7433.patch, HDFS-7433.patch > > > The datanode map is currently a {{TreeMap}}. For many thousands of > datanodes, tree lookups are ~10X more expensive than a {{HashMap}}. > Insertions and removals are up to 100X more expensive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7855) Separate class Packet from DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350427#comment-14350427 ] Hudson commented on HDFS-7855: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #124 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/124/]) HDFS-7855. Separate class Packet from DFSOutputStream. Contributed by Li Bo. (jing9: rev 952640fa4cbdc23fe8781e5627c2e8eab565c535) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSPacket.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSPacket.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java > Separate class Packet from DFSOutputStream > -- > > Key: HDFS-7855 > URL: https://issues.apache.org/jira/browse/HDFS-7855 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: dfsclient >Reporter: Li Bo >Assignee: Li Bo > Fix For: 2.7.0 > > Attachments: HDFS-7855-001.patch, HDFS-7855-002.patch, > HDFS-7855-003.patch, HDFS-7855-004.patch, HDFS-7855-005.patch, > HDFS-7855-006.patch, HDFS-7855-007.patch > > > Class Packet is an inner class in DFSOutputStream and also used by > DataStreamer. This sub task separates Packet out of DFSOutputStream to aid > the separation in HDFS-7854. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7818) DataNode throws NPE if the WebHdfs URL does not contain the offset parameter
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne updated HDFS-7818: - Attachment: HDFS-7818.v5.txt Fixing findbugs warning and updating patch to v5. > DataNode throws NPE if the WebHdfs URL does not contain the offset parameter > > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt, HDFS-7818.v5.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7855) Separate class Packet from DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350412#comment-14350412 ] Hudson commented on HDFS-7855: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2056 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2056/]) HDFS-7855. Separate class Packet from DFSOutputStream. Contributed by Li Bo. (jing9: rev 952640fa4cbdc23fe8781e5627c2e8eab565c535) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSPacket.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSPacket.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > Separate class Packet from DFSOutputStream > -- > > Key: HDFS-7855 > URL: https://issues.apache.org/jira/browse/HDFS-7855 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: dfsclient >Reporter: Li Bo >Assignee: Li Bo > Fix For: 2.7.0 > > Attachments: HDFS-7855-001.patch, HDFS-7855-002.patch, > HDFS-7855-003.patch, HDFS-7855-004.patch, HDFS-7855-005.patch, > HDFS-7855-006.patch, HDFS-7855-007.patch > > > Class Packet is an inner class in DFSOutputStream and also used by > DataStreamer. This sub task separates Packet out of DFSOutputStream to aid > the separation in HDFS-7854. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7433) Optimize performance of DatanodeManager's node map
[ https://issues.apache.org/jira/browse/HDFS-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350415#comment-14350415 ] Hadoop QA commented on HDFS-7433: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12692603/HDFS-7433.patch against trunk revision 95bfd08. {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9777//console This message is automatically generated. > Optimize performance of DatanodeManager's node map > -- > > Key: HDFS-7433 > URL: https://issues.apache.org/jira/browse/HDFS-7433 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Critical > Attachments: HDFS-7433.patch, HDFS-7433.patch, HDFS-7433.patch > > > The datanode map is currently a {{TreeMap}}. For many thousands of > datanodes, tree lookups are ~10X more expensive than a {{HashMap}}. > Insertions and removals are up to 100X more expensive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7855) Separate class Packet from DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350408#comment-14350408 ] Hudson commented on HDFS-7855: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #115 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/115/]) HDFS-7855. Separate class Packet from DFSOutputStream. Contributed by Li Bo. (jing9: rev 952640fa4cbdc23fe8781e5627c2e8eab565c535) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSPacket.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSPacket.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java > Separate class Packet from DFSOutputStream > -- > > Key: HDFS-7855 > URL: https://issues.apache.org/jira/browse/HDFS-7855 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: dfsclient >Reporter: Li Bo >Assignee: Li Bo > Fix For: 2.7.0 > > Attachments: HDFS-7855-001.patch, HDFS-7855-002.patch, > HDFS-7855-003.patch, HDFS-7855-004.patch, HDFS-7855-005.patch, > HDFS-7855-006.patch, HDFS-7855-007.patch > > > Class Packet is an inner class in DFSOutputStream and also used by > DataStreamer. This sub task separates Packet out of DFSOutputStream to aid > the separation in HDFS-7854. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6806) HDFS Rolling upgrade document should mention the versions available
[ https://issues.apache.org/jira/browse/HDFS-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350403#comment-14350403 ] Hadoop QA commented on HDFS-6806: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703011/HDFS-6806.3.patch against trunk revision 95bfd08. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestCrcCorruption The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestAppendSnapshotTruncate Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9772//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9772//console This message is automatically generated. > HDFS Rolling upgrade document should mention the versions available > --- > > Key: HDFS-6806 > URL: https://issues.apache.org/jira/browse/HDFS-6806 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 2.4.0 >Reporter: Akira AJISAKA >Assignee: J.Andreina >Priority: Minor > Labels: newbie > Attachments: HDFS-6806.1.patch, HDFS-6806.2.patch, HDFS-6806.3.patch > > > We should document that rolling upgrades do not support upgrades from ~2.3 to > 2.4+. It has been asked in the user ML many times. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7818) DataNode throws NPE if the WebHdfs URL does not contain the offset parameter
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne updated HDFS-7818: - Priority: Blocker (was: Critical) Target Version/s: 2.7.0 Marking as a blocker since this is a very common scenario when using webHDFS, and it hits the NPE every time. The only workaround is to use HDFS instead of the webHDFS interface, but that is not always an option when reading cross-colo or off grid. > DataNode throws NPE if the WebHdfs URL does not contain the offset parameter > > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
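Editor's note, not the fix in the attached patches (consult those directly), just the general defensive pattern for an optional numeric query parameter like {{offset}}: default it when absent instead of letting a null {{Long}} reach an unboxing site, which is what produces a NullPointerException of this shape. The helper below is hypothetical; the real WebHDFS parameter classes differ.
{code}
final class OffsetParamSketch {

  /** Returns 0 when the "offset" query parameter was not supplied. */
  static long parseOffset(String rawValue) {
    if (rawValue == null || rawValue.isEmpty()) {
      return 0L;                      // absent parameter: read from the start of the file
    }
    return Long.parseLong(rawValue);  // malformed values still fail loudly
  }

  public static void main(String[] args) {
    System.out.println(parseOffset(null));    // 0
    System.out.println(parseOffset("4096"));  // 4096
  }
}
{code}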
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350369#comment-14350369 ] Sean Busbey commented on HDFS-6200: --- The dependencies you bring with you are an integral part of the interface you define for downstream clients. While I agree that it can be a separate subtask, it has to be considered as part of how you structure the overall approach. {quote} Unfortunately the dependency is a real one – the webhdfs server on DN uses DFSClient to read data from HDFS. {quote} Our own internal use of client interfaces isn't the same thing as use by downstream applications. For one, we don't have to worry about what dependencies we bring with us in the internal case, because by definition we're in control of both the client interface and the place it's being used. In the approach I'm suggesting, the original code for the client would still live in hadoop-hdfs, so the webhdfs server would be free to keep using DFSClient. If that is unappealing for some reason, perhaps we should structure things with an internal client artifact. e.g. {noformat} hadoop-hdfs -- depends on --> hadoop-hdfs-client-internal hadoop-hdfs-client -- depends on --> hadoop-hdfs-client-internal {noformat} > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contains both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependencies in order to > access hdfs. These additional dependencies can sometimes be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-client, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of hadoop-hdfs to avoid unnecessary dependencies. > Note that it does not break the compatibility of downstream projects. This is > because existing downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
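Editor's note on the downstream perspective, illustrative rather than taken from the patch: typical downstream read code only touches the API classes in hadoop-common ({{FileSystem}}, {{Path}}, {{FSDataInputStream}}), and pulls the full {{hadoop-hdfs}} jar only to get the {{hdfs://}} implementation behind them at runtime; a client-only artifact would cover that need without the server side. The namenode URI and file path below are placeholders.
{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DownstreamReadExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Placeholder URI and path; the point is which jars this needs, not the data.
    try (FileSystem fs = FileSystem.get(URI.create("hdfs://namenode.example.com:8020/"), conf);
         FSDataInputStream in = fs.open(new Path("/tmp/example.txt"));
         BufferedReader reader =
             new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        System.out.println(line);
      }
    }
  }
}
{code}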
[jira] [Commented] (HDFS-7855) Separate class Packet from DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350305#comment-14350305 ] Hudson commented on HDFS-7855: -- FAILURE: Integrated in Hadoop-Yarn-trunk #858 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/858/]) HDFS-7855. Separate class Packet from DFSOutputStream. Contributed by Li Bo. (jing9: rev 952640fa4cbdc23fe8781e5627c2e8eab565c535) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSPacket.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSPacket.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java > Separate class Packet from DFSOutputStream > -- > > Key: HDFS-7855 > URL: https://issues.apache.org/jira/browse/HDFS-7855 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: dfsclient >Reporter: Li Bo >Assignee: Li Bo > Fix For: 2.7.0 > > Attachments: HDFS-7855-001.patch, HDFS-7855-002.patch, > HDFS-7855-003.patch, HDFS-7855-004.patch, HDFS-7855-005.patch, > HDFS-7855-006.patch, HDFS-7855-007.patch > > > Class Packet is an inner class in DFSOutputStream and also used by > DataStreamer. This sub task separates Packet out of DFSOutputStream to aid > the separation in HDFS-7854. -- This message was sent by Atlassian JIRA (v6.3.4#6332)