[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351449#comment-14351449 ] Chris Douglas commented on HDFS-7411: - Looked through the patch; it addresses the feedback. [~szetszwo], do you want to review the patch before commit? > Refactor and improve decommissioning logic into DecommissionManager > --- > > Key: HDFS-7411 > URL: https://issues.apache.org/jira/browse/HDFS-7411 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.5.1 >Reporter: Andrew Wang >Assignee: Andrew Wang > Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, > hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, > hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch, > hdfs-7411.009.patch, hdfs-7411.010.patch, hdfs-7411.011.patch > > > Would be nice to split out decommission logic from DatanodeManager to > DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7838) Expose truncate API for libhdfs
[ https://issues.apache.org/jira/browse/HDFS-7838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-7838: - Attachment: HDFS-7838.002.patch Thanks Colin for the review; I've updated the patch to address your comments. {quote} Also, can you add a stub hdfsTruncateFile function to libwebhdfs that returns ENOTSUP, and file a jira to add truncate support to libwebhdfs? or just implement it in this patch, your choice. {quote} I filed HDFS-7902 for it, thanks. > Expose truncate API for libhdfs > --- > > Key: HDFS-7838 > URL: https://issues.apache.org/jira/browse/HDFS-7838 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode >Affects Versions: 2.7.0 >Reporter: Yi Liu >Assignee: Yi Liu > Fix For: 2.7.0 > > Attachments: HDFS-7838.001.patch, HDFS-7838.002.patch > > > It's good to expose truncate in libhdfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7902) Expose truncate API for libwebhdfs
Yi Liu created HDFS-7902: Summary: Expose truncate API for libwebhdfs Key: HDFS-7902 URL: https://issues.apache.org/jira/browse/HDFS-7902 Project: Hadoop HDFS Issue Type: Improvement Components: native, webhdfs Affects Versions: 2.7.0 Reporter: Yi Liu Assignee: Yi Liu As Colin suggested in HDFS-7838, we will add truncate support for libwebhdfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7312) Update DistCp v1 to optionally not use tmp location (branch-1 only)
[ https://issues.apache.org/jira/browse/HDFS-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-7312: Labels: (was: reviewed) > Update DistCp v1 to optionally not use tmp location (branch-1 only) > --- > > Key: HDFS-7312 > URL: https://issues.apache.org/jira/browse/HDFS-7312 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Affects Versions: 1.2.1 >Reporter: Joseph Prosser >Assignee: Joseph Prosser >Priority: Minor > Fix For: 1.3.0 > > Attachments: HDFS-7312.001.patch, HDFS-7312.002.patch, > HDFS-7312.003.patch, HDFS-7312.004.patch, HDFS-7312.005.patch, > HDFS-7312.006.patch, HDFS-7312.007.patch, HDFS-7312.008.patch, HDFS-7312.patch > > Original Estimate: 72h > Remaining Estimate: 72h > > DistCp v1 currently copies files to a tmp location and then renames that to > the specified destination. This can cause performance issues on filesystems > such as S3. A -skiptmp flag will be added to bypass this step and copy > directly to the destination. This feature mirrors a similar one added to > HBase ExportSnapshot > [HBASE-9|https://issues.apache.org/jira/browse/HBASE-9] > NOTE: This is a branch-1 change only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7312) Update DistCp v1 to optionally not use tmp location (branch-1 only)
[ https://issues.apache.org/jira/browse/HDFS-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-7312: Hadoop Flags: Reviewed > Update DistCp v1 to optionally not use tmp location (branch-1 only) > --- > > Key: HDFS-7312 > URL: https://issues.apache.org/jira/browse/HDFS-7312 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Affects Versions: 1.2.1 >Reporter: Joseph Prosser >Assignee: Joseph Prosser >Priority: Minor > Labels: reviewed > Fix For: 1.3.0 > > Attachments: HDFS-7312.001.patch, HDFS-7312.002.patch, > HDFS-7312.003.patch, HDFS-7312.004.patch, HDFS-7312.005.patch, > HDFS-7312.006.patch, HDFS-7312.007.patch, HDFS-7312.008.patch, HDFS-7312.patch > > Original Estimate: 72h > Remaining Estimate: 72h > > DistCp v1 currently copies files to a tmp location and then renames that to > the specified destination. This can cause performance issues on filesystems > such as S3. A -skiptmp flag will be added to bypass this step and copy > directly to the destination. This feature mirrors a similar one added to > HBase ExportSnapshot > [HBASE-9|https://issues.apache.org/jira/browse/HBASE-9] > NOTE: This is a branch-1 change only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7312) Update DistCp v1 to optionally not use tmp location (branch-1 only)
[ https://issues.apache.org/jira/browse/HDFS-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-7312: Labels: reviewed (was: ) > Update DistCp v1 to optionally not use tmp location (branch-1 only) > --- > > Key: HDFS-7312 > URL: https://issues.apache.org/jira/browse/HDFS-7312 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Affects Versions: 1.2.1 >Reporter: Joseph Prosser >Assignee: Joseph Prosser >Priority: Minor > Labels: reviewed > Fix For: 1.3.0 > > Attachments: HDFS-7312.001.patch, HDFS-7312.002.patch, > HDFS-7312.003.patch, HDFS-7312.004.patch, HDFS-7312.005.patch, > HDFS-7312.006.patch, HDFS-7312.007.patch, HDFS-7312.008.patch, HDFS-7312.patch > > Original Estimate: 72h > Remaining Estimate: 72h > > DistCp v1 currently copies files to a tmp location and then renames that to > the specified destination. This can cause performance issues on filesystems > such as S3. A -skiptmp flag will be added to bypass this step and copy > directly to the destination. This feature mirrors a similar one added to > HBase ExportSnapshot > [HBASE-9|https://issues.apache.org/jira/browse/HBASE-9] > NOTE: This is a branch-1 change only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351384#comment-14351384 ] stack commented on HDFS-7844: - Late to the game. Patch is great. Creative. All comments can be addressed later. High level, any notion of difference in perf when comparing native to offheap to the current implementation? If we fail to pick up the configured memory manager (or the default), it's worth a WARN log. Otherwise, folks may be confounded that they are getting the native memory manager though they asked for something else: 83 public static MemoryManager create(String name, Configuration conf) { 84 String memoryManagerKey = conf.get( 85 CommonConfigurationKeys.HADOOP_MEMORY_MANAGER_KEY, 86 CommonConfigurationKeys.HADOOP_MEMORY_MANAGER_DEFAULT); 87 if (memoryManagerKey == null) { 88 memoryManagerKey = NativeMemoryManager.class.getCanonicalName(); 89 } Is this an arbitrary max? private final static long MAX_ADDRESS = 0x3fffL; ByteArrayMemoryManager is just throwaway, for testing? Otherwise, protect Log.TRACE with LOG.isTraceEnabled... It's fun the way you did ByteArrayMemoryManager mapping address to a Map. Ok, I see, BAMM is just for testing. Ignore above. nit: make a method rather than dup the below...: 145 Entry entry = buffers.floorEntry(Long.valueOf(addr)); 146 if (entry == null) { 147 throw new RuntimeException("Wrote to unallocated address 0x" + 148 Long.toHexString(addr)); 149 } The method would return a byte array gotten from the TreeMap... etc. Does logging open at DEBUG but close at TRACE lead to confusion? Stumped debugger? The close has to let out an IOE? What is the caller going to do w/ this IOE? The ByteArrayMemoryManager close error string construction is the same as close on ProbingHashTable? Yeah man, put the Log.TRACE behind a test for TRACEyness. I buy the compactness invariant. I like the compromise put upon the Iterator (that resize is allowed while iterating...) Seems appropriate given where this is to be deployed. On TestMemoryManager, maybe parameterize so we go once through with ByteArrayMemoryManager and then do a run with the offheap implementation, rather than have a dedicated test for each: https://github.com/junit-team/junit/wiki/Parameterized-tests TestProbingHashTable is fun with its BlockInfo, etc., implementations. > Create an off-heap hash table implementation > > > Key: HDFS-7844 > URL: https://issues.apache.org/jira/browse/HDFS-7844 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7836 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-7844-scl.001.patch, HDFS-7844-scl.002.patch, > HDFS-7844-scl.003.patch > > > Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
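Regarding the parameterized-test suggestion above, a minimal sketch of the JUnit 4 {{Parameterized}} structure follows; the memory-manager class names and the {{MemoryManager.create()}} call are assumptions taken from the snippet quoted in the review, not from the actual patch.
{code}
import java.util.Arrays;
import java.util.Collection;

import org.junit.Assert;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;

// Hedged sketch: run the same test body once per MemoryManager implementation
// instead of keeping a dedicated test class for each.
@RunWith(Parameterized.class)
public class TestMemoryManagerParameterized {
  @Parameters(name = "{0}")
  public static Collection<Object[]> managers() {
    return Arrays.asList(new Object[][] {
        { "ByteArrayMemoryManager" },   // on-heap, test-only implementation
        { "NativeMemoryManager" }       // off-heap implementation
    });
  }

  private final String managerName;

  public TestMemoryManagerParameterized(String managerName) {
    this.managerName = managerName;
  }

  @Test
  public void testAllocateAndFree() {
    // The real body would call something like
    // MemoryManager.create(managerName, conf), allocate, write, read back,
    // and free; those APIs belong to the patch under review and are only
    // referenced here, not exercised.
    Assert.assertNotNull(managerName);
  }
}
{code}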
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351346#comment-14351346 ] Zhe Zhang commented on HDFS-6450: - Thanks for the suggestion, Colin. Hedged pread is already handling BlockReaderLocal (by wrapping it as a Future). I guess it's reasonable to do the same for non-positional read too? Is it correct to understand hedged vs. non-hedged and positional vs. non-positional as orthogonal dimensions? If so, the only new requirement in hedged non-positional read is to utilize and maintain the states (pos, blockReader). I'm still getting my head around this complex reader code, so please let me know if I missed something. > Support non-positional hedged reads in HDFS > --- > > Key: HDFS-6450 > URL: https://issues.apache.org/jira/browse/HDFS-6450 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.4.0 >Reporter: Colin Patrick McCabe >Assignee: Liang Xie > Attachments: HDFS-6450-like-pread.txt > > > HDFS-5776 added support for hedged positional reads. We should also support > hedged non-positional reads (aka regular reads). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
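For readers outside this thread, the sketch below shows the generic hedged-read pattern being discussed: wrap the read as a Future and race a second replica after a timeout. It is deliberately independent of the real DFSInputStream internals; the Callable stand-ins for replica reads are assumptions, not the project's API.
{code}
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Generic hedged-request pattern: submit the primary read as a Future; if it
// has not finished within the hedge timeout, submit a read against another
// replica and return whichever completes first.
public class HedgedReadSketch {
  private final ExecutorService pool = Executors.newCachedThreadPool();

  public byte[] hedgedRead(Callable<byte[]> primaryReplicaRead,
                           Callable<byte[]> alternateReplicaRead,
                           long hedgeTimeoutMs) throws Exception {
    CompletionService<byte[]> cs = new ExecutorCompletionService<>(pool);
    cs.submit(primaryReplicaRead);
    Future<byte[]> done = cs.poll(hedgeTimeoutMs, TimeUnit.MILLISECONDS);
    if (done == null) {
      cs.submit(alternateReplicaRead);   // hedge: race a second replica
      done = cs.take();                  // first of the two to finish wins
    }
    // A real implementation would also cancel the losing read and maintain
    // the stream state (pos, blockReader) mentioned in the comment above.
    return done.get();
  }
}
{code}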
[jira] [Commented] (HDFS-5796) The file system browser in the namenode UI requires SPNEGO.
[ https://issues.apache.org/jira/browse/HDFS-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351319#comment-14351319 ] Allen Wittenauer commented on HDFS-5796: (And, just to make clear the impact, this issue prevents us from upgrading Hadoop.) > The file system browser in the namenode UI requires SPNEGO. > --- > > Key: HDFS-5796 > URL: https://issues.apache.org/jira/browse/HDFS-5796 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.5.0 >Reporter: Kihwal Lee >Assignee: Arun Suresh >Priority: Blocker > Attachments: HDFS-5796.1.patch, HDFS-5796.1.patch, HDFS-5796.2.patch, > HDFS-5796.3.patch, HDFS-5796.3.patch > > > After HDFS-5382, the browser makes webhdfs REST calls directly, requiring > SPNEGO to work between user's browser and namenode. This won't work if the > cluster's security infrastructure is isolated from the regular network. > Moreover, SPNEGO is not supposed to be required for user-facing web pages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7877) Support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-7877: -- Attachment: HDFS-7877.patch Supportmaintenancestatefordatanodes.pdf Here are the initial design document and draft patch. Appreciate any input others might have. To support maintenance state, we need to provide an admin interface, manage the datanode state transitions, and handle block-related operations. After we agree on the design, we can break the feature into subtasks. > Support maintenance state for datanodes > --- > > Key: HDFS-7877 > URL: https://issues.apache.org/jira/browse/HDFS-7877 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Ming Ma > Attachments: HDFS-7877.patch, Supportmaintenancestatefordatanodes.pdf > > > This requirement came up during the design for HDFS-7541. Given this feature > is mostly independent of the upgrade domain feature, it is better to track it > under a separate jira. The design and draft patch will be available soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7886) TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes
[ https://issues.apache.org/jira/browse/HDFS-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351313#comment-14351313 ] Konstantin Shvachko commented on HDFS-7886: --- More things that we've been looking at with Plamen. 5. The race condition is in {{FsDatasetImpl.getBlockReports()}}, which collects the references to replicas under the {{synchronized}} section, but then constructs {{BlockListAsLongs}} outside of it. So if the recovery is triggered between them, then a replica can change its state. Here it changes from RUR to FINALIZED. 6. {{testTruncateWithDataNodesRestartImmediately()}} occasionally fails because the block is recovered only on two DNs. This happens because the NN does not know that two DNs were restarted and can schedule block recovery with a mixture of old (before the restart) and new (after the restart) locations. If the old location is used then recovery fails, because the DNs have been restarted under new addresses. {{waitActive()}} doesn't help here. We should somehow check that all new DNs have been registered and sent block reports. > TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes > > > Key: HDFS-7886 > URL: https://issues.apache.org/jira/browse/HDFS-7886 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.0 >Reporter: Yi Liu >Assignee: Plamen Jeliazkov >Priority: Minor > Attachments: HDFS-7886.patch > > > https://builds.apache.org/job/PreCommit-HDFS-Build/9730//testReport/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
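To make item 5 concrete for readers outside this thread, the sketch below contrasts the racy shape (copy the replica references under the lock, serialize outside it) with a safe shape (serialize the reported fields while still holding the lock). It is an illustration only; {{ReplicaInfo}} here is a stand-in, not the actual FsDatasetImpl code.
{code}
import java.util.ArrayList;
import java.util.List;

// Illustration of the race described in item 5; ReplicaInfo is a stand-in.
class BlockReportRaceSketch {
  private final Object datasetLock = new Object();
  private final List<ReplicaInfo> replicas = new ArrayList<ReplicaInfo>();

  // Racy shape: references are copied under the lock, but the report is
  // built outside it, so a replica can change state (e.g. RUR -> FINALIZED)
  // between the two steps.
  List<String> getBlockReportRacy() {
    List<ReplicaInfo> refs;
    synchronized (datasetLock) {
      refs = new ArrayList<ReplicaInfo>(replicas);
    }
    List<String> report = new ArrayList<String>();
    for (ReplicaInfo r : refs) {
      report.add(r.blockId + ":" + r.state);   // state may already be stale
    }
    return report;
  }

  // Safe shape: snapshot the fields being reported while holding the lock.
  List<String> getBlockReportSafe() {
    List<String> report = new ArrayList<String>();
    synchronized (datasetLock) {
      for (ReplicaInfo r : replicas) {
        report.add(r.blockId + ":" + r.state);
      }
    }
    return report;
  }

  static class ReplicaInfo {
    volatile long blockId = 1L;
    volatile String state = "FINALIZED";
  }
}
{code}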
[jira] [Comment Edited] (HDFS-5796) The file system browser in the namenode UI requires SPNEGO.
[ https://issues.apache.org/jira/browse/HDFS-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351307#comment-14351307 ] Allen Wittenauer edited comment on HDFS-5796 at 3/7/15 1:53 AM: bq. when security is enabled, WebHDFS by default picks up SPNEGO + KerberosAuthFilter. So the UI works, but only when the browser is launched after a kinit. If I don't do a kinit, I cannot browse files through the UI - this is the loss of functionality that is being discussed here? No. The key point in that summary is "by default". If you need something that isn't the default, the whole system falls apart. The fundamental problem is that if you use something like the AltKerberos filter, it flat out doesn't work. There are two key problems we've noticed: a) filter parameters don't get passed down to either AltK's SPNEGO filter or a user's custom one b) after we did some custom hacking, we noticed that cookie secret handling is broken. Thus, using a browser to peruse HDFS with custom auth is completely broken in 2.6 and up due to the removal of the old UI. bq. with HDFS-5716, you can turn the KerberosAuthFilter off and replace it with PseudoAuthFilter, but then the UI as well as applications always thinks you are dr.who. So, I guess this is not acceptable? No. HDFS-5716 just flat doesn't work in practice due to the above issues. It isn't reflective of real world usage at all. (.. and, believe me, we've tried to make it work without completely rewriting the built-in AltKerberos filter.) There's a very high chance that HADOOP-10709 might actually fix our issues, but the person who was testing for me today went home ill. :( So hopefully we'll try to verify on Monday. bq. Dr. Who I think Arun was thinking we needed to provide a 'default alternative', but I think we've cleared up that isn't actually necessary. The 'default alternative' really is the AltKerberos filter that already ships with Hadoop. was (Author: aw): bq. when security is enabled, WebHDFS by default picks up SPNEGO + KerberosAuthFilter. So the UI works, but only when the browser is launched after a kinit. If I don't do a kinit, I cannot browse files through the UI - this is the loss of functionality that is being discussed here? No. The key point in that summary is "by default". If you need something that isn't the default, the whole system falls apart. The fundamental problem is that if you use something like the AltKerberos filter, it flat out doesn't work. There are two key problems we've noticed: a) filter parameters don't get passed down to either AltK's SPNEGO filter or a user's custom one b) after we did some custom hacking, we noticed that cookie secret handling is broken. Thus, using a browser to peruse HDFS is completely broken in 2.6 and up due to the removal of the old UI. bq. with HDFS-5716, you can turn the KerberosAuthFilter off and replace it with PseudoAuthFilter, but then the UI as well as applications always thinks you are dr.who. So, I guess this is not acceptable? No. HDFS-5716 just flat doesn't work in practice due to the above issues. It isn't reflective of real world usage at all. (.. and, believe me, we've tried to make it work without completely rewriting the built-in AltKerberos filter.) There's a very high chance that HADOOP-10709 might actually fix our issues, but the person who was testing for me today went home ill. :( So hopefully we'll try to verify on Monday. > The file system browser in the namenode UI requires SPNEGO.
> --- > > Key: HDFS-5796 > URL: https://issues.apache.org/jira/browse/HDFS-5796 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.5.0 >Reporter: Kihwal Lee >Assignee: Arun Suresh >Priority: Blocker > Attachments: HDFS-5796.1.patch, HDFS-5796.1.patch, HDFS-5796.2.patch, > HDFS-5796.3.patch, HDFS-5796.3.patch > > > After HDFS-5382, the browser makes webhdfs REST calls directly, requiring > SPNEGO to work between user's browser and namenode. This won't work if the > cluster's security infrastructure is isolated from the regular network. > Moreover, SPNEGO is not supposed to be required for user-facing web pages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5796) The file system browser in the namenode UI requires SPNEGO.
[ https://issues.apache.org/jira/browse/HDFS-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351307#comment-14351307 ] Allen Wittenauer commented on HDFS-5796: bq. when security is enabled, WebHDFS by default picks up SPNEGO + KerberosAuthFilter. So the UI works, but only when the browser is launched after a kinit. If I don't do a kinit, I cannot browse files through the UI - this is the loss of functionality that is being discussed here? No. The key point in that summary is "by default". If you need something that isn't the default, the whole system falls apart. The fundamental problem is that if you use something like the AltKerberos filter, it flat out doesn't work. There are two key problems we've noticed: a) filter parameters don't get passed down to either AltK's SPNEGO filter or a user's custom one b) after we did some custom hacking, we noticed that cookie secret handling is broken. Thus, using a browser to peruse HDFS is completely broken in 2.6 and up due to the removal of the old UI. bq. with HDFS-5716, you can turn the KerberosAuthFilter off and replace it with PseudoAuthFilter, but then the UI as well as applications always thinks you are dr.who. So, I guess this is not acceptable? No. HDFS-5716 just flat doesn't work in practice due to the above issues. It isn't reflective of real world usage at all. (.. and, believe me, we've tried to make it work without completely rewriting the built-in AltKerberos filter.) There's a very high chance that HADOOP-10709 might actually fix our issues, but the person who was testing for me today went home ill. :( So hopefully we'll try to verify on Monday. > The file system browser in the namenode UI requires SPNEGO. > --- > > Key: HDFS-5796 > URL: https://issues.apache.org/jira/browse/HDFS-5796 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.5.0 >Reporter: Kihwal Lee >Assignee: Arun Suresh >Priority: Blocker > Attachments: HDFS-5796.1.patch, HDFS-5796.1.patch, HDFS-5796.2.patch, > HDFS-5796.3.patch, HDFS-5796.3.patch > > > After HDFS-5382, the browser makes webhdfs REST calls directly, requiring > SPNEGO to work between user's browser and namenode. This won't work if the > cluster's security infrastructure is isolated from the regular network. > Moreover, SPNEGO is not supposed to be required for user-facing web pages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7854) Separate class DataStreamer out of DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351298#comment-14351298 ] Hadoop QA commented on HDFS-7854: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703032/HDFS-7854-002.patch against trunk revision 21101c0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1156 javac compiler warnings (more than the trunk's current 1155 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9785//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9785//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9785//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9785//console This message is automatically generated. > Separate class DataStreamer out of DFSOutputStream > -- > > Key: HDFS-7854 > URL: https://issues.apache.org/jira/browse/HDFS-7854 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Li Bo >Assignee: Li Bo > Attachments: HDFS-7854-001.patch, HDFS-7854-002.patch > > > This sub task separate DataStreamer from DFSOutputStream. New DataStreamer > will accept packets and write them to remote datanodes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351290#comment-14351290 ] Colin Patrick McCabe commented on HDFS-7844: Thanks, Charles. I think we can cover most of that stuff (maybe not the curAddress > MAX_ADDRESS part, but the others...) in a follow-on. Good reviews by you and Yi. > Create an off-heap hash table implementation > > > Key: HDFS-7844 > URL: https://issues.apache.org/jira/browse/HDFS-7844 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7836 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-7844-scl.001.patch, HDFS-7844-scl.002.patch, > HDFS-7844-scl.003.patch > > > Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351282#comment-14351282 ] Charles Lamb commented on HDFS-7844: Thanks Colin, +1, I'll file a follow up jira for the coverage. > Create an off-heap hash table implementation > > > Key: HDFS-7844 > URL: https://issues.apache.org/jira/browse/HDFS-7844 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7836 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-7844-scl.001.patch, HDFS-7844-scl.002.patch, > HDFS-7844-scl.003.patch > > > Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7758) Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead
[ https://issues.apache.org/jira/browse/HDFS-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351259#comment-14351259 ] Lei (Eddy) Xu commented on HDFS-7758: - [~jpallas] What I am going to do is something like this: {code} public static interface ReferredFsVolumeList extends Iterable, Closeable { } /** * Returns a list of volume references. * * The caller must release the reference of each volume by calling * {@link FsVolumeReference#close}. */ public ReferredFsVolumeList getReferredVolumes(); {code} In this way, findbugs should be able to catch it? > Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead > - > > Key: HDFS-7758 > URL: https://issues.apache.org/jira/browse/HDFS-7758 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.6.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7758.000.patch, HDFS-7758.001.patch > > > HDFS-7496 introduced reference-counting of the volume instances being used, to > prevent a race condition when hot swapping a volume. > However, {{FsDatasetSpi#getVolumes()}} can still leak a volume instance > without increasing its reference count. In this JIRA, we retire > {{FsDatasetSpi#getVolumes()}} and propose {{FsDatasetSpi#getVolumeRefs()}} > and similar methods to access {{FsVolume}}. This makes sure that the consumer > of {{FsVolume}} always holds a correct reference count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
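As a usage illustration of the shape proposed above, the self-contained sketch below shows an {{Iterable}} that is also {{Closeable}}, released with try-with-resources; the type and method names are stand-ins, not the actual HDFS classes.
{code}
import java.io.Closeable;
import java.io.IOException;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Stand-in for the proposed ReferredFsVolumeList: iterable and closeable.
interface RefCountedVolumes<T> extends Iterable<T>, Closeable {
}

public class VolumeRefSketch {
  public static void main(String[] args) throws IOException {
    final List<String> names = Arrays.asList("/data/1", "/data/2");
    RefCountedVolumes<String> volumes = new RefCountedVolumes<String>() {
      @Override
      public Iterator<String> iterator() {
        return names.iterator();
      }
      @Override
      public void close() {
        // The real implementation would decrement the reference count of
        // every volume handed out by getReferredVolumes().
        System.out.println("released all volume references");
      }
    };
    // try-with-resources guarantees the references are released, and gives
    // static analyzers such as findbugs a Closeable to track.
    try (RefCountedVolumes<String> v = volumes) {
      for (String name : v) {
        System.out.println("inspecting " + name);
      }
    }
  }
}
{code}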
[jira] [Commented] (HDFS-7758) Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead
[ https://issues.apache.org/jira/browse/HDFS-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351249#comment-14351249 ] Joe Pallas commented on HDFS-7758: -- [~cmccabe], is findbugs actually smart enough to figure out that an iterator is {{Closeable}} at call sites, which only see the {{Iterator}} interface? Or would you need to define a new interface that extends both {{Iterator}} and {{Closeable}}? > Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead > - > > Key: HDFS-7758 > URL: https://issues.apache.org/jira/browse/HDFS-7758 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.6.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7758.000.patch, HDFS-7758.001.patch > > > HDFS-7496 introduced reference-counting of the volume instances being used, to > prevent a race condition when hot swapping a volume. > However, {{FsDatasetSpi#getVolumes()}} can still leak a volume instance > without increasing its reference count. In this JIRA, we retire > {{FsDatasetSpi#getVolumes()}} and propose {{FsDatasetSpi#getVolumeRefs()}} > and similar methods to access {{FsVolume}}. This makes sure that the consumer > of {{FsVolume}} always holds a correct reference count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7782) Read a striping layout file from client side
[ https://issues.apache.org/jira/browse/HDFS-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351243#comment-14351243 ] Zhe Zhang commented on HDFS-7782: - I realized that the current {{DFSInputStream}} non-positional read gets at most 64K at a time. Given that our striping cell size is 1MB by default and is not likely to be configured below 64KB, it doesn't make much sense to maintain multiple blockReaders. > Read a striping layout file from client side > > > Key: HDFS-7782 > URL: https://issues.apache.org/jira/browse/HDFS-7782 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Li Bo >Assignee: Zhe Zhang > Attachments: HDFS-7782-000.patch, HDFS-7782-001.patch > > > A client reading a file should not need to know or handle what > layout the file uses. This sub task adds logic to DFSInputStream to support > reading striping layout files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7816) Unable to open webhdfs paths with "+"
[ https://issues.apache.org/jira/browse/HDFS-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351229#comment-14351229 ] Haohui Mai commented on HDFS-7816: -- {code} + throws IOException { +URI uri; +try { + uri = new URI(decoder.path()); +} catch (java.net.URISyntaxException e) { + throw new IOException("Invalid path:", e); +} {code} For GC reasons it might make more sense to (1) take {{QueryStringDecoder.decodeComponent()}} and make some tweaks based on it, and it is okay to throw {{IllegalArgumentException}} directly (which will be translated to 400). > Unable to open webhdfs paths with "+" > - > > Key: HDFS-7816 > URL: https://issues.apache.org/jira/browse/HDFS-7816 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Jason Lowe >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-7816.patch, HDFS-7816.patch > > > webhdfs requests to open files with % characters in the filename fail because > the filename is not being decoded properly. For example: > $ hadoop fs -cat 'webhdfs://nn/user/somebody/abc%def' > cat: File does not exist: /user/somebody/abc%25def -- This message was sent by Atlassian JIRA (v6.3.4#6332)
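For context on the decoding tweak being discussed, the sketch below percent-decodes a path while leaving '+' untouched (the form-encoding rule of turning '+' into a space is what breaks paths like {{/user/somebody/a+b}}). It throws {{IllegalArgumentException}} on malformed escapes, as suggested above; it is illustrative only and is not the code from the patch.
{code}
import java.nio.charset.StandardCharsets;

// Illustrative percent-decoder for request paths: decodes %XX escapes but
// does not treat '+' as a space, unlike form/query-string decoding.
public final class PathDecoderSketch {
  public static String decodePath(String raw) {
    byte[] out = new byte[raw.length()];
    int n = 0;
    for (int i = 0; i < raw.length(); i++) {
      char c = raw.charAt(i);
      if (c == '%') {
        if (i + 2 >= raw.length()) {
          throw new IllegalArgumentException("Truncated escape in " + raw);
        }
        int hi = Character.digit(raw.charAt(i + 1), 16);
        int lo = Character.digit(raw.charAt(i + 2), 16);
        if (hi < 0 || lo < 0) {
          throw new IllegalArgumentException("Bad escape in " + raw);
        }
        out[n++] = (byte) ((hi << 4) | lo);
        i += 2;
      } else {
        out[n++] = (byte) c;   // '+' and other characters pass through as-is
      }
    }
    return new String(out, 0, n, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    System.out.println(decodePath("/user/somebody/abc%25def")); // abc%def
    System.out.println(decodePath("/user/somebody/a+b"));       // a+b, not "a b"
  }
}
{code}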
[jira] [Updated] (HDFS-7853) Erasure coding: extend LocatedBlocks to support reading from striped files
[ https://issues.apache.org/jira/browse/HDFS-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-7853: Attachment: HDFS-7853.002.patch Update the patch: # simplify BlockInfoStripedUC by removing the index array # fix a bug in FSImageFormatPBINode when loading an empty block array Currently the patch still uses an index array to distinguish LocatedBlock and LocatedStripedBlock. My main concern about using sentinel entries in LocatedBlock is that, with only sentinel entries, we may have to check the file's state to identify the type of the block. It may be better to infer this from the LocatedBlock itself. (But it should also be fine to use sentinel blocks + an extra field to achieve this.) > Erasure coding: extend LocatedBlocks to support reading from striped files > -- > > Key: HDFS-7853 > URL: https://issues.apache.org/jira/browse/HDFS-7853 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Zhe Zhang >Assignee: Jing Zhao > Attachments: HDFS-7853.000.patch, HDFS-7853.001.patch, > HDFS-7853.002.patch > > > We should extend {{LocatedBlocks}} class so {{getBlockLocations}} can work > with striping layout (possibly an extra list specifying the index of each > location in the group) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7722) DataNode#checkDiskError should also remove Storage when error is found.
[ https://issues.apache.org/jira/browse/HDFS-7722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351218#comment-14351218 ] Lei (Eddy) Xu commented on HDFS-7722: - [~cmccabe] Thanks for the review. I will make a patch to address your comments. [~cnauroth] Yes, you are right on this one. Sure, I believe we can hold committing this. A review from you early next week would be much appreciated! To add some background, the rationale of this patch is to provide users a convenient way to fix bad disks without touching configuration files, while also preserving disk failure information for reporting purposes. bq. I would like us to have some means to take corrective action and clear the volume failure information "online". For this concern, I suggest a follow-up JIRA to let {{DataNode#parseChangedVolume}} detect volumes that * are not in {{FsVolumeList}} * are not in {{DFS_DATANODE_DATA_DIR_KEYS}} * and are in {{volumeFailureInfos}}, and report them as {{DataNode#ChangedVolumes#deactiveLocations}}, so that the following logic can clear this failure info if the user _intends_ to do so. > DataNode#checkDiskError should also remove Storage when error is found. > --- > > Key: HDFS-7722 > URL: https://issues.apache.org/jira/browse/HDFS-7722 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7722.000.patch, HDFS-7722.001.patch, > HDFS-7722.002.patch > > > When {{DataNode#checkDiskError}} finds disk errors, it removes all block > metadata from {{FsDatasetImpl}}. However, it does not remove the > corresponding {{DataStorage}} and {{BlockPoolSliceStorage}}. > The result is that we could not directly run {{reconfig}} to hot swap the > failed disks without changing the configuration file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
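A minimal sketch of the follow-up detection logic proposed above: given the set of live volume paths, the configured data directories, and the recorded failures, pick out failed volumes the admin has removed from both. The collection names mirror the comment but are stand-ins for the real DataNode fields.
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Stand-in sketch of the proposed DataNode#parseChangedVolume follow-up.
class ChangedVolumeSketch {
  static List<String> findClearableFailures(
      Set<String> activeVolumes,       // paths currently in FsVolumeList
      Set<String> configuredVolumes,   // paths in dfs.datanode.data.dir
      Set<String> failedVolumes) {     // paths recorded in volumeFailureInfos
    List<String> clearable = new ArrayList<String>();
    for (String path : failedVolumes) {
      // A failed volume that is no longer active and no longer configured is
      // one the admin intends to drop, so its failure info can be cleared.
      if (!activeVolumes.contains(path) && !configuredVolumes.contains(path)) {
        clearable.add(path);
      }
    }
    return clearable;
  }
}
{code}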
[jira] [Commented] (HDFS-7816) Unable to open webhdfs paths with "+"
[ https://issues.apache.org/jira/browse/HDFS-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351208#comment-14351208 ] Haohui Mai commented on HDFS-7816: -- Discussed with [~vinodkv] offline. We can go back to the old behavior in branch-2 and fix it in trunk. Thoughts? > Unable to open webhdfs paths with "+" > - > > Key: HDFS-7816 > URL: https://issues.apache.org/jira/browse/HDFS-7816 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Jason Lowe >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-7816.patch, HDFS-7816.patch > > > webhdfs requests to open files with % characters in the filename fail because > the filename is not being decoded properly. For example: > $ hadoop fs -cat 'webhdfs://nn/user/somebody/abc%def' > cat: File does not exist: /user/somebody/abc%25def -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7816) Unable to open webhdfs paths with "+"
[ https://issues.apache.org/jira/browse/HDFS-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351192#comment-14351192 ] Vinod Kumar Vavilapalli commented on HDFS-7816: --- I may be being completely naive here, but how about we add a parameter or something which specifies RFCCompatibility and turn that off by default for older clients? > Unable to open webhdfs paths with "+" > - > > Key: HDFS-7816 > URL: https://issues.apache.org/jira/browse/HDFS-7816 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Jason Lowe >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-7816.patch, HDFS-7816.patch > > > webhdfs requests to open files with % characters in the filename fail because > the filename is not being decoded properly. For example: > $ hadoop fs -cat 'webhdfs://nn/user/somebody/abc%def' > cat: File does not exist: /user/somebody/abc%25def -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7722) DataNode#checkDiskError should also remove Storage when error is found.
[ https://issues.apache.org/jira/browse/HDFS-7722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351167#comment-14351167 ] Chris Nauroth commented on HDFS-7722: - [~eddyxu], sorry I haven't had a chance to dig into this patch yet. If I understand correctly, you're saying that removing a path from configuration and running reconfig will not clear volume failure information, but keeping the path in configuration, fixing the disk at that mount point and running reconfig will clear it. Do I have it right? I would like us to have some means to take corrective action and clear the volume failure information "online". As long as that's still possible in some way, then it's probably sticking to the spirit of the code I wrote earlier. Would you mind holding off the commit until early next week so I can take a closer look? Thanks! > DataNode#checkDiskError should also remove Storage when error is found. > --- > > Key: HDFS-7722 > URL: https://issues.apache.org/jira/browse/HDFS-7722 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7722.000.patch, HDFS-7722.001.patch, > HDFS-7722.002.patch > > > When {{DataNode#checkDiskError}} finds disk errors, it removes all block > metadata from {{FsDatasetImpl}}. However, it does not remove the > corresponding {{DataStorage}} and {{BlockPoolSliceStorage}}. > The result is that we could not directly run {{reconfig}} to hot swap the > failed disks without changing the configuration file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351163#comment-14351163 ] Charles Lamb commented on HDFS-7844: I applied your latest patch and set breakpoints at all of the exceptional throws in ByteArrayMemoryManager.java. Then I ran the unit test. The following lines did not trigger: 91, 94, 117, 129, 135, 165, 171, 190, 203, 245, 251. I think those are the exceptions in allocate, free, one of the ones in putShort, and all of the throws in the getters. > Create an off-heap hash table implementation > > > Key: HDFS-7844 > URL: https://issues.apache.org/jira/browse/HDFS-7844 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7836 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-7844-scl.001.patch, HDFS-7844-scl.002.patch, > HDFS-7844-scl.003.patch > > > Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5796) The file system browser in the namenode UI requires SPNEGO.
[ https://issues.apache.org/jira/browse/HDFS-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351156#comment-14351156 ] Vinod Kumar Vavilapalli commented on HDFS-5796: --- Hey everyone, I've been trying to understand the problem here, but it is a big wall of text. It'll be great if someone can help me. It seems like # when security is enabled, WebHDFS by default picks up SPNEGO + KerberosAuthFilter. So the UI works, but only when the browser is launched after a kinit. If I don't do a kinit, I cannot browse files through the UI - this is the loss of functionality that is being discussed here? # with HDFS-5716, you can turn the KerberosAuthFilter off and replace it with PseudoAuthFilter, but then the UI as well as applications always thinks you are dr.who. So, I guess this is not acceptable? # Is the patch trying to add (back) in a way to use KerberosAuthFilter for regular applications but use Dr.Who for browsers? And that is a security concern, so we don't want to put it back? Going back to the title, "The file system browser in the namenode UI requires SPNEGO.". Seems like with HDFS-5716, you can set your own filter and so the discussion is really about the defaults? Trying to gauge its priority for 2.7. Thanks. > The file system browser in the namenode UI requires SPNEGO. > --- > > Key: HDFS-5796 > URL: https://issues.apache.org/jira/browse/HDFS-5796 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.5.0 >Reporter: Kihwal Lee >Assignee: Arun Suresh >Priority: Blocker > Attachments: HDFS-5796.1.patch, HDFS-5796.1.patch, HDFS-5796.2.patch, > HDFS-5796.3.patch, HDFS-5796.3.patch > > > After HDFS-5382, the browser makes webhdfs REST calls directly, requiring > SPNEGO to work between user's browser and namenode. This won't work if the > cluster's security infrastructure is isolated from the regular network. > Moreover, SPNEGO is not supposed to be required for user-facing web pages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7722) DataNode#checkDiskError should also remove Storage when error is found.
[ https://issues.apache.org/jira/browse/HDFS-7722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351152#comment-14351152 ] Colin Patrick McCabe commented on HDFS-7722: Eddy and I had an offline discussion about the use of {{Set}} here. It seems that there is a pervasive assumption elsewhere in the code that FsVolumeSpi instances are directories. For example, in these interface methods: {code} /** @return the base path to the volume */ public String getBasePath(); /** @return the path to the volume */ public String getPath(String bpid) throws IOException; /** @return the directory for the finalized blocks in the block pool. */ public File getFinalizedDir(String bpid) throws IOException; {code} So I think using {{Set}} is OK here for now, since it fits in with the rest of the code. We will probably have to revisit this later, but it seems outside the scope of this jira. One thing I really like about this patch is the fact that we no longer hold the {{FsDatasetImpl}} mutex while scanning every volume. This alone is a very important improvement. I think it makes sense to leave the failure information around when removing volumes due to the disk checker. {code} 685 LOG.info("Deactivating volumes: " + 686 Joiner.on(",").join(absoluteVolumePaths)); {code} We should print out the value of {{clearFailure}} here. +1 once that's addressed. > DataNode#checkDiskError should also remove Storage when error is found. > --- > > Key: HDFS-7722 > URL: https://issues.apache.org/jira/browse/HDFS-7722 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7722.000.patch, HDFS-7722.001.patch, > HDFS-7722.002.patch > > > When {{DataNode#checkDiskError}} finds disk errors, it removes all block > metadata from {{FsDatasetImpl}}. However, it does not remove the > corresponding {{DataStorage}} and {{BlockPoolSliceStorage}}. > The result is that we could not directly run {{reconfig}} to hot swap the > failed disks without changing the configuration file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6488) Support HDFS superuser in NFSv3 gateway
[ https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351138#comment-14351138 ] Hudson commented on HDFS-6488: -- SUCCESS: Integrated in Hadoop-trunk-Commit #7277 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7277/]) HDFS-6488. Support HDFS superuser in NFSv3 gateway. Contributed by Brandon Li (brandonli: rev 0f8ecb1d0ce6d3ee9a7caf5b15b299210c2b8875) * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/conf/NfsConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsNfsGateway.md > Support HDFS superuser in NFSv3 gateway > --- > > Key: HDFS-6488 > URL: https://issues.apache.org/jira/browse/HDFS-6488 > Project: Hadoop HDFS > Issue Type: New Feature > Components: nfs >Affects Versions: 2.3.0 >Reporter: Stephen Chu >Assignee: Brandon Li > Fix For: 2.7.0 > > Attachments: HDFS-6488.001.patch, HDFS-6488.002.patch, > HDFS-6488.003.patch > > > As hdfs superuseruser on the NFS mount, I cannot cd or ls the > /user/schu/.Trash directory: > {code} > bash-4.1$ cd .Trash/ > bash: cd: .Trash/: Permission denied > bash-4.1$ ls -la > total 2 > drwxr-xr-x 4 schu 2584148964 128 Jan 7 10:42 . > drwxr-xr-x 4 hdfs 2584148964 128 Jan 6 16:59 .. > drwx-- 2 schu 2584148964 64 Jan 7 10:45 .Trash > drwxr-xr-x 2 hdfs hdfs64 Jan 7 10:42 tt > bash-4.1$ ls .Trash > ls: cannot open directory .Trash: Permission denied > bash-4.1$ > {code} > When using FsShell as hdfs superuser, I have superuser permissions to schu's > .Trash contents: > {code} > bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu > -rw-r--r-- 1 schu supergroup 4 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu/tf1 > {code} > The NFSv3 logs don't produce any error when superuser tries to access schu > Trash contents. However, for other permission errors (e.g. schu tries to > delete a directory owned by hdfs), there will be a permission error in the > logs. > I think this is not specific to the .Trash directory perhaps. > I created a /user/schu/dir1 which has the same permissions as .Trash (700). > When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, > I get the same permission denied. > {code} > [schu@hdfs-nfs ~]$ hdfs dfs -ls > Found 4 items > drwx-- - schu supergroup 0 2014-01-07 10:57 .Trash > drwx-- - schu supergroup 0 2014-01-07 11:05 dir1 > -rw-r--r-- 1 schu supergroup 4 2014-01-07 11:05 tf1 > drwxr-xr-x - hdfs hdfs0 2014-01-07 10:42 tt > bash-4.1$ whoami > hdfs > bash-4.1$ pwd > /hdfs_nfs_mount/user/schu > bash-4.1$ cd dir1 > bash: cd: dir1: Permission denied > bash-4.1$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351122#comment-14351122 ] Colin Patrick McCabe commented on HDFS-7844: bq. Several lines bust the 80 char limit. Thanks, I fixed a few cases. bq. What happens if someone runs this with the -d32 to the jvm? Do we need to make that check and throw accordingly? I don't think this would be a problem. We don't care about how big Java references are, since we're not using them. Even 32-bit machines should support {{Unsafe#getLong}}-- if necessary, through two 32-bit operations. bq. A small enhancement might be: close(boolean force) which will close unconditionally. I realize this is maybe a bit confusing, but we don't ever want to close with entries remaining in the hash table. The reason is because the caller is responsible for managing that memory. We wouldn't know what to do with it, so there would be a memory leak. Another thing to note is that in general, {{ProbingHashTable#close}} is really only a unit test thing. In real life, we would never actually close the BlocksMap in the NameNode... we'd just shut down the whole process when someone control-Cs. So we don't have to worry about how long close takes :) bq. The line in #getSlot which is hash = -hash is in fact tested by your unit tests, but I don't think it's tested by design in the test. You might want to put in an explicit test for that particular line. Believe me, without that line, the unit tests don't work. Ask me how I know. :) bq. expandTable: using catch(Throwable) feels like a rather wide net to cast, but I guess it's the right thing. I debated whether all you needed was catch (Error), but I guess you can't be sure that the callers above you won't just "keep going" after some RuntimeException gets into their hands. It's a little bit of paranoia on my part. Really, there should be no exceptions at all coming from that code, but given that this is Java, we can never actually guarantee that. Even methods that aren't declared to throw a particular exception can throw it, through the magic of classloaders. I wanted to be able to guarantee that there were no memory leaks, and this was the only way. And no, we can't rethrow the {{Throwable}} itself, because then Java complains that the function isn't declared to throw {{Throwable}}. So we wrap it in a {{RuntimeException}}. bq. The comment for #capacity() "total number of slots" is either misleading or wrong. Thanks. Let's just replace this with an accessor for {{numSlots}}. Now that the load factor is configurable, "capacity" is kind of a confusing term. bq. any reason not to have get/putShort along with the existing byte/int/long? Good idea, let's add that bq. Should #toString() be declared as... If you mean the toString function in the interfaces, everything in an interface is always public. And putting @Override there is not needed. The other places are already public and have @Override. bq. [MemoryManager] comments say nothing about whether it's thread safe or not. Ditto for ByteArrayMemoryManager. Let me add a comment to the base class JavaDoc. bq. There is no test coverage for the failure case of {{BAMM#close}} added bq. Why does curAddress start at 1000? It can start at any address other than 0. bq. For all of the put/get/byte/int/long routines, it wouldn't be hard to move all of the if() { throw new RuntimeException } snippets into their own routine. Maybe that's not worth the trouble, but it feels like there's a lot of repeated code.
I think this is not worth the trouble. Maybe later. bq. The indentation of #testMemoryManagerCreate formals is messed up. ok bq. testCatchInvalidPuts: you test putByte against freed memory, but not int or long. yeah let's test them all bq. the Assert.fail messages should be different for each fail() call. It's fine. There is a line number. bq. I tried running TestMemoryManager.testNativeMemoryManagerCreate and it failed like this: This was bugged after I added the "name" argument to the MemoryManager. Should be fixed now. > Create an off-heap hash table implementation > > > Key: HDFS-7844 > URL: https://issues.apache.org/jira/browse/HDFS-7844 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7836 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-7844-scl.001.patch, HDFS-7844-scl.002.patch, > HDFS-7844-scl.003.patch > > > Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
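The leak-guard pattern described in the expandTable discussion above is roughly the following; the {{Allocator}} interface here is a stand-in for the patch's MemoryManager, so this is a sketch of the idea rather than the actual code.
{code}
// Sketch of the "never leak the new allocation" guard around expandTable.
public class ExpandGuardSketch {
  interface Allocator {
    long allocate(long bytes);
    void free(long addr);
  }

  static long expand(Allocator allocator, long newSizeBytes, Runnable rehash) {
    long newTable = allocator.allocate(newSizeBytes);
    boolean success = false;
    try {
      rehash.run();            // copy existing slots into the new region
      success = true;
      return newTable;
    } catch (Throwable t) {
      // Wrap so callers see a single unchecked type; the essential part is
      // that no code path leaks newTable, since the GC never reclaims
      // off-heap memory.
      throw new RuntimeException("failed to expand hash table", t);
    } finally {
      if (!success) {
        allocator.free(newTable);
      }
    }
  }
}
{code}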
[jira] [Updated] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-7844: --- Attachment: HDFS-7844-scl.003.patch > Create an off-heap hash table implementation > > > Key: HDFS-7844 > URL: https://issues.apache.org/jira/browse/HDFS-7844 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7836 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-7844-scl.001.patch, HDFS-7844-scl.002.patch, > HDFS-7844-scl.003.patch > > > Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-6488) Support HDFS superuser in NFSv3 gateway
[ https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351115#comment-14351115 ] Brandon Li edited comment on HDFS-6488 at 3/6/15 11:28 PM: --- Thank you, Stephen, Colin, Akira and Jing. I've updated the title and committed the patch. was (Author: brandonli): Thank you, Stephen, Colin and Jing. I've updated the title and committed the patch. > Support HDFS superuser in NFSv3 gateway > --- > > Key: HDFS-6488 > URL: https://issues.apache.org/jira/browse/HDFS-6488 > Project: Hadoop HDFS > Issue Type: New Feature > Components: nfs >Affects Versions: 2.3.0 >Reporter: Stephen Chu >Assignee: Brandon Li > Fix For: 2.7.0 > > Attachments: HDFS-6488.001.patch, HDFS-6488.002.patch, > HDFS-6488.003.patch > > > As hdfs superuseruser on the NFS mount, I cannot cd or ls the > /user/schu/.Trash directory: > {code} > bash-4.1$ cd .Trash/ > bash: cd: .Trash/: Permission denied > bash-4.1$ ls -la > total 2 > drwxr-xr-x 4 schu 2584148964 128 Jan 7 10:42 . > drwxr-xr-x 4 hdfs 2584148964 128 Jan 6 16:59 .. > drwx-- 2 schu 2584148964 64 Jan 7 10:45 .Trash > drwxr-xr-x 2 hdfs hdfs64 Jan 7 10:42 tt > bash-4.1$ ls .Trash > ls: cannot open directory .Trash: Permission denied > bash-4.1$ > {code} > When using FsShell as hdfs superuser, I have superuser permissions to schu's > .Trash contents: > {code} > bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu > -rw-r--r-- 1 schu supergroup 4 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu/tf1 > {code} > The NFSv3 logs don't produce any error when superuser tries to access schu > Trash contents. However, for other permission errors (e.g. schu tries to > delete a directory owned by hdfs), there will be a permission error in the > logs. > I think this is not specific to the .Trash directory perhaps. > I created a /user/schu/dir1 which has the same permissions as .Trash (700). > When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, > I get the same permission denied. > {code} > [schu@hdfs-nfs ~]$ hdfs dfs -ls > Found 4 items > drwx-- - schu supergroup 0 2014-01-07 10:57 .Trash > drwx-- - schu supergroup 0 2014-01-07 11:05 dir1 > -rw-r--r-- 1 schu supergroup 4 2014-01-07 11:05 tf1 > drwxr-xr-x - hdfs hdfs0 2014-01-07 10:42 tt > bash-4.1$ whoami > hdfs > bash-4.1$ pwd > /hdfs_nfs_mount/user/schu > bash-4.1$ cd dir1 > bash: cd: dir1: Permission denied > bash-4.1$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7818) OffsetParam should return the default value instead of throwing NPE when the value is unspecified
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351118#comment-14351118 ] Eric Payne commented on HDFS-7818: -- Thank you [~wheat9] > OffsetParam should return the default value instead of throwing NPE when the > value is unspecified > - > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Fix For: 2.7.0 > > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt, HDFS-7818.v5.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6488) Support HDFS superuser in NFSv3 gateway
[ https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6488: - Fix Version/s: 2.7.0 > Support HDFS superuser in NFSv3 gateway > --- > > Key: HDFS-6488 > URL: https://issues.apache.org/jira/browse/HDFS-6488 > Project: Hadoop HDFS > Issue Type: New Feature > Components: nfs >Affects Versions: 2.3.0 >Reporter: Stephen Chu >Assignee: Brandon Li > Fix For: 2.7.0 > > Attachments: HDFS-6488.001.patch, HDFS-6488.002.patch, > HDFS-6488.003.patch > > > As hdfs superuseruser on the NFS mount, I cannot cd or ls the > /user/schu/.Trash directory: > {code} > bash-4.1$ cd .Trash/ > bash: cd: .Trash/: Permission denied > bash-4.1$ ls -la > total 2 > drwxr-xr-x 4 schu 2584148964 128 Jan 7 10:42 . > drwxr-xr-x 4 hdfs 2584148964 128 Jan 6 16:59 .. > drwx-- 2 schu 2584148964 64 Jan 7 10:45 .Trash > drwxr-xr-x 2 hdfs hdfs64 Jan 7 10:42 tt > bash-4.1$ ls .Trash > ls: cannot open directory .Trash: Permission denied > bash-4.1$ > {code} > When using FsShell as hdfs superuser, I have superuser permissions to schu's > .Trash contents: > {code} > bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu > -rw-r--r-- 1 schu supergroup 4 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu/tf1 > {code} > The NFSv3 logs don't produce any error when superuser tries to access schu > Trash contents. However, for other permission errors (e.g. schu tries to > delete a directory owned by hdfs), there will be a permission error in the > logs. > I think this is not specific to the .Trash directory perhaps. > I created a /user/schu/dir1 which has the same permissions as .Trash (700). > When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, > I get the same permission denied. > {code} > [schu@hdfs-nfs ~]$ hdfs dfs -ls > Found 4 items > drwx-- - schu supergroup 0 2014-01-07 10:57 .Trash > drwx-- - schu supergroup 0 2014-01-07 11:05 dir1 > -rw-r--r-- 1 schu supergroup 4 2014-01-07 11:05 tf1 > drwxr-xr-x - hdfs hdfs0 2014-01-07 10:42 tt > bash-4.1$ whoami > hdfs > bash-4.1$ pwd > /hdfs_nfs_mount/user/schu > bash-4.1$ cd dir1 > bash: cd: dir1: Permission denied > bash-4.1$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6488) Support HDFS superuser in NFSv3 gateway
[ https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6488: - Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) > Support HDFS superuser in NFSv3 gateway > --- > > Key: HDFS-6488 > URL: https://issues.apache.org/jira/browse/HDFS-6488 > Project: Hadoop HDFS > Issue Type: New Feature > Components: nfs >Affects Versions: 2.3.0 >Reporter: Stephen Chu >Assignee: Brandon Li > Attachments: HDFS-6488.001.patch, HDFS-6488.002.patch, > HDFS-6488.003.patch > > > As hdfs superuseruser on the NFS mount, I cannot cd or ls the > /user/schu/.Trash directory: > {code} > bash-4.1$ cd .Trash/ > bash: cd: .Trash/: Permission denied > bash-4.1$ ls -la > total 2 > drwxr-xr-x 4 schu 2584148964 128 Jan 7 10:42 . > drwxr-xr-x 4 hdfs 2584148964 128 Jan 6 16:59 .. > drwx-- 2 schu 2584148964 64 Jan 7 10:45 .Trash > drwxr-xr-x 2 hdfs hdfs64 Jan 7 10:42 tt > bash-4.1$ ls .Trash > ls: cannot open directory .Trash: Permission denied > bash-4.1$ > {code} > When using FsShell as hdfs superuser, I have superuser permissions to schu's > .Trash contents: > {code} > bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu > -rw-r--r-- 1 schu supergroup 4 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu/tf1 > {code} > The NFSv3 logs don't produce any error when superuser tries to access schu > Trash contents. However, for other permission errors (e.g. schu tries to > delete a directory owned by hdfs), there will be a permission error in the > logs. > I think this is not specific to the .Trash directory perhaps. > I created a /user/schu/dir1 which has the same permissions as .Trash (700). > When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, > I get the same permission denied. > {code} > [schu@hdfs-nfs ~]$ hdfs dfs -ls > Found 4 items > drwx-- - schu supergroup 0 2014-01-07 10:57 .Trash > drwx-- - schu supergroup 0 2014-01-07 11:05 dir1 > -rw-r--r-- 1 schu supergroup 4 2014-01-07 11:05 tf1 > drwxr-xr-x - hdfs hdfs0 2014-01-07 10:42 tt > bash-4.1$ whoami > hdfs > bash-4.1$ pwd > /hdfs_nfs_mount/user/schu > bash-4.1$ cd dir1 > bash: cd: dir1: Permission denied > bash-4.1$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6488) Support HDFS superuser in NFSv3 gateway
[ https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351115#comment-14351115 ] Brandon Li commented on HDFS-6488: -- Thank you, Stephen, Colin and Jing. I've updated the title and committed the patch. > Support HDFS superuser in NFSv3 gateway > --- > > Key: HDFS-6488 > URL: https://issues.apache.org/jira/browse/HDFS-6488 > Project: Hadoop HDFS > Issue Type: New Feature > Components: nfs >Affects Versions: 2.3.0 >Reporter: Stephen Chu >Assignee: Brandon Li > Attachments: HDFS-6488.001.patch, HDFS-6488.002.patch, > HDFS-6488.003.patch > > > As hdfs superuseruser on the NFS mount, I cannot cd or ls the > /user/schu/.Trash directory: > {code} > bash-4.1$ cd .Trash/ > bash: cd: .Trash/: Permission denied > bash-4.1$ ls -la > total 2 > drwxr-xr-x 4 schu 2584148964 128 Jan 7 10:42 . > drwxr-xr-x 4 hdfs 2584148964 128 Jan 6 16:59 .. > drwx-- 2 schu 2584148964 64 Jan 7 10:45 .Trash > drwxr-xr-x 2 hdfs hdfs64 Jan 7 10:42 tt > bash-4.1$ ls .Trash > ls: cannot open directory .Trash: Permission denied > bash-4.1$ > {code} > When using FsShell as hdfs superuser, I have superuser permissions to schu's > .Trash contents: > {code} > bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu > -rw-r--r-- 1 schu supergroup 4 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu/tf1 > {code} > The NFSv3 logs don't produce any error when superuser tries to access schu > Trash contents. However, for other permission errors (e.g. schu tries to > delete a directory owned by hdfs), there will be a permission error in the > logs. > I think this is not specific to the .Trash directory perhaps. > I created a /user/schu/dir1 which has the same permissions as .Trash (700). > When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, > I get the same permission denied. > {code} > [schu@hdfs-nfs ~]$ hdfs dfs -ls > Found 4 items > drwx-- - schu supergroup 0 2014-01-07 10:57 .Trash > drwx-- - schu supergroup 0 2014-01-07 11:05 dir1 > -rw-r--r-- 1 schu supergroup 4 2014-01-07 11:05 tf1 > drwxr-xr-x - hdfs hdfs0 2014-01-07 10:42 tt > bash-4.1$ whoami > hdfs > bash-4.1$ pwd > /hdfs_nfs_mount/user/schu > bash-4.1$ cd dir1 > bash: cd: dir1: Permission denied > bash-4.1$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6488) HDFS superuser unable to access user's Trash files using NFSv3 mount
[ https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6488: - Issue Type: New Feature (was: Bug) > HDFS superuser unable to access user's Trash files using NFSv3 mount > > > Key: HDFS-6488 > URL: https://issues.apache.org/jira/browse/HDFS-6488 > Project: Hadoop HDFS > Issue Type: New Feature > Components: nfs >Affects Versions: 2.3.0 >Reporter: Stephen Chu >Assignee: Brandon Li > Attachments: HDFS-6488.001.patch, HDFS-6488.002.patch, > HDFS-6488.003.patch > > > As hdfs superuseruser on the NFS mount, I cannot cd or ls the > /user/schu/.Trash directory: > {code} > bash-4.1$ cd .Trash/ > bash: cd: .Trash/: Permission denied > bash-4.1$ ls -la > total 2 > drwxr-xr-x 4 schu 2584148964 128 Jan 7 10:42 . > drwxr-xr-x 4 hdfs 2584148964 128 Jan 6 16:59 .. > drwx-- 2 schu 2584148964 64 Jan 7 10:45 .Trash > drwxr-xr-x 2 hdfs hdfs64 Jan 7 10:42 tt > bash-4.1$ ls .Trash > ls: cannot open directory .Trash: Permission denied > bash-4.1$ > {code} > When using FsShell as hdfs superuser, I have superuser permissions to schu's > .Trash contents: > {code} > bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu > -rw-r--r-- 1 schu supergroup 4 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu/tf1 > {code} > The NFSv3 logs don't produce any error when superuser tries to access schu > Trash contents. However, for other permission errors (e.g. schu tries to > delete a directory owned by hdfs), there will be a permission error in the > logs. > I think this is not specific to the .Trash directory perhaps. > I created a /user/schu/dir1 which has the same permissions as .Trash (700). > When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, > I get the same permission denied. > {code} > [schu@hdfs-nfs ~]$ hdfs dfs -ls > Found 4 items > drwx-- - schu supergroup 0 2014-01-07 10:57 .Trash > drwx-- - schu supergroup 0 2014-01-07 11:05 dir1 > -rw-r--r-- 1 schu supergroup 4 2014-01-07 11:05 tf1 > drwxr-xr-x - hdfs hdfs0 2014-01-07 10:42 tt > bash-4.1$ whoami > hdfs > bash-4.1$ pwd > /hdfs_nfs_mount/user/schu > bash-4.1$ cd dir1 > bash: cd: dir1: Permission denied > bash-4.1$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6488) Support HDFS superuser in NFSv3 gateway
[ https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6488: - Summary: Support HDFS superuser in NFSv3 gateway (was: HDFS superuser unable to access user's Trash files using NFSv3 mount) > Support HDFS superuser in NFSv3 gateway > --- > > Key: HDFS-6488 > URL: https://issues.apache.org/jira/browse/HDFS-6488 > Project: Hadoop HDFS > Issue Type: New Feature > Components: nfs >Affects Versions: 2.3.0 >Reporter: Stephen Chu >Assignee: Brandon Li > Attachments: HDFS-6488.001.patch, HDFS-6488.002.patch, > HDFS-6488.003.patch > > > As hdfs superuseruser on the NFS mount, I cannot cd or ls the > /user/schu/.Trash directory: > {code} > bash-4.1$ cd .Trash/ > bash: cd: .Trash/: Permission denied > bash-4.1$ ls -la > total 2 > drwxr-xr-x 4 schu 2584148964 128 Jan 7 10:42 . > drwxr-xr-x 4 hdfs 2584148964 128 Jan 6 16:59 .. > drwx-- 2 schu 2584148964 64 Jan 7 10:45 .Trash > drwxr-xr-x 2 hdfs hdfs64 Jan 7 10:42 tt > bash-4.1$ ls .Trash > ls: cannot open directory .Trash: Permission denied > bash-4.1$ > {code} > When using FsShell as hdfs superuser, I have superuser permissions to schu's > .Trash contents: > {code} > bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu > -rw-r--r-- 1 schu supergroup 4 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu/tf1 > {code} > The NFSv3 logs don't produce any error when superuser tries to access schu > Trash contents. However, for other permission errors (e.g. schu tries to > delete a directory owned by hdfs), there will be a permission error in the > logs. > I think this is not specific to the .Trash directory perhaps. > I created a /user/schu/dir1 which has the same permissions as .Trash (700). > When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, > I get the same permission denied. > {code} > [schu@hdfs-nfs ~]$ hdfs dfs -ls > Found 4 items > drwx-- - schu supergroup 0 2014-01-07 10:57 .Trash > drwx-- - schu supergroup 0 2014-01-07 11:05 dir1 > -rw-r--r-- 1 schu supergroup 4 2014-01-07 11:05 tf1 > drwxr-xr-x - hdfs hdfs0 2014-01-07 10:42 tt > bash-4.1$ whoami > hdfs > bash-4.1$ pwd > /hdfs_nfs_mount/user/schu > bash-4.1$ cd dir1 > bash: cd: dir1: Permission denied > bash-4.1$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6488) HDFS superuser unable to access user's Trash files using NFSv3 mount
[ https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351099#comment-14351099 ] Jing Zhao commented on HDFS-6488: - I think it should be ok to use a configuration prop to specify the nfs super user. The latest patch looks good to me. +1. > HDFS superuser unable to access user's Trash files using NFSv3 mount > > > Key: HDFS-6488 > URL: https://issues.apache.org/jira/browse/HDFS-6488 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Affects Versions: 2.3.0 >Reporter: Stephen Chu >Assignee: Brandon Li > Attachments: HDFS-6488.001.patch, HDFS-6488.002.patch, > HDFS-6488.003.patch > > > As hdfs superuseruser on the NFS mount, I cannot cd or ls the > /user/schu/.Trash directory: > {code} > bash-4.1$ cd .Trash/ > bash: cd: .Trash/: Permission denied > bash-4.1$ ls -la > total 2 > drwxr-xr-x 4 schu 2584148964 128 Jan 7 10:42 . > drwxr-xr-x 4 hdfs 2584148964 128 Jan 6 16:59 .. > drwx-- 2 schu 2584148964 64 Jan 7 10:45 .Trash > drwxr-xr-x 2 hdfs hdfs64 Jan 7 10:42 tt > bash-4.1$ ls .Trash > ls: cannot open directory .Trash: Permission denied > bash-4.1$ > {code} > When using FsShell as hdfs superuser, I have superuser permissions to schu's > .Trash contents: > {code} > bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user > drwx-- - schu supergroup 0 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu > -rw-r--r-- 1 schu supergroup 4 2014-01-07 10:48 > /user/schu/.Trash/Current/user/schu/tf1 > {code} > The NFSv3 logs don't produce any error when superuser tries to access schu > Trash contents. However, for other permission errors (e.g. schu tries to > delete a directory owned by hdfs), there will be a permission error in the > logs. > I think this is not specific to the .Trash directory perhaps. > I created a /user/schu/dir1 which has the same permissions as .Trash (700). > When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, > I get the same permission denied. > {code} > [schu@hdfs-nfs ~]$ hdfs dfs -ls > Found 4 items > drwx-- - schu supergroup 0 2014-01-07 10:57 .Trash > drwx-- - schu supergroup 0 2014-01-07 11:05 dir1 > -rw-r--r-- 1 schu supergroup 4 2014-01-07 11:05 tf1 > drwxr-xr-x - hdfs hdfs0 2014-01-07 10:42 tt > bash-4.1$ whoami > hdfs > bash-4.1$ pwd > /hdfs_nfs_mount/user/schu > bash-4.1$ cd dir1 > bash: cd: dir1: Permission denied > bash-4.1$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7893) Update the POM to create a separate hdfs-client jar
[ https://issues.apache.org/jira/browse/HDFS-7893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351094#comment-14351094 ] Hadoop QA commented on HDFS-7893: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703160/HDFS-7893.001.patch against trunk revision 27e8ea8. {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9786//console This message is automatically generated. > Update the POM to create a separate hdfs-client jar > --- > > Key: HDFS-7893 > URL: https://issues.apache.org/jira/browse/HDFS-7893 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-7893.000.patch, HDFS-7893.001.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7893) Update the POM to create a separate hdfs-client jar
[ https://issues.apache.org/jira/browse/HDFS-7893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-7893: - Attachment: HDFS-7893.001.patch > Update the POM to create a separate hdfs-client jar > --- > > Key: HDFS-7893 > URL: https://issues.apache.org/jira/browse/HDFS-7893 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-7893.000.patch, HDFS-7893.001.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351073#comment-14351073 ] Charles Lamb commented on HDFS-7844: [~cmccabe], This is a nice piece of work! Here are some comments: General: Several lines bust the 80 char limit. Many unused imports throughout. I guess Yi got this already. What happens if someone runs this with the -d32 to the jvm? Do we need to make that check and throw accordingly? ProbingHashSet.java: A small enhancement might be: {code}close(boolean force){code} which will close unconditionally. The line in #getSlot which is {code}hash = -hash{code} is in fact tested by your unit tests, but I don't think it's tested by design in the test. You might want to put in an explicit test for that particular line. #expandTable: using {code}catch(Throwable){code} feels like a rather wide net to cast, but I guess it's the right thing. I debated whether all you needed was catch (Error), but I guess you can't be sure that the callers above you won't just "keep going" after some RuntimeException gets into their hands. The comment for #capacity() "total number of slots" is either misleading or wrong. MemoryManager.java any reason not to have get/putShort along with the existing byte/int/long? Should #toString() be declared as {code}@Override public String toString(){code} NativeMemoryManager.java The comments say nothing about whether it's thread safe or not. Ditto for ByteArrayMemoryManager. ByteArrayMemoryManager There is no test coverage for the failure case of {code}BAMM.close(){code} s/valiation/validation/ (Yi caught this) Why does curAddress start at 1000? s/2^^31/2^31/ For all of the put/get/byte/int/long routines, it wouldn't be hard to move all of the {code}if() { throw new RuntimeException }{code} snippits into their own routine. Maybe that's not worth the trouble, but if feels like there's a lot of repeated code. TestMemoryManager.java The indentation of #testMemoryManagerCreate formals is messed up. #testCatchInvalidPuts: you test putByte against freed memory, but not int or long. the Assert.fail messages should be different for each fail() call. The exception checks in getByte/Int/Long are not tested. None of the entry==null exceptions are tested in putByte/Long/Int I tried running TestMemoryManager.testNativeMemoryManagerCreate and it failed like this: {code} 2015-03-06 17:10:22,430 ERROR offheap.MemoryManager$Factory (MemoryManager.java:create(91)) - Unable to create org.apache.hadoop.util.offheap.NativeMemoryManager. 
Falling back on org.apache.hadoop.util.offheap.ByteArrayMemoryManager java.lang.IllegalArgumentException: wrong number of arguments at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.util.offheap.MemoryManager$Factory.create(MemoryManager.java:89) at org.apache.hadoop.util.offheap.TestMemoryManager.testMemoryManagerCreate(TestMemoryManager.java:135) at org.apache.hadoop.util.offheap.TestMemoryManager.testNativeMemoryManagerCreate(TestMemoryManager.java:151) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) org.junit.ComparisonFailure: Expected :org.apache.hadoop.util.offheap.NativeMemoryManager Actual :org.apache.hadoop.util.offheap.ByteArrayMemoryManager at org.junit.Assert.assertEquals(Assert.java:115) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.util.offheap.TestMemoryManager.testMemoryManagerCreate(TestMemoryManager.java:137) at org.apache.hadoop.util.offheap.TestMemoryManager.testNativeMemoryManagerCreate(TestMemoryManager.java:151) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.De
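One of the smaller suggestions in the review above is to consolidate the repeated validation in the put/get byte/int/long methods into a single routine; a hypothetical sketch of that refactor (the field names base and size are illustrative, not ByteArrayMemoryManager's actual internals):
{code}
// Hypothetical sketch of the shared-validation idea from the review above.
class AddressRangeCheck {
  private final long base;
  private final long size;

  AddressRangeCheck(long base, long size) {
    this.base = base;
    this.size = size;
  }

  /** One place for the checks repeated across put/get byte, int and long. */
  void checkAccess(long addr, int len) {
    if (addr < base || addr + len > base + size) {
      throw new RuntimeException("invalid " + len + "-byte access at 0x"
          + Long.toHexString(addr));
    }
  }
}
{code}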
[jira] [Commented] (HDFS-7857) Incomplete information in WARN message caused user confusion
[ https://issues.apache.org/jira/browse/HDFS-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351061#comment-14351061 ] Jing Zhao commented on HDFS-7857: - +1. Thanks for the improvement, [~yzhangal]! > Incomplete information in WARN message caused user confusion > > > Key: HDFS-7857 > URL: https://issues.apache.org/jira/browse/HDFS-7857 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Labels: supportability > Attachments: HDFS-7857.001.patch > > > Lots of the following messages appeared in NN log: > {quote} > 2014-12-10 12:18:15,728 WARN SecurityLogger.org.apache.hadoop.ipc.Server: > Auth failed for :39838:null (DIGEST-MD5: IO error acquiring > password) > 2014-12-10 12:18:15,728 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > .. > SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for > :39843:null (DIGEST-MD5: IO error acquiring password) > 2014-12-10 12:18:15,790 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > {quote} > The real reason of failure is the second message about StandbyException, > However, the first message is confusing because it talks about "DIGEST-MD5: > IO error acquiring password". > Filing this jira to modify the first message to have more comprehensive > information that can be obtained from {{getCauseForInvalidToken(e)}}. > {code} >try { > saslResponse = processSaslMessage(saslMessage); > } catch (IOException e) { > rpcMetrics.incrAuthenticationFailures(); > // attempting user could be null > AUDITLOG.warn(AUTH_FAILED_FOR + this.toString() + ":" > + attemptingUser + " (" + e.getLocalizedMessage() + ")"); > throw (IOException) getCauseForInvalidToken(e); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
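For reference, one possible shape of that change, building directly on the snippet quoted in the description (illustrative only, not necessarily what the attached patch does):
{code}
// Illustrative only: surface the underlying cause (e.g. a StandbyException) in the
// audit log instead of only the DIGEST-MD5 wrapper message.
try {
  saslResponse = processSaslMessage(saslMessage);
} catch (IOException e) {
  rpcMetrics.incrAuthenticationFailures();
  IOException tokenCause = (IOException) getCauseForInvalidToken(e);
  // attempting user could be null
  AUDITLOG.warn(AUTH_FAILED_FOR + this.toString() + ":" + attemptingUser
      + " (" + e.getLocalizedMessage() + ") with true cause: ("
      + tokenCause.getLocalizedMessage() + ")");
  throw tokenCause;
}
{code}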
[jira] [Commented] (HDFS-7758) Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead
[ https://issues.apache.org/jira/browse/HDFS-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351058#comment-14351058 ] Colin Patrick McCabe commented on HDFS-7758: {code} /** * Returns a list of volume references. * * The caller must release the reference of each volume by calling * {@link FsVolumeReference#close}. */ public List getVolumeRefs(); /** Returns a reference of a given volume, specified by the index. */ public FsVolumeReference getVolumeRef(int idx) throws IOException; {code} This is still the wrong interface. {{getVolumeRef(int)}} encourages people to assume that the number of volumes is never going to change. What happens if it does? Instead of doing this, let's have an {{Iterator}} that we can use. Something like this: {code} public Iterator getVolumeRefIterator(); private static class FsVolumeRefIterator implements Iterator, Closeable { private final List list; private int idx = 0; FsVolumeRefIterator(List spiList) { this.list = new ArrayList(); for (FsVolumeSpi volume : spiList) { try { this.list.add(volume.obtainReference()); } catch (ClosedChannelException e) { LOG.info("Can't obtain a reference to {} because it is closed.", volume.getBasePath()); } } } @Override public boolean hasNext() { return (idx < list.size()); } @Override public FsVolumeRef next() { int i = idx++; return list.get(i); } @Override public void remove() { throw UnsupportedOperationException(); } @Override public void close() throws IOException { for (FsVolumeRef ref : list) { ref.close(); } list.clear(); } } {code} Then we can get rid of {{getVolumeRefs}} and {{getVolumeRef}}. Since the {{Iterator}} implements {{java.io.Closeable}}, findbugs will remind us that we need to close it (and free the refs) in any function we use it in. > Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead > - > > Key: HDFS-7758 > URL: https://issues.apache.org/jira/browse/HDFS-7758 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.6.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7758.000.patch, HDFS-7758.001.patch > > > HDFS-7496 introduced reference-counting the volume instances being used to > prevent race condition when hot swapping a volume. > However, {{FsDatasetSpi#getVolumes()}} can still leak the volume instance > without increasing its reference count. In this JIRA, we retire the > {{FsDatasetSpi#getVolumes()}} and propose {{FsDatasetSpi#getVolumeRefs()}} > and etc. method to access {{FsVolume}}. Thus it makes sure that the consumer > of {{FsVolume}} always has correct reference count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
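To make the proposal above concrete, here is a self-contained, typed rendering of the closeable-iterator sketch plus a caller-side usage example; all names and generics are illustrative, and the final FsDatasetSpi API may differ:
{code}
// Sketch only; VolumeRef stands in for FsVolumeReference.
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

interface VolumeRef extends Closeable {
  String basePath();
}

class ClosingVolumeRefIterator implements Iterator<VolumeRef>, Closeable {
  private final List<VolumeRef> refs;
  private int idx = 0;

  ClosingVolumeRefIterator(List<VolumeRef> acquiredRefs) {
    this.refs = new ArrayList<>(acquiredRefs);   // references already obtained
  }

  @Override
  public boolean hasNext() {
    return idx < refs.size();
  }

  @Override
  public VolumeRef next() {
    if (idx >= refs.size()) {
      throw new NoSuchElementException();
    }
    return refs.get(idx++);
  }

  @Override
  public void remove() {
    throw new UnsupportedOperationException();
  }

  @Override
  public void close() throws IOException {
    for (VolumeRef ref : refs) {                 // release every reference obtained
      ref.close();
    }
    refs.clear();
  }
}

// Caller side: try-with-resources guarantees the references are released even on an
// early return or exception, which is what findbugs can then verify.
//
//   try (ClosingVolumeRefIterator it = dataset.getVolumeRefIterator()) {
//     while (it.hasNext()) {
//       VolumeRef ref = it.next();
//       // ... use the volume behind ref ...
//     }
//   }
{code}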
[jira] [Commented] (HDFS-7857) Incomplete information in WARN message caused user confusion
[ https://issues.apache.org/jira/browse/HDFS-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351047#comment-14351047 ] Hadoop QA commented on HDFS-7857: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703135/HDFS-7857.001.patch against trunk revision d1abc5d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9784//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9784//console This message is automatically generated. > Incomplete information in WARN message caused user confusion > > > Key: HDFS-7857 > URL: https://issues.apache.org/jira/browse/HDFS-7857 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Labels: supportability > Attachments: HDFS-7857.001.patch > > > Lots of the following messages appeared in NN log: > {quote} > 2014-12-10 12:18:15,728 WARN SecurityLogger.org.apache.hadoop.ipc.Server: > Auth failed for :39838:null (DIGEST-MD5: IO error acquiring > password) > 2014-12-10 12:18:15,728 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > .. > SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for > :39843:null (DIGEST-MD5: IO error acquiring password) > 2014-12-10 12:18:15,790 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > {quote} > The real reason of failure is the second message about StandbyException, > However, the first message is confusing because it talks about "DIGEST-MD5: > IO error acquiring password". > Filing this jira to modify the first message to have more comprehensive > information that can be obtained from {{getCauseForInvalidToken(e)}}. > {code} >try { > saslResponse = processSaslMessage(saslMessage); > } catch (IOException e) { > rpcMetrics.incrAuthenticationFailures(); > // attempting user could be null > AUDITLOG.warn(AUTH_FAILED_FOR + this.toString() + ":" > + attemptingUser + " (" + e.getLocalizedMessage() + ")"); > throw (IOException) getCauseForInvalidToken(e); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7818) OffsetParam should return the default value instead of throwing NPE when the value is unspecified
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351045#comment-14351045 ] Hudson commented on HDFS-7818: -- FAILURE: Integrated in Hadoop-trunk-Commit #7275 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7275/]) HDFS-7818. OffsetParam should return the default value instead of throwing NPE when the value is unspecified. Contributed by Eric Payne. (wheat9: rev c79710302ee51e1a9ee17dadb161c69bb3aba5c9) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/TestParameterParser.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/OffsetParam.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/ParameterParser.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > OffsetParam should return the default value instead of throwing NPE when the > value is unspecified > - > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Fix For: 2.7.0 > > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt, HDFS-7818.v5.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7818) OffsetParam should return the default value instead of throwing NPE when the value is unspecified
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-7818: - Resolution: Fixed Fix Version/s: 2.7.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed the patch to trunk and branch-2. Thanks [~eepayne] for reporting and fixing the issue. > OffsetParam should return the default value instead of throwing NPE when the > value is unspecified > - > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Fix For: 2.7.0 > > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt, HDFS-7818.v5.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7818) OffsetParam should return the default value instead of throwing NPE when the value is unspecified
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-7818: - Summary: OffsetParam should return the default value instead of throwing NPE when the value is unspecified (was: DataNode throws NPE if the WebHdfs URL does not contain the offset parameter) > OffsetParam should return the default value instead of throwing NPE when the > value is unspecified > - > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt, HDFS-7818.v5.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7818) DataNode throws NPE if the WebHdfs URL does not contain the offset parameter
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351024#comment-14351024 ] Haohui Mai commented on HDFS-7818: -- +1. I'm committing this. > DataNode throws NPE if the WebHdfs URL does not contain the offset parameter > > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt, HDFS-7818.v5.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7853) Erasure coding: extend LocatedBlocks to support reading from striped files
[ https://issues.apache.org/jira/browse/HDFS-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351020#comment-14351020 ] Zhe Zhang commented on HDFS-7853: - Thanks for the fix Jing! The PoC test now works stably. > Erasure coding: extend LocatedBlocks to support reading from striped files > -- > > Key: HDFS-7853 > URL: https://issues.apache.org/jira/browse/HDFS-7853 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Zhe Zhang >Assignee: Jing Zhao > Attachments: HDFS-7853.000.patch, HDFS-7853.001.patch > > > We should extend {{LocatedBlocks}} class so {{getBlockLocations}} can work > with striping layout (possibly an extra list specifying the index of each > location in the group) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-7285: Attachment: HDFS-7285-initial-PoC.patch This is the patch from trunk that was used in the PoC test. It demonstrates the changes we have made to support basic I/O in striping layout. > Erasure Coding Support inside HDFS > -- > > Key: HDFS-7285 > URL: https://issues.apache.org/jira/browse/HDFS-7285 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Weihua Jiang >Assignee: Zhe Zhang > Attachments: ECAnalyzer.py, ECParser.py, HDFS-7285-initial-PoC.patch, > HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, > HDFSErasureCodingDesign-20150204.pdf, HDFSErasureCodingDesign-20150206.pdf, > fsimage-analysis-20150105.pdf > > > Erasure Coding (EC) can greatly reduce the storage overhead without sacrificing > data reliability, compared to the existing HDFS 3-replica approach. For > example, if we use a 10+4 Reed-Solomon coding, we can allow loss of 4 blocks, > with a storage overhead of only 40%. This makes EC a quite attractive > alternative for big data storage, particularly for cold data. > Facebook had a related open source project called HDFS-RAID. It used to be > one of the contrib packages in HDFS but has been removed since Hadoop 2.0 > for maintenance reasons. The drawbacks are: 1) it is on top of HDFS and depends > on MapReduce to do encoding and decoding tasks; 2) it can only be used for > cold files that are not intended to be appended anymore; 3) the pure Java EC > coding implementation is extremely slow in practical use. Due to these, it > might not be a good idea to just bring HDFS-RAID back. > We (Intel and Cloudera) are working on a design to build EC into HDFS that > gets rid of any external dependencies, makes it self-contained and > independently maintained. This design lays the EC feature on top of the storage type > support and keeps it compatible with existing HDFS features like caching, > snapshots, encryption, and high availability. This design will also > support different EC coding schemes, implementations and policies for > different deployment scenarios. By utilizing advanced libraries (e.g. the Intel > ISA-L library), an implementation can greatly improve the performance of EC > encoding/decoding and make the EC solution even more attractive. We will > post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7818) DataNode throws NPE if the WebHdfs URL does not contain the offset parameter
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351008#comment-14351008 ] Hadoop QA commented on HDFS-7818: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703065/HDFS-7818.v5.txt against trunk revision 95bfd08. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestFileTruncate The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9781//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9781//console This message is automatically generated. > DataNode throws NPE if the WebHdfs URL does not contain the offset parameter > > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt, HDFS-7818.v5.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350995#comment-14350995 ] Haohui Mai commented on HDFS-6200: -- Here is the list of dependency when I run {{mvn dependency:tree}} in {{hadoop-hdfs}}: {noformat} $ mvn dependency:tree|grep -v ":test" ... [INFO] --- maven-dependency-plugin:2.2:tree (default-cli) @ hadoop-hdfs --- [INFO] org.apache.hadoop:hadoop-hdfs:jar:3.0.0-SNAPSHOT [INFO] +- org.apache.hadoop:hadoop-annotations:jar:3.0.0-SNAPSHOT:provided [INFO] | \- jdk.tools:jdk.tools:jar:1.8:system [INFO] +- org.apache.hadoop:hadoop-auth:jar:3.0.0-SNAPSHOT:provided [INFO] | +- org.slf4j:slf4j-api:jar:1.7.10:provided [INFO] | +- org.apache.httpcomponents:httpclient:jar:4.2.5:provided [INFO] | | \- org.apache.httpcomponents:httpcore:jar:4.2.5:provided (version managed from 4.2.4) [INFO] | +- org.apache.directory.server:apacheds-kerberos-codec:jar:2.0.0-M15:provided [INFO] | | +- org.apache.directory.server:apacheds-i18n:jar:2.0.0-M15:provided [INFO] | | +- org.apache.directory.api:api-asn1-api:jar:1.0.0-M20:provided [INFO] | | \- org.apache.directory.api:api-util:jar:1.0.0-M20:provided [INFO] | +- org.apache.zookeeper:zookeeper:jar:3.4.6:provided [INFO] | \- org.apache.curator:curator-framework:jar:2.7.1:provided [INFO] +- org.apache.hadoop:hadoop-common:jar:3.0.0-SNAPSHOT:provided [INFO] | +- org.apache.commons:commons-math3:jar:3.1.1:provided [INFO] | +- commons-httpclient:commons-httpclient:jar:3.1:provided [INFO] | +- commons-net:commons-net:jar:3.1:provided [INFO] | +- commons-collections:commons-collections:jar:3.2.1:provided [INFO] | +- javax.servlet.jsp:jsp-api:jar:2.1:provided [INFO] | +- com.sun.jersey:jersey-json:jar:1.9:provided [INFO] | | +- org.codehaus.jettison:jettison:jar:1.1:provided [INFO] | | +- com.sun.xml.bind:jaxb-impl:jar:2.2.3-1:provided [INFO] | | | \- javax.xml.bind:jaxb-api:jar:2.2.2:provided [INFO] | | | +- javax.xml.stream:stax-api:jar:1.0-2:provided [INFO] | | | \- javax.activation:activation:jar:1.1:provided [INFO] | | +- org.codehaus.jackson:jackson-jaxrs:jar:1.9.13:provided (version managed from 1.8.3) [INFO] | | \- org.codehaus.jackson:jackson-xc:jar:1.9.13:provided (version managed from 1.8.3) [INFO] | +- net.java.dev.jets3t:jets3t:jar:0.9.0:provided [INFO] | | \- com.jamesmurty.utils:java-xmlbuilder:jar:0.4:provided [INFO] | +- commons-configuration:commons-configuration:jar:1.6:provided [INFO] | | +- commons-digester:commons-digester:jar:1.8:provided [INFO] | | | \- commons-beanutils:commons-beanutils:jar:1.7.0:provided [INFO] | | \- commons-beanutils:commons-beanutils-core:jar:1.8.0:provided [INFO] | +- org.apache.avro:avro:jar:1.7.4:provided [INFO] | | +- com.thoughtworks.paranamer:paranamer:jar:2.3:provided [INFO] | | \- org.xerial.snappy:snappy-java:jar:1.0.4.1:provided [INFO] | +- com.google.code.gson:gson:jar:2.2.4:provided [INFO] | +- com.jcraft:jsch:jar:0.1.42:provided [INFO] | +- org.apache.curator:curator-client:jar:2.7.1:provided [INFO] | +- org.apache.curator:curator-recipes:jar:2.7.1:provided [INFO] | \- org.apache.commons:commons-compress:jar:1.4.1:provided [INFO] | \- org.tukaani:xz:jar:1.0:provided [INFO] +- com.google.guava:guava:jar:11.0.2:compile [INFO] | \- com.google.code.findbugs:jsr305:jar:3.0.0:compile [INFO] +- org.mortbay.jetty:jetty:jar:6.1.26:compile [INFO] +- org.mortbay.jetty:jetty-util:jar:6.1.26:compile [INFO] +- com.sun.jersey:jersey-core:jar:1.9:compile [INFO] +- com.sun.jersey:jersey-server:jar:1.9:compile [INFO] | \- 
asm:asm:jar:3.2:compile (version managed from 3.1) [INFO] +- commons-cli:commons-cli:jar:1.2:compile [INFO] +- commons-codec:commons-codec:jar:1.4:compile [INFO] +- commons-io:commons-io:jar:2.4:compile [INFO] +- commons-lang:commons-lang:jar:2.6:compile [INFO] +- commons-logging:commons-logging:jar:1.1.3:compile [INFO] +- commons-daemon:commons-daemon:jar:1.0.13:compile [INFO] +- log4j:log4j:jar:1.2.17:compile [INFO] +- com.google.protobuf:protobuf-java:jar:2.5.0:compile [INFO] +- javax.servlet:servlet-api:jar:2.5:compile [INFO] +- org.slf4j:slf4j-log4j12:jar:1.7.10:provided [INFO] +- org.codehaus.jackson:jackson-core-asl:jar:1.9.13:compile [INFO] +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:compile [INFO] +- xmlenc:xmlenc:jar:0.52:compile [INFO] +- io.netty:netty-all:jar:4.0.23.Final:compile [INFO] +- xerces:xercesImpl:jar:2.9.1:compile [INFO] | \- xml-apis:xml-apis:jar:1.3.04:compile [INFO] +- org.apache.htrace:htrace-core:jar:3.1.0-incubating:compile [INFO] +- org.fusesource.leveldbjni:leveldbjni-all:jar:1.8:compile {noformat} As I mentioned earlier I plan to keep the dependency of {{hadoop-common}} / {{hadoop-auth}} for the first phase, which would allow us to get rid of the following dependency in the client jar: {noformat} [INFO] +- com.google.guava:guava:jar:11.0.2:compile [INFO] | \- com.googl
[jira] [Updated] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-7844: --- Attachment: HDFS-7844-scl.002.patch > Create an off-heap hash table implementation > > > Key: HDFS-7844 > URL: https://issues.apache.org/jira/browse/HDFS-7844 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7836 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-7844-scl.001.patch, HDFS-7844-scl.002.patch > > > Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6695) Investigate using Java 7's nonblocking file I/O in BlockReaderLocal to implement read timeouts
[ https://issues.apache.org/jira/browse/HDFS-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350939#comment-14350939 ] Colin Patrick McCabe commented on HDFS-6695: And sending an {{Interrupt}} to a thread that is reading using blocking I/O is "faked" by closing the FD. > Investigate using Java 7's nonblocking file I/O in BlockReaderLocal to > implement read timeouts > -- > > Key: HDFS-6695 > URL: https://issues.apache.org/jira/browse/HDFS-6695 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Colin Patrick McCabe > > In BlockReaderLocal, the "read" system call could block for a long time if > the disk drive is having problems, or there is a huge amount of I/O > contention. This might cause poor latency performance. > In the remote block readers, we have implemented a read timeout, but we don't > have one for the local block reader, since {{FileChannel#read}} doesn't > support this. > Once we move to JDK 7, we should investigate the {{java.nio.file}} > nonblocking file I/O package to see if it could be used to implement read > timeouts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6658) Namenode memory optimization - Block replicas list
[ https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350936#comment-14350936 ] Colin Patrick McCabe commented on HDFS-6658: [~clamb] and I have been discussing how to do block reports without backreferences. If you have a 64-bit epoch number per datanode, you can bump that on each FBR. Then, you can simply ignore block entries that are too old when you are accessing them. In that case, you don't need to remove all stale blocks during an FBR. The downside of this approach is that the memory for the old entries will linger for a while longer than it would have otherwise. But if the memory consumption per entry is lower, it's probably still a win. It's pretty rare for a large number of blocks to go away without being mentioned in incremental block reports (IBRs). In the case where all IBRs are being received normally, of course, you have no additional memory overhead at all since you delete entries as soon as you get the incremental block removal notification. And of course with the epoch-based approach, you avoid updating the three linked list entries each time you touch a block in the FBR. This should give much better cache locality (the linked list has basically no cache locality at all... we're hammering main memory pretty much all the time right now). This would probably be coupled with some kind of background scanner thread that removed stale blockinfo instances from the hash table. > Namenode memory optimization - Block replicas list > --- > > Key: HDFS-6658 > URL: https://issues.apache.org/jira/browse/HDFS-6658 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.4.1 >Reporter: Amir Langer >Assignee: Daryn Sharp > Attachments: BlockListOptimizationComparison.xlsx, BlocksMap > redesign.pdf, HDFS-6658.patch, Namenode Memory Optimizations - Block replicas > list.docx > > > Part of the memory consumed by every BlockInfo object in the Namenode is a > linked list of block references for every DatanodeStorageInfo (called > "triplets"). > We propose to change the way we store the list in memory. > Using primitive integer indexes instead of object references will reduce the > memory needed for every block replica (when compressed oops is disabled) and > in our new design the list overhead will be per DatanodeStorageInfo and not > per block replica. > see attached design doc. for details and evaluation results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
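A minimal sketch of the epoch idea under discussion, with illustrative names rather than actual BlockManager APIs (it also glosses over reports that are still in flight):
{code}
// Illustrative sketch of per-storage report epochs: the epoch advances when a full
// block report (FBR) finishes, each replica records the epoch at which it was last
// mentioned, and anything older than the last completed FBR is treated as stale
// instead of being unlinked eagerly during the report.
import java.util.HashMap;
import java.util.Map;

class StorageReportEpochSketch {
  private long completedReportEpoch;                              // last finished FBR
  private final Map<Long, Long> lastSeenEpoch = new HashMap<>();  // blockId -> epoch

  /** Called for every replica mentioned by the in-flight FBR or by an IBR. */
  void markReported(long blockId) {
    lastSeenEpoch.put(blockId, completedReportEpoch + 1);
  }

  /** Called once the FBR has been fully processed. */
  void finishFullBlockReport() {
    completedReportEpoch++;
  }

  /** Readers skip entries that the last completed FBR (or a later IBR) did not mention. */
  boolean isStale(long blockId) {
    Long epoch = lastSeenEpoch.get(blockId);
    return epoch == null || epoch < completedReportEpoch;
  }

  /** A background scanner can reclaim the stale entries lazily. */
  void pruneStale() {
    lastSeenEpoch.values().removeIf(e -> e < completedReportEpoch);
  }
}
{code}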
[jira] [Commented] (HDFS-7857) Incomplete information in WARN message caused user confusion
[ https://issues.apache.org/jira/browse/HDFS-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350928#comment-14350928 ] Yongjun Zhang commented on HDFS-7857: - Hi [~jingzhao], I submitted patch 001, would you please help taking a look when convenient? thanks a lot. > Incomplete information in WARN message caused user confusion > > > Key: HDFS-7857 > URL: https://issues.apache.org/jira/browse/HDFS-7857 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Labels: supportability > Attachments: HDFS-7857.001.patch > > > Lots of the following messages appeared in NN log: > {quote} > 2014-12-10 12:18:15,728 WARN SecurityLogger.org.apache.hadoop.ipc.Server: > Auth failed for :39838:null (DIGEST-MD5: IO error acquiring > password) > 2014-12-10 12:18:15,728 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > .. > SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for > :39843:null (DIGEST-MD5: IO error acquiring password) > 2014-12-10 12:18:15,790 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > {quote} > The real reason of failure is the second message about StandbyException, > However, the first message is confusing because it talks about "DIGEST-MD5: > IO error acquiring password". > Filing this jira to modify the first message to have more comprehensive > information that can be obtained from {{getCauseForInvalidToken(e)}}. > {code} >try { > saslResponse = processSaslMessage(saslMessage); > } catch (IOException e) { > rpcMetrics.incrAuthenticationFailures(); > // attempting user could be null > AUDITLOG.warn(AUTH_FAILED_FOR + this.toString() + ":" > + attemptingUser + " (" + e.getLocalizedMessage() + ")"); > throw (IOException) getCauseForInvalidToken(e); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7857) Incomplete information in WARN message caused user confusion
[ https://issues.apache.org/jira/browse/HDFS-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-7857: Status: Patch Available (was: Open) > Incomplete information in WARN message caused user confusion > > > Key: HDFS-7857 > URL: https://issues.apache.org/jira/browse/HDFS-7857 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Labels: supportability > Attachments: HDFS-7857.001.patch > > > Lots of the following messages appeared in NN log: > {quote} > 2014-12-10 12:18:15,728 WARN SecurityLogger.org.apache.hadoop.ipc.Server: > Auth failed for :39838:null (DIGEST-MD5: IO error acquiring > password) > 2014-12-10 12:18:15,728 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > .. > SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for > :39843:null (DIGEST-MD5: IO error acquiring password) > 2014-12-10 12:18:15,790 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > {quote} > The real reason of failure is the second message about StandbyException, > However, the first message is confusing because it talks about "DIGEST-MD5: > IO error acquiring password". > Filing this jira to modify the first message to have more comprehensive > information that can be obtained from {{getCauseForInvalidToken(e)}}. > {code} >try { > saslResponse = processSaslMessage(saslMessage); > } catch (IOException e) { > rpcMetrics.incrAuthenticationFailures(); > // attempting user could be null > AUDITLOG.warn(AUTH_FAILED_FOR + this.toString() + ":" > + attemptingUser + " (" + e.getLocalizedMessage() + ")"); > throw (IOException) getCauseForInvalidToken(e); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7857) Incomplete information in WARN message caused user confusion
[ https://issues.apache.org/jira/browse/HDFS-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-7857: Attachment: HDFS-7857.001.patch > Incomplete information in WARN message caused user confusion > > > Key: HDFS-7857 > URL: https://issues.apache.org/jira/browse/HDFS-7857 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Labels: supportability > Attachments: HDFS-7857.001.patch > > > Lots of the following messages appeared in NN log: > {quote} > 2014-12-10 12:18:15,728 WARN SecurityLogger.org.apache.hadoop.ipc.Server: > Auth failed for :39838:null (DIGEST-MD5: IO error acquiring > password) > 2014-12-10 12:18:15,728 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > .. > SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for > :39843:null (DIGEST-MD5: IO error acquiring password) > 2014-12-10 12:18:15,790 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 > for port 8020: readAndProcess from client threw exception > [org.apache.hadoop.ipc.StandbyException: Operation category READ is not > supported in state standby] > {quote} > The real reason of failure is the second message about StandbyException, > However, the first message is confusing because it talks about "DIGEST-MD5: > IO error acquiring password". > Filing this jira to modify the first message to have more comprehensive > information that can be obtained from {{getCauseForInvalidToken(e)}}. > {code} >try { > saslResponse = processSaslMessage(saslMessage); > } catch (IOException e) { > rpcMetrics.incrAuthenticationFailures(); > // attempting user could be null > AUDITLOG.warn(AUTH_FAILED_FOR + this.toString() + ":" > + attemptingUser + " (" + e.getLocalizedMessage() + ")"); > throw (IOException) getCauseForInvalidToken(e); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7836) BlockManager Scalability Improvements
[ https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350878#comment-14350878 ] Colin Patrick McCabe commented on HDFS-7836: I don't think the block map size is really that easy to get at right now. Taking a heap dump on a big NN can take minutes... it's not something most sysadmins will let you do. And the analysis is difficult... a lot of common heap analysis tools require tons of memory. Anyway, we should probably add a JMX counter for the size(s) of the block map hash tables, and the number of entries, for tracking purposes. > BlockManager Scalability Improvements > - > > Key: HDFS-7836 > URL: https://issues.apache.org/jira/browse/HDFS-7836 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Charles Lamb >Assignee: Charles Lamb > Attachments: BlockManagerScalabilityImprovementsDesign.pdf > > > Improvements to BlockManager scalability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
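For readers who want the shape of the suggested counter, here is a plain-JMX sketch; the interface, class, and ObjectName are all invented for illustration (a real patch would presumably plug into the NameNode's existing metrics system, and the two public types would live in separate files):
{code:java}
import java.lang.management.ManagementFactory;
import javax.management.ObjectName;

public interface BlockMapStatsMXBean {
  long getBlockMapCapacity();   // number of hash table slots
  long getBlockMapEntries();    // number of block entries currently stored
}

public class BlockMapStats implements BlockMapStatsMXBean {
  private volatile long capacity;
  private volatile long entries;

  @Override public long getBlockMapCapacity() { return capacity; }
  @Override public long getBlockMapEntries() { return entries; }

  /** Called by the block map whenever it resizes or its entry count changes. */
  public void update(long capacity, long entries) {
    this.capacity = capacity;
    this.entries = entries;
  }

  /** Register so the sizes are visible over JMX without taking a heap dump. */
  public void register() throws Exception {
    ManagementFactory.getPlatformMBeanServer().registerMBean(
        this, new ObjectName("Hadoop:service=NameNode,name=BlockMapStats"));
  }
}
{code}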
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350869#comment-14350869 ] Colin Patrick McCabe commented on HDFS-6450: I think it would be possible to support hedged non-positional reads in {{BlockReaderLocal}}, but difficult. First we would have to stop re-using the same FD for all instances of a BlockReaderLocal that were reading the same replica. Perhaps we could use dup to create a new FD per blockreader without doing multiple opens. Then we could close the blockreader FD if the local read were being slow. I think it's much easier to just implement hedged non-positional reads in the erasure coding-specific subclass of DFSInputStream. I also think we may want to create a base class for DFSInputStream that both the raid and the non-raid code path inherit from. Inheriting from the non-raid code path is weird because there is a lot of stuff that is not relevant. > Support non-positional hedged reads in HDFS > --- > > Key: HDFS-6450 > URL: https://issues.apache.org/jira/browse/HDFS-6450 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.4.0 >Reporter: Colin Patrick McCabe >Assignee: Liang Xie > Attachments: HDFS-6450-like-pread.txt > > > HDFS-5776 added support for hedged positional reads. We should also support > hedged non-position reads (aka regular reads). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350828#comment-14350828 ] Colin Patrick McCabe commented on HDFS-7844: bq. Open addressing (probing) is used for Hash table in the patch. Now the load factor is 0.5, not sure whether we can still get similar performance if we choose a little bigger value (say 0.7, certainly it should be less than 1). Bigger loadfactor will increase the collision, but will save lots of memory. I don't have the performance data, but I see other hash table implementations using open addressing choose 0.7 ~ 0.75 as loadfactor. According to wikipedia (see http://en.wikipedia.org/wiki/Hash_table ), "open addressing" is where you store the records in the hash table itself. This is different than probing, which is when you don't have a linked list per slot, but simply find another slot when collisions occur. Open addressing does require probing, but probing does not require open addressing. This hash table uses probing, but open addressing is optional. In the test code I wrote, open addressing is not used (entries are not stored in the hash table itself... only pointers to entries are stored). Partly this is because I can make the hash table larger that way. In the block manager, we should not use open addressing, because BlockInfo structures are going to have variable size due to the variable replication factors. I think you are correct, though, that we could use a higher load factor than 0.5. How well it will work will depend on a few things. One very important thing is the quality of the hash function. We need good dispersion to avoid clustering and non-constant behavior. bq. How about write it as a configuration and default value is 0.5? Users don't need to change the default value it if they have big memory, but if the memory is limit? Good idea bq. in ProbingHashSet#getInternal ... We should call return null \[when slot == originalSlot\]. (Actually we will never reach there since it's at most half full) Fixed bq. ProbingHashSet#maintainCompactness... Yeah. I looked at this again and it was broken. I think what we should do instead is just call {{putInternal}} on each element with {{overwrite = false}}. Then if we find a key equal to the current one, we know that the element is already in the right slot. bq. Currently even we use slf4j, but in someplace we still do something like Long.toHexString(addr) and it will affect little performance. Can we check the log level in those places? I hate to put all those if statements in, but I think you're probably right. I also fixed one or two cases where I wasn't calling {{Long.toHexString}} on an address. I wish slf4j supported formatting strings. bq. Unnessary import in ProbingHashSet, MemoryManager Fixed bq. 6. typos. Fixed > Create an off-heap hash table implementation > > > Key: HDFS-7844 > URL: https://issues.apache.org/jira/browse/HDFS-7844 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7836 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-7844-scl.001.patch > > > Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
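To make the probing-versus-open-addressing distinction above concrete, here is a tiny on-heap model of the lookup path, with the wrap-around termination the review comment asked for; class and field names are illustrative, and the real patch operates on off-heap memory rather than an Object[]:
{code:java}
class ProbingSet<E> {
  private final Object[] slots;   // stores references to entries, not the entries
                                  // themselves, i.e. probing without open addressing

  ProbingSet(int capacity) {      // capacity is kept well above size / loadFactor
    this.slots = new Object[capacity];
  }

  @SuppressWarnings("unchecked")
  E get(E key) {
    final int originalSlot = (key.hashCode() & 0x7fffffff) % slots.length;
    int slot = originalSlot;
    do {
      Object e = slots[slot];
      if (e == null) {
        return null;                      // empty slot: the key is not present
      }
      if (e.equals(key)) {
        return (E) e;
      }
      slot = (slot + 1) % slots.length;   // linear probe to the next slot
    } while (slot != originalSlot);       // wrapped all the way around: not present
    return null;
  }
}
{code}
The logging point in the review maps to guards of the form {{if (LOG.isTraceEnabled())}} around the {{Long.toHexString(addr)}} calls, so the hex formatting only runs when trace logging is actually enabled.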
[jira] [Commented] (HDFS-7875) Improve log message when wrong value configured for dfs.datanode.failed.volumes.tolerated
[ https://issues.apache.org/jira/browse/HDFS-7875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350793#comment-14350793 ] Allen Wittenauer commented on HDFS-7875: Let's put a space in between the . and Value. There is also extraneous space at the end of that line. > Improve log message when wrong value configured for > dfs.datanode.failed.volumes.tolerated > -- > > Key: HDFS-7875 > URL: https://issues.apache.org/jira/browse/HDFS-7875 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: nijel >Assignee: nijel >Priority: Trivial > Attachments: 0001-HDFS-7875.patch, 0002-HDFS-7875.patch > > > By mistake I configured dfs.datanode.failed.volumes.tolerated equal to the > number of volumes configured. Got stuck for some time in debugging since the > log message didn't give much detail. > The log message could be more detailed. Added a patch with a change in the message. > Please have a look. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
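As a rough illustration of the kind of check and message under discussion (the exact wording is whatever the attached patch settles on; the variable names mirror the values the DataNode computes when scanning its data directories):
{code:java}
if (volFailuresTolerated < 0 || volFailuresTolerated >= volsConfigured) {
  throw new DiskErrorException("Invalid value configured for "
      + "dfs.datanode.failed.volumes.tolerated - " + volFailuresTolerated
      + ". Value configured is either less than 0 or >= "
      + "to the number of configured volumes (" + volsConfigured + ").");
}
{code}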
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350786#comment-14350786 ] Alejandro Abdelnur commented on HDFS-6200: -- Haohui, Could you please list the actual set of dependencies the hdfs-client will carry? > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contain both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-cliient, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350783#comment-14350783 ] Haohui Mai edited comment on HDFS-6200 at 3/6/15 7:41 PM: -- Thanks tucu. Just to clarify -- I'm not trashing the classloader solution, I agree that it has its own values on yarn/mr side. I don't see them as competing solutions, they provide values in different use cases. I think we don't need to mix the two issues. was (Author: wheat9): Thanks touch. Just to clarify -- I'm not trashing the classloader solution, I agree that it has its own values on yarn/mr side. I don't see them as competing solutions, they provide values in different use cases. I think we don't need to mix the two issues. > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contain both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-cliient, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350783#comment-14350783 ] Haohui Mai commented on HDFS-6200: -- Thanks touch. Just to clarify -- I'm not trashing the classloader solution, I agree that it has its own values on yarn/mr side. I don't see them as competing solutions, they provide values in different use cases. I think we don't need to mix the two issues. > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contain both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-cliient, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350774#comment-14350774 ] Alejandro Abdelnur commented on HDFS-6200: -- Haohui, Doing what hadoop-client wont solve the problems you want to tackle, it will just remove the JARs used on the HDFS server side only. If you just care about those server side dependencies, hadoop-client should be enough and you could exclude YARN/MR artifacts in your dependency. If you want take care of guava, commons-*, etc, etc, you'll need to classloader magic for the filesystem impls, and this should be done in common where the Hadoop FileSystem API lives so all Hadoop FileSystem implementations get this kind of isolation. > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contain both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-cliient, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7893) Update the POM to create a separate hdfs-client jar
[ https://issues.apache.org/jira/browse/HDFS-7893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350759#comment-14350759 ] Jing Zhao commented on HDFS-7893: - The patch looks good to me. Some comments: # maybe we should call it "Apache Hadoop HDFS Client"? {code} + Apache Hadoop HDFS + Apache Hadoop HDFS {code} # we can also add this dependency to hadoop-hdfs-nfs. > Update the POM to create a separate hdfs-client jar > --- > > Key: HDFS-7893 > URL: https://issues.apache.org/jira/browse/HDFS-7893 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-7893.000.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7435) PB encoding of block reports is very inefficient
[ https://issues.apache.org/jira/browse/HDFS-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350769#comment-14350769 ] Hadoop QA commented on HDFS-7435: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12702905/HDFS-7435.patch against trunk revision 95bfd08. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 12 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.TestReadOnlySharedStorage org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer org.apache.hadoop.hdfs.TestSetrepIncreasing org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA org.apache.hadoop.hdfs.TestDecommission org.apache.hadoop.hdfs.server.balancer.TestBalancer org.apache.hadoop.hdfs.server.datanode.TestSimulatedFSDataset The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9776//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9776//console This message is automatically generated. > PB encoding of block reports is very inefficient > > > Key: HDFS-7435 > URL: https://issues.apache.org/jira/browse/HDFS-7435 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Critical > Attachments: HDFS-7435.000.patch, HDFS-7435.001.patch, > HDFS-7435.002.patch, HDFS-7435.patch, HDFS-7435.patch, HDFS-7435.patch, > HDFS-7435.patch, HDFS-7435.patch, HDFS-7435.patch > > > Block reports are encoded as a PB repeating long. Repeating fields use an > {{ArrayList}} with default capacity of 10. A block report containing tens or > hundreds of thousand of longs (3 for each replica) is extremely expensive > since the {{ArrayList}} must realloc many times. Also, decoding repeating > fields will box the primitive longs which must then be unboxed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
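A toy, self-contained illustration of the cost pattern described in the HDFS-7435 summary above: decoding a repeated PB long into an {{ArrayList<Long>}} boxes every value and regrows from the default capacity of 10, whereas a report of N replicas is naturally 3*N primitive longs that fit one pre-sized array (the numbers and class name are made up for the demo):
{code:java}
import java.util.ArrayList;
import java.util.List;

public class BlockReportEncodingDemo {
  public static void main(String[] args) {
    final int replicas = 100000;

    // What decoding a repeated long field effectively does today.
    List<Long> boxed = new ArrayList<Long>();     // starts at capacity 10
    for (long i = 0; i < 3L * replicas; i++) {
      boxed.add(i);                               // autoboxing + periodic realloc/copy
    }

    // The shape a primitive-friendly encoding can decode into.
    long[] primitive = new long[3 * replicas];    // one allocation, no boxing
    for (int i = 0; i < primitive.length; i++) {
      primitive[i] = i;
    }

    System.out.println(boxed.size() + " boxed vs "
        + primitive.length + " primitive longs");
  }
}
{code}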
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350765#comment-14350765 ] Haohui Mai commented on HDFS-6200: -- bq. Placing hadoop-hdfs-client as a dependency of hadoop-hdfs sets up a relationship that we'll have to adjust in the future if we e.g. decide that shading the third-party dependencies of hadoop-hdfs-client is the way to go. Don't you agree we need a client jar? I see you point. This jira, however, is about creating the client jar. Everything below the client jar is implementation detail. I don't think it need to be mixed with this jira. bq. Personally, I think having things stay where they are and using maven to build the client artifact will be the easiest to maintain I don't agree. We did that for {{hadoop-client}}, which is available today. You're more than welcome to contribute and to clean things up. We've been hit really hard on resolving dependency conflicts in Oozie (which uses tomcat's classloader), Ranger (depends on different version of jersey-server), Spark (has a conflicting version of asm). We need a client jar whose dependency can be carefully and explicitly managed. > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contain both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-cliient, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350765#comment-14350765 ] Haohui Mai edited comment on HDFS-6200 at 3/6/15 7:27 PM: -- bq. Placing hadoop-hdfs-client as a dependency of hadoop-hdfs sets up a relationship that we'll have to adjust in the future if we e.g. decide that shading the third-party dependencies of hadoop-hdfs-client is the way to go. Don't you agree we need a client jar? I see you point. This jira, however, is about creating the client jar. Everything below the client jar is implementation detail. I don't think it need to be mixed with this jira. bq. Personally, I think having things stay where they are and using maven to build the client artifact will be the easiest to maintain I don't agree. We did that for {{hadoop-client}}, which is available today. You're more than welcome to contribute and to clean things up. We've been hit really hard on resolving dependency conflicts in Oozie (which uses tomcat's classloader), Ranger (depends on different version of jersey-server), Spark (has a conflicting version of asm). A clean solution to fix all the problems is appreciated. was (Author: wheat9): bq. Placing hadoop-hdfs-client as a dependency of hadoop-hdfs sets up a relationship that we'll have to adjust in the future if we e.g. decide that shading the third-party dependencies of hadoop-hdfs-client is the way to go. Don't you agree we need a client jar? I see you point. This jira, however, is about creating the client jar. Everything below the client jar is implementation detail. I don't think it need to be mixed with this jira. bq. Personally, I think having things stay where they are and using maven to build the client artifact will be the easiest to maintain I don't agree. We did that for {{hadoop-client}}, which is available today. You're more than welcome to contribute and to clean things up. We've been hit really hard on resolving dependency conflicts in Oozie (which uses tomcat's classloader), Ranger (depends on different version of jersey-server), Spark (has a conflicting version of asm). We need a client jar whose dependency can be carefully and explicitly managed. > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contain both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-cliient, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7854) Separate class DataStreamer out of DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350764#comment-14350764 ] Hadoop QA commented on HDFS-7854: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703032/HDFS-7854-002.patch against trunk revision 24db081. {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9783//console This message is automatically generated. > Separate class DataStreamer out of DFSOutputStream > -- > > Key: HDFS-7854 > URL: https://issues.apache.org/jira/browse/HDFS-7854 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Li Bo >Assignee: Li Bo > Attachments: HDFS-7854-001.patch, HDFS-7854-002.patch > > > This sub task separate DataStreamer from DFSOutputStream. New DataStreamer > will accept packets and write them to remote datanodes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7261) storageMap is accessed without synchronization in DatanodeDescriptor#updateHeartbeatState()
[ https://issues.apache.org/jira/browse/HDFS-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350760#comment-14350760 ] Hadoop QA commented on HDFS-7261: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703042/HDFS-7261-001.patch against trunk revision 24db081. {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9782//console This message is automatically generated. > storageMap is accessed without synchronization in > DatanodeDescriptor#updateHeartbeatState() > --- > > Key: HDFS-7261 > URL: https://issues.apache.org/jira/browse/HDFS-7261 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ted Yu >Assignee: Brahma Reddy Battula > Attachments: HDFS-7261-001.patch, HDFS-7261.patch > > > Here is the code: > {code} > failedStorageInfos = new HashSet( > storageMap.values()); > {code} > In other places, the lock on "DatanodeDescriptor.storageMap" is held: > {code} > synchronized (storageMap) { > final Collection storages = storageMap.values(); > return storages.toArray(new DatanodeStorageInfo[storages.size()]); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
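A minimal sketch of the fix the description implies for {{DatanodeDescriptor#updateHeartbeatState()}}: copy the values under the same lock the other accessors already take (types as quoted above):
{code:java}
Set<DatanodeStorageInfo> failedStorageInfos;
synchronized (storageMap) {
  failedStorageInfos = new HashSet<DatanodeStorageInfo>(storageMap.values());
}
{code}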
[jira] [Commented] (HDFS-7885) Datanode should not trust the generation stamp provided by client
[ https://issues.apache.org/jira/browse/HDFS-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350732#comment-14350732 ] Hudson commented on HDFS-7885: -- SUCCESS: Integrated in Hadoop-trunk-Commit #7271 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7271/]) HDFS-7885. Datanode should not trust the generation stamp provided by client. Contributed by Tsz Wo Nicholas Sze. (jing9: rev 24db0812be64e83a48ade01fc1eaaeaedad4dec0) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocalLegacy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > Datanode should not trust the generation stamp provided by client > - > > Key: HDFS-7885 > URL: https://issues.apache.org/jira/browse/HDFS-7885 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.2.0 >Reporter: vitthal (Suhas) Gogate >Assignee: Tsz Wo Nicholas Sze >Priority: Critical > Fix For: 2.7.0 > > Attachments: h7885_20150305.patch, h7885_20150306.patch > > > Datanode should not trust the generation stamp provided by client, since it > is prefetched and buffered in client, and concurrent append may increase it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7894) Rolling upgrade readiness is not updated in jmx until query command is issued.
[ https://issues.apache.org/jira/browse/HDFS-7894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350728#comment-14350728 ] Kihwal Lee commented on HDFS-7894: -- It won't work because of {{checkSuperuserPrivilege()}} and {{checkOperation()}}. What do you think about something like following? I didn't try to compile or test this code. Adding a test case would be nice, if possible. {code:java} if (!isRollingUpgrade()) { return null; // this is the common case. } readLock(); // check again after acquiring the read lock. RollingUpgradeInfo upgradeInfo = getRollingUpgradeInfo(); if (upgradeInfo == null) { return null; } try { boolean hasRollbackImage = this.getFSImage().hasRollbackFSImage(); upgradeInfo.setCreatedRollbackImages(hasRollbackImage); } finally { readUnlock(); } return new RollingUpgradeInfo.Bean(upgradeInfo); {code} > Rolling upgrade readiness is not updated in jmx until query command is issued. > -- > > Key: HDFS-7894 > URL: https://issues.apache.org/jira/browse/HDFS-7894 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Brahma Reddy Battula >Priority: Critical > Attachments: HDFS-7894.patch > > > When a hdfs rolling upgrade is started and a rollback image is > created/uploaded, the active NN does not update its {{rollingUpgradeInfo}} > until it receives a query command via RPC. This results in inconsistent info > being showing up in the web UI and its jmx page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
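One detail worth noting in the sketch above: if {{getRollingUpgradeInfo()}} returns null after {{readLock()}} has been taken, that early return skips {{readUnlock()}}. A slightly rearranged version of the same (equally untested) sketch keeps every exit inside the try/finally:
{code:java}
if (!isRollingUpgrade()) {
  return null; // this is the common case.
}
readLock();
try {
  // check again after acquiring the read lock.
  RollingUpgradeInfo upgradeInfo = getRollingUpgradeInfo();
  if (upgradeInfo == null) {
    return null;
  }
  boolean hasRollbackImage = getFSImage().hasRollbackFSImage();
  upgradeInfo.setCreatedRollbackImages(hasRollbackImage);
  return new RollingUpgradeInfo.Bean(upgradeInfo);
} finally {
  readUnlock();
}
{code}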
[jira] [Commented] (HDFS-7886) TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes
[ https://issues.apache.org/jira/browse/HDFS-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350718#comment-14350718 ] Konstantin Shvachko commented on HDFS-7886: --- Forgot to mention: 3. We should keep SEED = 100. 4. Add replica printout to the assert in {{BlockListAsLongs}}, which helps debugging and does not affect runtime {{"Must be under-construction replica: " + r;}} > TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes > > > Key: HDFS-7886 > URL: https://issues.apache.org/jira/browse/HDFS-7886 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.0 >Reporter: Yi Liu >Assignee: Plamen Jeliazkov >Priority: Minor > Attachments: HDFS-7886.patch > > > https://builds.apache.org/job/PreCommit-HDFS-Build/9730//testReport/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7885) Datanode should not trust the generation stamp provided by client
[ https://issues.apache.org/jira/browse/HDFS-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-7885: Resolution: Fixed Fix Version/s: 2.7.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks for the fix, Nicholas! I've committed this to trunk and branch-2. > Datanode should not trust the generation stamp provided by client > - > > Key: HDFS-7885 > URL: https://issues.apache.org/jira/browse/HDFS-7885 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.2.0 >Reporter: vitthal (Suhas) Gogate >Assignee: Tsz Wo Nicholas Sze >Priority: Critical > Fix For: 2.7.0 > > Attachments: h7885_20150305.patch, h7885_20150306.patch > > > Datanode should not trust the generation stamp provided by client, since it > is prefetched and buffered in client, and concurrent append may increase it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7885) Datanode should not trust the generation stamp provided by client
[ https://issues.apache.org/jira/browse/HDFS-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350712#comment-14350712 ] Jing Zhao commented on HDFS-7885: - The latest patch looks good to me. The failed tests should be unrelated. +1. I will commit it shortly. > Datanode should not trust the generation stamp provided by client > - > > Key: HDFS-7885 > URL: https://issues.apache.org/jira/browse/HDFS-7885 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.2.0 >Reporter: vitthal (Suhas) Gogate >Assignee: Tsz Wo Nicholas Sze >Priority: Critical > Attachments: h7885_20150305.patch, h7885_20150306.patch > > > Datanode should not trust the generation stamp provided by client, since it > is prefetched and buffered in client, and concurrent append may increase it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350711#comment-14350711 ] Sean Busbey commented on HDFS-6200: --- As I mentioned earlier, the dependencies your client artifact brings with it is a defining part of the interface you are exposing downstream applications to. That means we need the ability to manipulate those dependencies, even if we're only going to do so at a later date. Placing hadoop-hdfs-client as a dependency of hadoop-hdfs sets up a relationship that we'll have to adjust in the future if we e.g. decide that shading the third-party dependencies of hadoop-hdfs-client is the way to go. I only mention the internal artifact as an alternative if having DFSClient live in hadoop-hdfs is undesirable. Personally, I think having things stay where they are and using maven to build the client artifact will be the easiest to maintain. However, there might be other mitigating factors I'm not aware of that make breaking the code into a new module desirable. > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contain both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-cliient, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350684#comment-14350684 ] Haohui Mai edited comment on HDFS-6200 at 3/6/15 6:45 PM: -- bq. For one, we don't have to worry about what dependencies we bring with us in the internal case because by definition we're in control of both the client interface and the place it's being used. bq. In the approach I'm suggesting the original code for the client would still live in hadoop-hdfs, so the webhdfs server would be free to use on DFSClient. If that is unappealing for some reason, perhaps we should structure things with an internal client artifact. e.g. What about (1) hiding implementation in local package when possible? (2) marking it as private class as what we did today when the previous option is unavailable? I don't think it is the time to create yet another artifact right now. There are quite a bit of overheads associated with it. I'm yet to see this is justified. If it is indeed required we can do it after hdfs-client is separated out. was (Author: wheat9): bq. For one, we don't have to worry about what dependencies we bring with us in the internal case because by definition we're in control of both the client interface and the place it's being used. bq. In the approach I'm suggesting the original code for the client would still live in hadoop-hdfs, so the webhdfs server would be free to use on DFSClient. If that is unappealing for some reason, perhaps we should structure things with an internal client artifact. e.g. What is the point of creating yet another internal jar if you can simply hide {{DFSClient}} in local package? > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contain both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-cliient, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7896) HDFS Slow disk detection
[ https://issues.apache.org/jira/browse/HDFS-7896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350697#comment-14350697 ] Chris Nauroth commented on HDFS-7896: - bq. Chris Nauroth recently added failed volume reporting via HDFS-7604. Ideally we can extend that reporting infrastructure. Yes, I think that will work. We can add slow disk information to the {{VolumeFailureSummaryProto}} message. That will ride along in heartbeats, and we can add corresponding metrics and web UI fields. > HDFS Slow disk detection > > > Key: HDFS-7896 > URL: https://issues.apache.org/jira/browse/HDFS-7896 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Arpit Agarwal > > HDFS should detect slow disks. To start with we can flag this information via > the NameNode web UI. Alternatively DNs can avoid using slow disks for writes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
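As a purely illustrative sketch of local detection (independent of the heartbeat/JMX plumbing discussed above), a DataNode could keep an exponentially weighted moving average of per-volume I/O latency and flag a volume that is far slower than its peers; the class, threshold, and smoothing factor below are all invented for the example:
{code:java}
import java.util.HashMap;
import java.util.Map;

public class SlowVolumeTracker {
  private static final double ALPHA = 0.1;         // EWMA smoothing factor
  private static final double SLOW_FACTOR = 3.0;   // "slow" = 3x the peer average

  private final Map<String, Double> ewmaLatencyMs = new HashMap<String, Double>();

  /** Record one observed I/O latency (in ms) for a volume. */
  public synchronized void record(String volume, double latencyMs) {
    Double prev = ewmaLatencyMs.get(volume);
    double next = (prev == null) ? latencyMs : prev + ALPHA * (latencyMs - prev);
    ewmaLatencyMs.put(volume, next);
  }

  /** True if this volume's smoothed latency is far above the average of the others. */
  public synchronized boolean isSlow(String volume) {
    Double own = ewmaLatencyMs.get(volume);
    if (own == null || ewmaLatencyMs.size() < 2) {
      return false;
    }
    double peerSum = 0;
    for (Map.Entry<String, Double> e : ewmaLatencyMs.entrySet()) {
      if (!e.getKey().equals(volume)) {
        peerSum += e.getValue();
      }
    }
    double peerAvg = peerSum / (ewmaLatencyMs.size() - 1);
    return own > SLOW_FACTOR * peerAvg;
  }
}
{code}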
[jira] [Commented] (HDFS-7886) TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes
[ https://issues.apache.org/jira/browse/HDFS-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350686#comment-14350686 ] Konstantin Shvachko commented on HDFS-7886: --- I traced the latest failure with Plamen's fix. It actually points to the next test case {{testTruncateWithDataNodesShutdownImmediately()}}. 1. I think we need to add {{checkBlockRecovery()}} after restarting the DataNodes, and check the file length before deleting. Otherwise {{testTruncateWithDataNodesShutdownImmediately()}} seems incomplete without checking anything. I did that, but then {{testCopyOnTruncateWithDataNodesRestart()}} fails. The symptom is the same - the assert error, but I think there may be a race condition between block recovery, which starts after the first block report and the second block report, which is explicitly triggered in the test. Yi, I don't think we need to trigger block reports as restarting node will send one immediately after restarting. Triggering causes the second block report. 2. I think we can fix the test by removing {{triggerBlockReports()}} after restarting DNs. But we still need to investigate the potential race between block recovery and block reporting. In a different jira probably. So I think Plamen's fix is right it just didn't cover all test cases. I know it is time consuming, because you need to run it several times before it fails - the nature of randomized tests. By adding waits on expected conditions we make it more deterministic. > TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes > > > Key: HDFS-7886 > URL: https://issues.apache.org/jira/browse/HDFS-7886 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.0 >Reporter: Yi Liu >Assignee: Plamen Jeliazkov >Priority: Minor > Attachments: HDFS-7886.patch > > > https://builds.apache.org/job/PreCommit-HDFS-Build/9730//testReport/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
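Sketch of the adjustment described above for {{testTruncateWithDataNodesShutdownImmediately()}}: wait for block recovery instead of triggering a second block report, then verify the truncated length before deleting. {{checkBlockRecovery()}} is the existing helper mentioned in the comment; the other variable names are placeholders:
{code:java}
cluster.restartDataNodes();
cluster.waitActive();
checkBlockRecovery(srcPath);   // wait for recovery; no explicit triggerBlockReports()
assertEquals(newLength, fs.getFileStatus(srcPath).getLen());
fs.delete(srcPath, false);
{code}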
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350684#comment-14350684 ] Haohui Mai commented on HDFS-6200: -- bq. For one, we don't have to worry about what dependencies we bring with us in the internal case because by definition we're in control of both the client interface and the place it's being used. bq. In the approach I'm suggesting the original code for the client would still live in hadoop-hdfs, so the webhdfs server would be free to use on DFSClient. If that is unappealing for some reason, perhaps we should structure things with an internal client artifact. e.g. What is the point of creating yet another internal jar if you can simply hide {{DFSClient}} in local package? > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contain both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-cliient, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7885) Datanode should not trust the generation stamp provided by client
[ https://issues.apache.org/jira/browse/HDFS-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350651#comment-14350651 ] Hadoop QA commented on HDFS-7885: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12702968/h7885_20150306.patch against trunk revision 95bfd08. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestFileTruncate The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9775//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9775//console This message is automatically generated. > Datanode should not trust the generation stamp provided by client > - > > Key: HDFS-7885 > URL: https://issues.apache.org/jira/browse/HDFS-7885 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.2.0 >Reporter: vitthal (Suhas) Gogate >Assignee: Tsz Wo Nicholas Sze >Priority: Critical > Attachments: h7885_20150305.patch, h7885_20150306.patch > > > Datanode should not trust the generation stamp provided by client, since it > is prefetched and buffered in client, and concurrent append may increase it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-7901) Fix findbug warning in org.apache.hadoop.hdfs.web.resources.OffsetParam.getOffset()
[ https://issues.apache.org/jira/browse/HDFS-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula resolved HDFS-7901. Resolution: Duplicate > Fix findbug warning in > org.apache.hadoop.hdfs.web.resources.OffsetParam.getOffset() > --- > > Key: HDFS-7901 > URL: https://issues.apache.org/jira/browse/HDFS-7901 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > > {noformat} > > Bug type DM_NUMBER_CTOR (click for details) > In class org.apache.hadoop.hdfs.web.resources.OffsetParam > In method org.apache.hadoop.hdfs.web.resources.OffsetParam.getOffset() > Called method new Long(long) > Should call Long.valueOf(long) instead > At OffsetParam.java:[line 52] > {noformat} > https://builds.apache.org/job/PreCommit-HDFS-Build/9767//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7901) Fix findbug warning in org.apache.hadoop.hdfs.web.resources.OffsetParam.getOffset()
Brahma Reddy Battula created HDFS-7901: -- Summary: Fix findbug warning in org.apache.hadoop.hdfs.web.resources.OffsetParam.getOffset() Key: HDFS-7901 URL: https://issues.apache.org/jira/browse/HDFS-7901 Project: Hadoop HDFS Issue Type: Bug Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula {noformat} Bug type DM_NUMBER_CTOR (click for details) In class org.apache.hadoop.hdfs.web.resources.OffsetParam In method org.apache.hadoop.hdfs.web.resources.OffsetParam.getOffset() Called method new Long(long) Should call Long.valueOf(long) instead At OffsetParam.java:[line 52] {noformat} https://builds.apache.org/job/PreCommit-HDFS-Build/9767//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
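For reference, the change the DM_NUMBER_CTOR warning asks for is a one-liner: use the cached boxing path instead of allocating a new wrapper (the value name below is a placeholder for whatever OffsetParam holds at line 52):
{code:java}
Long boxed = Long.valueOf(offsetValue);    // preferred: may reuse a cached Long
Long wasteful = new Long(offsetValue);     // what findbugs flags
{code}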
[jira] [Commented] (HDFS-7818) DataNode throws NPE if the WebHdfs URL does not contain the offset parameter
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350491#comment-14350491 ] Brahma Reddy Battula commented on HDFS-7818: can I close HDFS-7901..? Please let me know,thanks.. > DataNode throws NPE if the WebHdfs URL does not contain the offset parameter > > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt, HDFS-7818.v5.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7855) Separate class Packet from DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350465#comment-14350465 ] Hudson commented on HDFS-7855: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2074 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2074/]) HDFS-7855. Separate class Packet from DFSOutputStream. Contributed by Li Bo. (jing9: rev 952640fa4cbdc23fe8781e5627c2e8eab565c535) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSPacket.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSPacket.java > Separate class Packet from DFSOutputStream > -- > > Key: HDFS-7855 > URL: https://issues.apache.org/jira/browse/HDFS-7855 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: dfsclient >Reporter: Li Bo >Assignee: Li Bo > Fix For: 2.7.0 > > Attachments: HDFS-7855-001.patch, HDFS-7855-002.patch, > HDFS-7855-003.patch, HDFS-7855-004.patch, HDFS-7855-005.patch, > HDFS-7855-006.patch, HDFS-7855-007.patch > > > Class Packet is an inner class in DFSOutputStream and also used by > DataStreamer. This sub task separates Packet out of DFSOutputStream to aid > the separation in HDFS-7854. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7818) DataNode throws NPE if the WebHdfs URL does not contain the offset parameter
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350453#comment-14350453 ] Hadoop QA commented on HDFS-7818: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703065/HDFS-7818.v5.txt against trunk revision 95bfd08. {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9780//console This message is automatically generated. > DataNode throws NPE if the WebHdfs URL does not contain the offset parameter > > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt, HDFS-7818.v5.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7433) Optimize performance of DatanodeManager's node map
[ https://issues.apache.org/jira/browse/HDFS-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350444#comment-14350444 ] Hadoop QA commented on HDFS-7433: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12692603/HDFS-7433.patch against trunk revision 95bfd08. {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9779//console This message is automatically generated. > Optimize performance of DatanodeManager's node map > -- > > Key: HDFS-7433 > URL: https://issues.apache.org/jira/browse/HDFS-7433 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Critical > Attachments: HDFS-7433.patch, HDFS-7433.patch, HDFS-7433.patch > > > The datanode map is currently a {{TreeMap}}. For many thousands of > datanodes, tree lookups are ~10X more expensive than a {{HashMap}}. > Insertions and removals are up to 100X more expensive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
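For context, the direction the HDFS-7433 summary above points at looks roughly like this inside DatanodeManager (field and key types are illustrative; the actual patch may differ):
{code:java}
// Before: sorted map keyed by the datanode identifier; O(log n) lookups with
// comparison-heavy inserts and removals.
//   private final NavigableMap<String, DatanodeDescriptor> datanodeMap =
//       new TreeMap<String, DatanodeDescriptor>();

// After (sketch): constant-time average lookups; build a sorted view only in
// the few places that actually need ordering.
private final Map<String, DatanodeDescriptor> datanodeMap =
    new HashMap<String, DatanodeDescriptor>();
{code}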
[jira] [Commented] (HDFS-7433) Optimize performance of DatanodeManager's node map
[ https://issues.apache.org/jira/browse/HDFS-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350430#comment-14350430 ] Hadoop QA commented on HDFS-7433: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12692603/HDFS-7433.patch against trunk revision 95bfd08. {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9778//console This message is automatically generated. > Optimize performance of DatanodeManager's node map > -- > > Key: HDFS-7433 > URL: https://issues.apache.org/jira/browse/HDFS-7433 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Critical > Attachments: HDFS-7433.patch, HDFS-7433.patch, HDFS-7433.patch > > > The datanode map is currently a {{TreeMap}}. For many thousands of > datanodes, tree lookups are ~10X more expensive than a {{HashMap}}. > Insertions and removals are up to 100X more expensive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7855) Separate class Packet from DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350427#comment-14350427 ] Hudson commented on HDFS-7855: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #124 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/124/]) HDFS-7855. Separate class Packet from DFSOutputStream. Contributed by Li Bo. (jing9: rev 952640fa4cbdc23fe8781e5627c2e8eab565c535) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSPacket.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSPacket.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java > Separate class Packet from DFSOutputStream > -- > > Key: HDFS-7855 > URL: https://issues.apache.org/jira/browse/HDFS-7855 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: dfsclient >Reporter: Li Bo >Assignee: Li Bo > Fix For: 2.7.0 > > Attachments: HDFS-7855-001.patch, HDFS-7855-002.patch, > HDFS-7855-003.patch, HDFS-7855-004.patch, HDFS-7855-005.patch, > HDFS-7855-006.patch, HDFS-7855-007.patch > > > Class Packet is an inner class in DFSOutputStream and also used by > DataStreamer. This sub task separates Packet out of DFSOutputStream to aid > the separation in HDFS-7854. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7818) DataNode throws NPE if the WebHdfs URL does not contain the offset parameter
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne updated HDFS-7818: - Attachment: HDFS-7818.v5.txt Fixing findbugs warning and updating patch to v5. > DataNode throws NPE if the WebHdfs URL does not contain the offset parameter > > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt, HDFS-7818.v5.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7855) Separate class Packet from DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350412#comment-14350412 ] Hudson commented on HDFS-7855: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2056 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2056/]) HDFS-7855. Separate class Packet from DFSOutputStream. Contributed by Li Bo. (jing9: rev 952640fa4cbdc23fe8781e5627c2e8eab565c535) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSPacket.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSPacket.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > Separate class Packet from DFSOutputStream > -- > > Key: HDFS-7855 > URL: https://issues.apache.org/jira/browse/HDFS-7855 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: dfsclient >Reporter: Li Bo >Assignee: Li Bo > Fix For: 2.7.0 > > Attachments: HDFS-7855-001.patch, HDFS-7855-002.patch, > HDFS-7855-003.patch, HDFS-7855-004.patch, HDFS-7855-005.patch, > HDFS-7855-006.patch, HDFS-7855-007.patch > > > Class Packet is an inner class in DFSOutputStream and also used by > DataStreamer. This sub task separates Packet out of DFSOutputStream to aid > the separation in HDFS-7854. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7433) Optimize performance of DatanodeManager's node map
[ https://issues.apache.org/jira/browse/HDFS-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350415#comment-14350415 ] Hadoop QA commented on HDFS-7433: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12692603/HDFS-7433.patch against trunk revision 95bfd08. {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9777//console This message is automatically generated. > Optimize performance of DatanodeManager's node map > -- > > Key: HDFS-7433 > URL: https://issues.apache.org/jira/browse/HDFS-7433 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Critical > Attachments: HDFS-7433.patch, HDFS-7433.patch, HDFS-7433.patch > > > The datanode map is currently a {{TreeMap}}. For many thousands of > datanodes, tree lookups are ~10X more expensive than a {{HashMap}}. > Insertions and removals are up to 100X more expensive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7855) Separate class Packet from DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350408#comment-14350408 ] Hudson commented on HDFS-7855: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #115 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/115/]) HDFS-7855. Separate class Packet from DFSOutputStream. Contributed by Li Bo. (jing9: rev 952640fa4cbdc23fe8781e5627c2e8eab565c535) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSPacket.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSPacket.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java > Separate class Packet from DFSOutputStream > -- > > Key: HDFS-7855 > URL: https://issues.apache.org/jira/browse/HDFS-7855 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: dfsclient >Reporter: Li Bo >Assignee: Li Bo > Fix For: 2.7.0 > > Attachments: HDFS-7855-001.patch, HDFS-7855-002.patch, > HDFS-7855-003.patch, HDFS-7855-004.patch, HDFS-7855-005.patch, > HDFS-7855-006.patch, HDFS-7855-007.patch > > > Class Packet is an inner class in DFSOutputStream and also used by > DataStreamer. This sub task separates Packet out of DFSOutputStream to aid > the separation in HDFS-7854. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6806) HDFS Rolling upgrade document should mention the versions available
[ https://issues.apache.org/jira/browse/HDFS-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350403#comment-14350403 ] Hadoop QA commented on HDFS-6806: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703011/HDFS-6806.3.patch against trunk revision 95bfd08. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestCrcCorruption The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestAppendSnapshotTruncate Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9772//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9772//console This message is automatically generated. > HDFS Rolling upgrade document should mention the versions available > --- > > Key: HDFS-6806 > URL: https://issues.apache.org/jira/browse/HDFS-6806 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 2.4.0 >Reporter: Akira AJISAKA >Assignee: J.Andreina >Priority: Minor > Labels: newbie > Attachments: HDFS-6806.1.patch, HDFS-6806.2.patch, HDFS-6806.3.patch > > > We should document that rolling upgrades do not support upgrades from ~2.3 to > 2.4+. It has been asked in the user ML many times. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7818) DataNode throws NPE if the WebHdfs URL does not contain the offset parameter
[ https://issues.apache.org/jira/browse/HDFS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne updated HDFS-7818: - Priority: Blocker (was: Critical) Target Version/s: 2.7.0 Marking as a blocker since this is a very common scenario when using webHDFS, and it hits the NPE every time. The only workaround is to use HDFS instead of the webHDFS interface, but that is not always an option when reading cross-colo or off grid. > DataNode throws NPE if the WebHdfs URL does not contain the offset parameter > > > Key: HDFS-7818 > URL: https://issues.apache.org/jira/browse/HDFS-7818 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Blocker > Attachments: HDFS-7818.v1.txt, HDFS-7818.v2.txt, HDFS-7818.v3.txt, > HDFS-7818.v4.txt > > > This is a regression in 2.7 and later. > {{hadoop fs -cat}} over webhdfs works, but {{hadoop fs -text}} does not: > {code} > $ hadoop fs -cat webhdfs://myhost.com/tmp/test.1 > ... output ... > $ hadoop fs -text webhdfs://myhost.com/tmp/test.1 > text: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > null > at > org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:165) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:615) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:463) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:492) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
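Editor's note, not the fix in the attached patches (consult those directly), just the general defensive pattern for an optional numeric query parameter like {{offset}}: default it when absent instead of letting a null {{Long}} reach an unboxing site, which is what produces a NullPointerException of this shape. The helper below is hypothetical; the real WebHDFS parameter classes differ.
{code}
final class OffsetParamSketch {

  /** Returns 0 when the "offset" query parameter was not supplied. */
  static long parseOffset(String rawValue) {
    if (rawValue == null || rawValue.isEmpty()) {
      return 0L;                      // absent parameter: read from the start of the file
    }
    return Long.parseLong(rawValue);  // malformed values still fail loudly
  }

  public static void main(String[] args) {
    System.out.println(parseOffset(null));    // 0
    System.out.println(parseOffset("4096"));  // 4096
  }
}
{code}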
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350369#comment-14350369 ] Sean Busbey commented on HDFS-6200: --- The dependencies you bring with you are an integral part of the interface you define for downstream clients. While I agree that it can be a separate subtask, it has to be considered as part of how you structure the overall approach. {quote} Unfortunately the dependency is a real one – the webhdfs server on DN uses DFSClient to read data from HDFS. {quote} Our own internal use of client interfaces isn't the same thing as use by downstream applications. For one, we don't have to worry about what dependencies we bring with us in the internal case, because by definition we're in control of both the client interface and the place it's being used. In the approach I'm suggesting, the original code for the client would still live in hadoop-hdfs, so the webhdfs server would be free to keep using DFSClient. If that is unappealing for some reason, perhaps we should structure things with an internal client artifact. e.g. {noformat} hadoop-hdfs -- depends on --> hadoop-hdfs-client-internal hadoop-hdfs-client -- depends on --> hadoop-hdfs-client-internal {noformat} > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contains both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependencies in order to > access hdfs. These additional dependencies can sometimes be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-client, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of hadoop-hdfs to avoid unnecessary dependencies. > Note that it does not break the compatibility of downstream projects. This is > because existing downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
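Editor's note on the downstream perspective, illustrative rather than taken from the patch: typical downstream read code only touches the API classes in hadoop-common ({{FileSystem}}, {{Path}}, {{FSDataInputStream}}), and pulls the full {{hadoop-hdfs}} jar only to get the {{hdfs://}} implementation behind them at runtime; a client-only artifact would cover that need without the server side. The namenode URI and file path below are placeholders.
{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DownstreamReadExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Placeholder URI and path; the point is which jars this needs, not the data.
    try (FileSystem fs = FileSystem.get(URI.create("hdfs://namenode.example.com:8020/"), conf);
         FSDataInputStream in = fs.open(new Path("/tmp/example.txt"));
         BufferedReader reader =
             new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        System.out.println(line);
      }
    }
  }
}
{code}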
[jira] [Commented] (HDFS-7855) Separate class Packet from DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350305#comment-14350305 ] Hudson commented on HDFS-7855: -- FAILURE: Integrated in Hadoop-Yarn-trunk #858 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/858/]) HDFS-7855. Separate class Packet from DFSOutputStream. Contributed by Li Bo. (jing9: rev 952640fa4cbdc23fe8781e5627c2e8eab565c535) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSPacket.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSPacket.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java > Separate class Packet from DFSOutputStream > -- > > Key: HDFS-7855 > URL: https://issues.apache.org/jira/browse/HDFS-7855 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: dfsclient >Reporter: Li Bo >Assignee: Li Bo > Fix For: 2.7.0 > > Attachments: HDFS-7855-001.patch, HDFS-7855-002.patch, > HDFS-7855-003.patch, HDFS-7855-004.patch, HDFS-7855-005.patch, > HDFS-7855-006.patch, HDFS-7855-007.patch > > > Class Packet is an inner class in DFSOutputStream and also used by > DataStreamer. This sub task separates Packet out of DFSOutputStream to aid > the separation in HDFS-7854. -- This message was sent by Atlassian JIRA (v6.3.4#6332)