[jira] [Commented] (HDFS-5776) Support 'hedged' reads in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885092#comment-13885092 ]

stack commented on HDFS-5776:
-

bq. what do you think ?

That looks good to me [~xieliang007]

bq. ...making the pool size readonly, i can reupload a new patch.

We can add back the flexibility in a later issue -- i.e. being able to adjust the pool size on the fly. I suggest posting a patch where the pool size is read from the configuration and is read-only post construction. It would address an above reviewer's concern and, I believe, all outstanding concerns. Base your revision on v10 if you don't mind.

> Support 'hedged' reads in DFSClient
> ---
>
> Key: HDFS-5776
> URL: https://issues.apache.org/jira/browse/HDFS-5776
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Affects Versions: 3.0.0
> Reporter: Liang Xie
> Assignee: Liang Xie
> Attachments: HDFS-5776-v10.txt, HDFS-5776-v2.txt, HDFS-5776-v3.txt, HDFS-5776-v4.txt, HDFS-5776-v5.txt, HDFS-5776-v6.txt, HDFS-5776-v7.txt, HDFS-5776-v8.txt, HDFS-5776-v9.txt, HDFS-5776.txt
>
> This is a placeholder for the HDFS-related backport from https://issues.apache.org/jira/browse/HBASE-7509
> The quorum read ability should be especially helpful for optimizing read outliers. We can use "dfs.dfsclient.quorum.read.threshold.millis" & "dfs.dfsclient.quorum.read.threadpool.size" to enable/disable the hedged read ability from the client side (e.g. HBase), and by using DFSQuorumReadMetrics, we can export the metric values of interest into the client system (e.g. HBase's regionserver metrics).
> The core logic is in the pread code path: we decide whether to go to the original fetchBlockByteRange or the newly introduced fetchBlockByteRangeSpeculative per the above config items.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
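The dispatch described in the issue summary (the original fetchBlockByteRange versus the speculative variant, gated on the thread-pool config) can be sketched roughly as follows. This is an illustrative sketch only; the class, field, and method names here are stand-ins, not the actual DFSClient code from the patch.

```java
// Illustrative sketch of the hedged-read dispatch; real DFSClient wiring differs.
public class HedgedPreadSketch {
    // Stand-ins for dfs.dfsclient.quorum.read.threadpool.size and
    // dfs.dfsclient.quorum.read.threshold.millis. The threshold is how long
    // the first read may run before a hedged request would be launched
    // (not exercised in this sketch).
    private final int threadPoolSize;
    private final long thresholdMillis;

    HedgedPreadSketch(int threadPoolSize, long thresholdMillis) {
        this.threadPoolSize = threadPoolSize;
        this.thresholdMillis = thresholdMillis;
    }

    boolean isHedgedReadsEnabled() {
        // Hedging is on only when a positive pool size was configured.
        return threadPoolSize > 0;
    }

    String chooseFetchPath() {
        // Mirrors the description: take the hedged (speculative) path when
        // enabled, otherwise the original pread path.
        return isHedgedReadsEnabled()
            ? "fetchBlockByteRangeSpeculative"
            : "fetchBlockByteRange";
    }

    public static void main(String[] args) {
        System.out.println(new HedgedPreadSketch(4, 50).chooseFetchPath());
        System.out.println(new HedgedPreadSketch(0, 50).chooseFetchPath());
    }
}
```

Setting the pool size to zero disables hedging entirely, which matches the proposal above to gate the feature on a positive configured pool size.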
[jira] [Commented] (HDFS-5776) Support 'hedged' reads in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885091#comment-13885091 ]

Liang Xie commented on HDFS-5776:
-

bq. I think a better way is to add this check in chooseDataNode: if chooseDataNode finds that this is for seeking the second DN (if ignored is not null), and it could not immediately/easily find a DN, the chooseDataNode should skip retrying and we may want to fall back to the normal read.

Yeah, that sounds reasonable; I will look into it later once I get a chance.

P.S. I am taking an 8+ day holiday (China Spring Festival) and probably cannot reply or post patches promptly, sorry. Happy holidays to everyone, and thanks for looking at this JIRA!

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (HDFS-5776) Support 'hedged' reads in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885083#comment-13885083 ]

Liang Xie commented on HDFS-5776:
-

bq. we do not check threadpool in enableHedgedReads. This makes it possible that isHedgedReadsEnabled() returns true while hedged read is actually not enabled.

I can change it to something like the following if you guys want:
{code}
return allowHedgedReads && (HEDGED_READ_THREAD_POOL != null)
    && HEDGED_READ_THREAD_POOL.getMaximumPoolSize() > 0;
{code}
What do you think?

bq. DFSClient#setThreadsNumForHedgedReads allows users to keep changing the size of the thread pool.

We definitely need the ability to modify the pool size on the fly, especially for HBase ops.

bq. Read the thread pool size configuration only when initializing the thread pool, and the size should be >0 and cannot be changed

Here is the same disagreement: if you all still insist on making the pool size read-only, I can upload a new patch. From my previous operational experience, though, a read-only pool size is quite inconvenient for a system ops/admin.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (HDFS-5845) SecondaryNameNode dies when checkpointing with cache pools
[ https://issues.apache.org/jira/browse/HDFS-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885045#comment-13885045 ] Hadoop QA commented on HDFS-5845: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12625772/hdfs-5845-1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5975//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5975//console This message is automatically generated. > SecondaryNameNode dies when checkpointing with cache pools > -- > > Key: HDFS-5845 > URL: https://issues.apache.org/jira/browse/HDFS-5845 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.3.0 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Blocker > Labels: caching > Attachments: hdfs-5845-1.patch > > > The SecondaryNameNode clears and reloads its FSNamesystem when doing > checkpointing. However, FSNamesystem#clear does not clear CacheManager state > during this reload. 
This leads to an error like the following: > {noformat} > org.apache.hadoop.fs.InvalidRequestException: Cache pool pool1 already exists. > {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5709) Improve upgrade with existing files and directories named ".snapshot"
[ https://issues.apache.org/jira/browse/HDFS-5709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885036#comment-13885036 ]

Suresh Srinivas commented on HDFS-5709:
---

[~andrew.wang], that looks good. The likelihood of collision in the case of .snapshot.LV.UPGRADE_RENAMED is probably very low.

When the namenode fails to upgrade due to a reserved-name collision, it should print out the full list of reserved names in the file system, along with an error telling the user to run -upgrade with the -renameReserved flag. That way users know to pass all the reserved names and their corresponding preferred names, if they choose to use key/value pairs.

> Improve upgrade with existing files and directories named ".snapshot"
> ---
>
> Key: HDFS-5709
> URL: https://issues.apache.org/jira/browse/HDFS-5709
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Affects Versions: 3.0.0, 2.2.0
> Reporter: Andrew Wang
> Assignee: Andrew Wang
> Labels: snapshots, upgrade
> Attachments: hdfs-5709-1.patch, hdfs-5709-2.patch, hdfs-5709-3.patch, hdfs-5709-4.patch, hdfs-5709-5.patch
>
> Right now in trunk, upgrade fails messily if the old fsimage or edits refer to a directory named ".snapshot". We should at least print a better error message (which I believe was the original intention in HDFS-4666), and [~atm] proposed automatically renaming these files and directories.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (HDFS-5844) Fix broken link in WebHDFS.apt.vm
[ https://issues.apache.org/jira/browse/HDFS-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885032#comment-13885032 ] Akira AJISAKA commented on HDFS-5844: - Thank you for committing, [~arpitagarwal]! > Fix broken link in WebHDFS.apt.vm > - > > Key: HDFS-5844 > URL: https://issues.apache.org/jira/browse/HDFS-5844 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 2.2.0 >Reporter: Akira AJISAKA >Assignee: Akira AJISAKA >Priority: Minor > Labels: newbie > Fix For: 3.0.0, 2.3.0 > > Attachments: HDFS-5844.patch > > > There is one broken link in WebHDFS.apt.vm. > {code} > {{{RemoteException JSON Schema}}} > {code} > should be > {code} > {{RemoteException JSON Schema}} > {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5746) add ShortCircuitSharedMemorySegment
[ https://issues.apache.org/jira/browse/HDFS-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885027#comment-13885027 ] Hadoop QA commented on HDFS-5746: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12625767/HDFS-5746.004.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1546 javac compiler warnings (more than the trunk's current 1541 warnings). {color:red}-1 javadoc{color}. The javadoc tool appears to have generated -14 warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5972//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5972//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5972//artifact/trunk/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5972//console This message is automatically generated. 
> add ShortCircuitSharedMemorySegment > --- > > Key: HDFS-5746 > URL: https://issues.apache.org/jira/browse/HDFS-5746 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, hdfs-client >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 3.0.0 > > Attachments: HDFS-5746.001.patch, HDFS-5746.002.patch, > HDFS-5746.003.patch, HDFS-5746.004.patch > > > Add ShortCircuitSharedMemorySegment, which will be used to communicate > information between the datanode and the client about whether a replica is > mlocked. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5844) Fix broken link in WebHDFS.apt.vm
[ https://issues.apache.org/jira/browse/HDFS-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885013#comment-13885013 ]

Hudson commented on HDFS-5844:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5057 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5057/])
HDFS-5844. Fix broken link in WebHDFS.apt.vm (Contributed by Akira Ajisaka) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1562357)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/WebHDFS.apt.vm

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Updated] (HDFS-5844) Fix broken link in WebHDFS.apt.vm
[ https://issues.apache.org/jira/browse/HDFS-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal updated HDFS-5844:
-

Resolution: Fixed
Fix Version/s: 2.3.0
               3.0.0
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

+1 for the patch. Generated site and verified it fixes the link. I committed this to trunk, branch-2 and branch-2.3. Thanks for the contribution [~ajisakaa].

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (HDFS-5153) Datanode should send block reports for each storage in a separate message
[ https://issues.apache.org/jira/browse/HDFS-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885006#comment-13885006 ] Hadoop QA commented on HDFS-5153: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12625769/HDFS-5153.05.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5971//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5971//console This message is automatically generated. 
> Datanode should send block reports for each storage in a separate message > - > > Key: HDFS-5153 > URL: https://issues.apache.org/jira/browse/HDFS-5153 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 3.0.0 >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: HDFS-5153.01.patch, HDFS-5153.03.patch, > HDFS-5153.03b.patch, HDFS-5153.04.patch, HDFS-5153.05.patch > > > When the number of blocks on the DataNode grows large we start running into a > few issues: > # Block reports take a long time to process on the NameNode. In testing we > have seen that a block report with 6 Million blocks takes close to one second > to process on the NameNode. The NameSystem write lock is held during this > time. > # We start hitting the default protobuf message limit of 64MB somewhere > around 10 Million blocks. While we can increase the message size limit it > already takes over 7 seconds to serialize/unserialize a block report of this > size. > HDFS-2832 has introduced the concept of a DataNode as a collection of > storages i.e. the NameNode is aware of all the volumes (storage directories) > attached to a given DataNode. This makes it easy to split block reports from > the DN by sending one report per storage directory to mitigate the above > problems. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
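The split proposed above (one block report message per storage directory rather than a single report for the whole DataNode) amounts to grouping the DataNode's blocks by storage before serializing. A minimal sketch under assumed, illustrative names (these are not the actual DataNode/protobuf APIs):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: group blocks by the storage they live on, so each
// storage's block list can be sent as its own (smaller) report message.
public class PerStorageReports {
    // Input: blockId -> storageId. Output: storageId -> blocks, one entry
    // per storage, i.e. one report message per storage directory.
    static Map<String, List<Long>> splitByStorage(Map<Long, String> blockToStorage) {
        Map<String, List<Long>> reports = new TreeMap<>();
        for (Map.Entry<Long, String> e : blockToStorage.entrySet()) {
            reports.computeIfAbsent(e.getValue(), k -> new ArrayList<>())
                   .add(e.getKey());
        }
        return reports;
    }

    public static void main(String[] args) {
        Map<Long, String> blocks = new HashMap<>();
        blocks.put(101L, "DS-1");
        blocks.put(102L, "DS-2");
        blocks.put(103L, "DS-1");
        // Two storages -> two separate report messages.
        System.out.println(splitByStorage(blocks).size());
    }
}
```

Keeping each message bounded by a single storage's block count is what mitigates both the long NameNode lock hold and the 64MB protobuf message limit described above.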
[jira] [Commented] (HDFS-5709) Improve upgrade with existing files and directories named ".snapshot"
[ https://issues.apache.org/jira/browse/HDFS-5709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884987#comment-13884987 ]

Andrew Wang commented on HDFS-5709:
---

I had a quick call with Suresh to hash this out, and we arrived at the following which should be suitable for everyone:
* Rather than a configuration option which can stick around forever, an additional command line flag (e.g. "-upgrade -renameReserved") is better. This way we worry about it once, and there are no lingering effects.
* We default to renaming reserved paths to a convention like {{.snapshot.LV.UPGRADE_RENAMED}}, but also allow users to pass key/value pairs on the command line, e.g. "-upgrade -renameReserved .snapshot=.user-snapshot". In either case, we should do our best to detect collisions, but it's hard with the edit log.
* It'd be good to do this for "/.reserved" too, which will help demonstrate that this is a generic solution.

I think this is an accurate summary, so I'll start revving the patch as per above. Please comment if something is still off.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
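The key/value form discussed above ("-renameReserved .snapshot=.user-snapshot") could be parsed roughly as below. This is a hypothetical sketch of the flag's argument handling; the class and method names are illustrative and the actual patch may parse the option differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: parse "-renameReserved" key/value pairs of the form
// "oldName=newName[,oldName2=newName2,...]" into a rename map.
public class RenameReservedSketch {
    static Map<String, String> parseRenames(String arg) {
        Map<String, String> renames = new HashMap<>();
        for (String pair : arg.split(",")) {
            String[] kv = pair.split("=", 2);
            if (kv.length != 2 || kv[0].isEmpty() || kv[1].isEmpty()) {
                throw new IllegalArgumentException("Bad rename pair: " + pair);
            }
            renames.put(kv[0], kv[1]);
        }
        return renames;
    }

    public static void main(String[] args) {
        Map<String, String> r = parseRenames(".snapshot=.user-snapshot");
        System.out.println(r.get(".snapshot"));
    }
}
```

A default mapping such as ".snapshot" -> ".snapshot.LV.UPGRADE_RENAMED" would apply when the user passes no pairs, per the convention agreed above.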
[jira] [Commented] (HDFS-5318) Support read-only and read-write paths to shared replicas
[ https://issues.apache.org/jira/browse/HDFS-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884983#comment-13884983 ] Arpit Agarwal commented on HDFS-5318: - I am +1 on this approach. I think it's fine to document the requirement around reporting non-finalized replicas. Unless anyone else has objections I'll review the latest patch this week. > Support read-only and read-write paths to shared replicas > - > > Key: HDFS-5318 > URL: https://issues.apache.org/jira/browse/HDFS-5318 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.3.0 >Reporter: Eric Sirianni > Attachments: HDFS-5318.patch, HDFS-5318a-branch-2.patch, > HDFS-5318b-branch-2.patch, HDFS-5318c-branch-2.patch, hdfs-5318.pdf > > > There are several use cases for using shared-storage for datanode block > storage in an HDFS environment (storing cold blocks on a NAS device, Amazon > S3, etc.). > With shared-storage, there is a distinction between: > # a distinct physical copy of a block > # an access-path to that block via a datanode. > A single 'replication count' metric cannot accurately capture both aspects. > However, for most of the current uses of 'replication count' in the Namenode, > the "number of physical copies" aspect seems to be the appropriate semantic. > I propose altering the replication counting algorithm in the Namenode to > accurately infer distinct physical copies in a shared storage environment. > With HDFS-5115, a {{StorageID}} is a UUID. I propose associating some minor > additional semantics to the {{StorageID}} - namely that multiple datanodes > attaching to the same physical shared storage pool should report the same > {{StorageID}} for that pool. A minor modification would be required in the > DataNode to enable the generation of {{StorageID}} s to be pluggable behind > the {{FsDatasetSpi}} interface. 
> With those semantics in place, the number of physical copies of a block in a > shared storage environment can be calculated as the number of _distinct_ > {{StorageID}} s associated with that block. > Consider the following combinations for two {{(DataNode ID, Storage ID)}} > pairs {{(DN_A, S_A) (DN_B, S_B)}} for a given block B: > * {{DN_A != DN_B && S_A != S_B}} - *different* access paths to *different* > physical replicas (i.e. the traditional HDFS case with local disks) > ** → Block B has {{ReplicationCount == 2}} > * {{DN_A != DN_B && S_A == S_B}} - *different* access paths to the *same* > physical replica (e.g. HDFS datanodes mounting the same NAS share) > ** → Block B has {{ReplicationCount == 1}} > For example, if block B has the following location tuples: > * {{DN_1, STORAGE_A}} > * {{DN_2, STORAGE_A}} > * {{DN_3, STORAGE_B}} > * {{DN_4, STORAGE_B}}, > the effect of this proposed change would be to calculate the replication > factor in the namenode as *2* instead of *4*. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
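The counting rule proposed in the description (physical replica count = number of distinct StorageIDs among a block's (DataNode ID, Storage ID) pairs) can be sketched directly; the tuples below mirror the example in the description, and the class name is illustrative.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the proposed rule: count distinct StorageIDs, not locations.
public class DistinctStorageCount {
    // Each location is a {datanodeId, storageId} pair.
    static int physicalReplicas(String[][] locations) {
        Set<String> storageIds = new HashSet<>();
        for (String[] loc : locations) {
            storageIds.add(loc[1]);  // only the StorageID matters
        }
        return storageIds.size();
    }

    public static void main(String[] args) {
        // The four location tuples from the example in the description.
        String[][] blockB = {
            {"DN_1", "STORAGE_A"}, {"DN_2", "STORAGE_A"},
            {"DN_3", "STORAGE_B"}, {"DN_4", "STORAGE_B"},
        };
        // Two distinct StorageIDs -> replication factor 2, not 4.
        System.out.println(physicalReplicas(blockB));
    }
}
```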
[jira] [Commented] (HDFS-3828) Block Scanner rescans blocks too frequently
[ https://issues.apache.org/jira/browse/HDFS-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884967#comment-13884967 ] Hadoop QA commented on HDFS-3828: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12543965/hdfs-3828-3.txt against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5974//console This message is automatically generated. > Block Scanner rescans blocks too frequently > --- > > Key: HDFS-3828 > URL: https://issues.apache.org/jira/browse/HDFS-3828 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.23.0 >Reporter: Andy Isaacson >Assignee: Andy Isaacson > Fix For: 2.3.0 > > Attachments: hdfs-3828-1.txt, hdfs-3828-2.txt, hdfs-3828-3.txt, > hdfs3828.txt > > > {{BlockPoolSliceScanner#scan}} calls cleanUp every time it's invoked from > {{DataBlockScanner#run}} via {{scanBlockPoolSlice}}. But cleanUp > unconditionally roll()s the verificationLogs, so after two iterations we have > lost the first iteration of block verification times. 
> As a result a cluster with just one block repeatedly rescans it every 10 seconds:
> {noformat}
> 2012-08-16 15:59:57,884 INFO datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for BP-2101131164-172.29.122.91-1337906886255:blk_7919273167187535506_4915
> 2012-08-16 16:00:07,904 INFO datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for BP-2101131164-172.29.122.91-1337906886255:blk_7919273167187535506_4915
> 2012-08-16 16:00:17,925 INFO datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for BP-2101131164-172.29.122.91-1337906886255:blk_7919273167187535506_4915
> {noformat}
> To fix this, we need to avoid roll()ing the logs multiple times per period.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
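The fix described in the issue (avoid roll()ing the verification logs more than once per scan period) boils down to guarding the roll with the time of the last one. A minimal sketch with hypothetical names; the real BlockPoolSliceScanner change is more involved:

```java
// Sketch of the stated fix: roll the verification log at most once per scan
// period by remembering when it was last rolled, instead of rolling
// unconditionally from cleanUp() on every scan invocation.
public class RollGuard {
    private final long periodMillis;
    private long lastRollMillis;
    private boolean rolledOnce = false;
    int rollCount = 0;  // stand-in for verificationLog.roll() side effects

    RollGuard(long periodMillis) {
        this.periodMillis = periodMillis;
    }

    void maybeRoll(long nowMillis) {
        // Skip the roll if a full period has not elapsed since the last one,
        // so earlier verification times are not rotated away prematurely.
        if (!rolledOnce || nowMillis - lastRollMillis >= periodMillis) {
            rollCount++;
            lastRollMillis = nowMillis;
            rolledOnce = true;
        }
    }

    public static void main(String[] args) {
        RollGuard g = new RollGuard(600_000);  // e.g. a 10-minute period
        g.maybeRoll(0);        // first scan: rolls
        g.maybeRoll(10_000);   // 10s later: skipped, still within the period
        g.maybeRoll(600_000);  // full period elapsed: rolls again
        System.out.println(g.rollCount);
    }
}
```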
[jira] [Commented] (HDFS-3907) Allow multiple users for local block readers
[ https://issues.apache.org/jira/browse/HDFS-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884962#comment-13884962 ] Hadoop QA commented on HDFS-3907: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12544410/hdfs-3907.txt against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5973//console This message is automatically generated. > Allow multiple users for local block readers > > > Key: HDFS-3907 > URL: https://issues.apache.org/jira/browse/HDFS-3907 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Eli Collins >Assignee: Eli Collins > Fix For: 2.3.0 > > Attachments: hdfs-3907.txt > > > The {{dfs.block.local-path-access.user}} config added in HDFS-2246 only > supports a single user, however as long as blocks are group readable by more > than one user the feature could be used by multiple users, to support this we > just need to allow both to be configured. In practice this allows us to also > support HBase where the client (RS) runs as the hbase system user and the DN > runs as hdfs system user. I think this should work secure as well since we're > not using impersonation in the HBase case. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5845) SecondaryNameNode dies when checkpointing with cache pools
[ https://issues.apache.org/jira/browse/HDFS-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated HDFS-5845:
--

Labels: caching (was: )
Status: Patch Available (was: Open)

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Updated] (HDFS-5845) SecondaryNameNode dies when checkpointing with cache pools
[ https://issues.apache.org/jira/browse/HDFS-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated HDFS-5845:
--

Attachment: hdfs-5845-1.patch

Patch attached. This was pretty simple, but requires taking the FSN writelock on the 2NN since we have a bunch of write lock asserts in CacheManager. I think this is okay since we already do this in the SbNN, but someone should weigh in if this isn't okay. {{diff -w}} helps with reviewing the test change, since I needed to indent a test by one.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (HDFS-5318) Support read-only and read-write paths to shared replicas
[ https://issues.apache.org/jira/browse/HDFS-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884935#comment-13884935 ]

Hadoop QA commented on HDFS-5318:
-

{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12624805/HDFS-5318c-branch-2.patch against trunk revision .

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5970//console

This message is automatically generated.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Updated] (HDFS-5845) SecondaryNameNode dies when checkpointing with cache pools
[ https://issues.apache.org/jira/browse/HDFS-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-5845: -- Priority: Blocker (was: Major) > SecondaryNameNode dies when checkpointing with cache pools > -- > > Key: HDFS-5845 > URL: https://issues.apache.org/jira/browse/HDFS-5845 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.3.0 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Blocker > > The SecondaryNameNode clears and reloads its FSNamesystem when doing > checkpointing. However, FSNamesystem#clear does not clear CacheManager state > during this reload. This leads to an error like the following: > {noformat} > org.apache.hadoop.fs.InvalidRequestException: Cache pool pool1 already exists. > {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5845) SecondaryNameNode dies when checkpointing with cache pools
[ https://issues.apache.org/jira/browse/HDFS-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884927#comment-13884927 ] Suresh Srinivas commented on HDFS-5845: --- [~andrew.wang], I am marking this as blocker for 2.3.0. > SecondaryNameNode dies when checkpointing with cache pools > -- > > Key: HDFS-5845 > URL: https://issues.apache.org/jira/browse/HDFS-5845 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.3.0 >Reporter: Andrew Wang >Assignee: Andrew Wang > > The SecondaryNameNode clears and reloads its FSNamesystem when doing > checkpointing. However, FSNamesystem#clear does not clear CacheManager state > during this reload. This leads to an error like the following: > {noformat} > org.apache.hadoop.fs.InvalidRequestException: Cache pool pool1 already exists. > {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5153) Datanode should send block reports for each storage in a separate message
[ https://issues.apache.org/jira/browse/HDFS-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-5153: Attachment: HDFS-5153.05.patch Rebase patch. > Datanode should send block reports for each storage in a separate message > - > > Key: HDFS-5153 > URL: https://issues.apache.org/jira/browse/HDFS-5153 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 3.0.0 >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: HDFS-5153.01.patch, HDFS-5153.03.patch, > HDFS-5153.03b.patch, HDFS-5153.04.patch, HDFS-5153.05.patch > > > When the number of blocks on the DataNode grows large we start running into a > few issues: > # Block reports take a long time to process on the NameNode. In testing we > have seen that a block report with 6 Million blocks takes close to one second > to process on the NameNode. The NameSystem write lock is held during this > time. > # We start hitting the default protobuf message limit of 64MB somewhere > around 10 Million blocks. While we can increase the message size limit it > already takes over 7 seconds to serialize/unserialize a block report of this > size. > HDFS-2832 has introduced the concept of a DataNode as a collection of > storages i.e. the NameNode is aware of all the volumes (storage directories) > attached to a given DataNode. This makes it easy to split block reports from > the DN by sending one report per storage directory to mitigate the above > problems. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
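The split described in this issue can be sketched as follows. All names here are hypothetical stand-ins, not the actual DataNode/NameNode RPC types: the idea is simply one report message per storage directory instead of one report covering every volume.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of per-storage block reports: each storage directory's
// blocks go into their own message, so each RPC stays well under the protobuf
// size limit and the NameNode lock is held for a shorter time per message.
public class PerStorageBlockReports {

  // blocksByStorage: storageId -> IDs of blocks stored on that volume.
  static List<List<Long>> splitReports(Map<String, List<Long>> blocksByStorage) {
    List<List<Long>> reports = new ArrayList<>();
    for (Map.Entry<String, List<Long>> e : blocksByStorage.entrySet()) {
      reports.add(new ArrayList<>(e.getValue())); // one message per storage
    }
    return reports;
  }

  public static void main(String[] args) {
    Map<String, List<Long>> byStorage = new HashMap<>();
    byStorage.put("DS-1", List.of(1L, 2L, 3L));
    byStorage.put("DS-2", List.of(4L, 5L));
    // Two storages -> two separate report messages instead of one of size 5.
    System.out.println(splitReports(byStorage).size()); // 2
  }
}
```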
[jira] [Created] (HDFS-5845) SecondaryNameNode dies when checkpointing with cache pools
Andrew Wang created HDFS-5845: - Summary: SecondaryNameNode dies when checkpointing with cache pools Key: HDFS-5845 URL: https://issues.apache.org/jira/browse/HDFS-5845 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Andrew Wang Assignee: Andrew Wang The SecondaryNameNode clears and reloads its FSNamesystem when doing checkpointing. However, FSNamesystem#clear does not clear CacheManager state during this reload. This leads to an error like the following: {noformat} org.apache.hadoop.fs.InvalidRequestException: Cache pool pool1 already exists. {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
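A minimal model of the failure mode, using stand-in classes rather than the real {{FSNamesystem}}/{{CacheManager}}: if {{clear}} skips the cache-pool map, the next checkpoint reload trips the "already exists" check.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the reported bug (not the actual Hadoop classes):
// clearing the namesystem must also reset the cache manager's pool map,
// otherwise reloading the image re-adds pools that are still present.
public class CheckpointReloadSketch {

  static final class CacheManager {
    private final Map<String, String> cachePools = new HashMap<>();

    void addCachePool(String name) {
      if (cachePools.containsKey(name)) {
        throw new IllegalStateException("Cache pool " + name + " already exists.");
      }
      cachePools.put(name, "info");
    }

    void clear() {
      cachePools.clear(); // the state the reported FSNamesystem#clear omitted
    }
  }

  static final class Namesystem {
    final CacheManager cacheManager = new CacheManager();

    void clear() {
      cacheManager.clear(); // without this call, the second loadImage() throws
    }

    void loadImage() {
      cacheManager.addCachePool("pool1"); // pool definitions come from the image
    }
  }

  public static void main(String[] args) {
    Namesystem fsn = new Namesystem();
    fsn.loadImage(); // first checkpoint load
    fsn.clear();     // SecondaryNameNode clears before re-loading
    fsn.loadImage(); // second checkpoint load now succeeds
    System.out.println("reload ok");
  }
}
```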
[jira] [Commented] (HDFS-5746) add ShortCircuitSharedMemorySegment
[ https://issues.apache.org/jira/browse/HDFS-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884919#comment-13884919 ] Andrew Wang commented on HDFS-5746: --- bq. It doesn't infinitely loop, because sendCallback always removes the fd from toRemove. I missed this, good point. The verification I wanted was a second look at the code, no need for a test. bq. I like the current terminology. "lockable" just sounds vague. Okay, I'm alright with "anchorable" for the flag. Can we change the name of the refcount field and methods though? "anchor" and "unanchor" do not sound like incremental operations to me, and the field being named "anchor" does not evoke a count. bq. Yeah, I was wondering since I didn't see a field and accessor for the slot index. I assume that'll be added at some point though. > add ShortCircuitSharedMemorySegment > --- > > Key: HDFS-5746 > URL: https://issues.apache.org/jira/browse/HDFS-5746 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, hdfs-client >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 3.0.0 > > Attachments: HDFS-5746.001.patch, HDFS-5746.002.patch, > HDFS-5746.003.patch, HDFS-5746.004.patch > > > Add ShortCircuitSharedMemorySegment, which will be used to communicate > information between the datanode and the client about whether a replica is > mlocked. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (HDFS-5153) Datanode should send block reports for each storage in a separate message
[ https://issues.apache.org/jira/browse/HDFS-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal reassigned HDFS-5153: --- Assignee: Arpit Agarwal > Datanode should send block reports for each storage in a separate message > - > > Key: HDFS-5153 > URL: https://issues.apache.org/jira/browse/HDFS-5153 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 3.0.0 >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: HDFS-5153.01.patch, HDFS-5153.03.patch, > HDFS-5153.03b.patch, HDFS-5153.04.patch > > > When the number of blocks on the DataNode grows large we start running into a > few issues: > # Block reports take a long time to process on the NameNode. In testing we > have seen that a block report with 6 Million blocks takes close to one second > to process on the NameNode. The NameSystem write lock is held during this > time. > # We start hitting the default protobuf message limit of 64MB somewhere > around 10 Million blocks. While we can increase the message size limit it > already takes over 7 seconds to serialize/unserialize a block report of this > size. > HDFS-2832 has introduced the concept of a DataNode as a collection of > storages i.e. the NameNode is aware of all the volumes (storage directories) > attached to a given DataNode. This makes it easy to split block reports from > the DN by sending one report per storage directory to mitigate the above > problems. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5746) add ShortCircuitSharedMemorySegment
[ https://issues.apache.org/jira/browse/HDFS-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884914#comment-13884914 ] Andrew Wang commented on HDFS-5746: --- Few more test-related comments: * Tests for {{DSW#remove}} would be good, even when the race is fixed properly. If I'm right about the inf loop, a test would have caught it. * TestSCSMS, testStartupShutdown seems like a strict subset of testAllocateSlots functionality. * How about some prodding of a closed SCSMS too? Would also be good to test a couple of the other {{free()}} paths of SCSMS, since it can happen at close of the last slot, the SCSMS, and in allocateNextSlot too. > add ShortCircuitSharedMemorySegment > --- > > Key: HDFS-5746 > URL: https://issues.apache.org/jira/browse/HDFS-5746 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, hdfs-client >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 3.0.0 > > Attachments: HDFS-5746.001.patch, HDFS-5746.002.patch, > HDFS-5746.003.patch, HDFS-5746.004.patch > > > Add ShortCircuitSharedMemorySegment, which will be used to communicate > information between the datanode and the client about whether a replica is > mlocked. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5746) add ShortCircuitSharedMemorySegment
[ https://issues.apache.org/jira/browse/HDFS-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-5746: --- Attachment: HDFS-5746.004.patch > add ShortCircuitSharedMemorySegment > --- > > Key: HDFS-5746 > URL: https://issues.apache.org/jira/browse/HDFS-5746 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, hdfs-client >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 3.0.0 > > Attachments: HDFS-5746.001.patch, HDFS-5746.002.patch, > HDFS-5746.003.patch, HDFS-5746.004.patch > > > Add ShortCircuitSharedMemorySegment, which will be used to communicate > information between the datanode and the client about whether a replica is > mlocked. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5746) add ShortCircuitSharedMemorySegment
[ https://issues.apache.org/jira/browse/HDFS-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884911#comment-13884911 ] Colin Patrick McCabe commented on HDFS-5746: bq. Can we verify the fix for racing sendCallback and toRemove? I think we need to check that the fd being removed is in entries before doing sendCallback. firstEntry also doesn't remove the entry from toRemove, so it looks like this inf loops. pollFirstEntry instead? It doesn't infinitely loop, because sendCallback always removes the fd from toRemove. I can't think of any practical way to test the scenario you outlined, with an event happening on {{sendCallback}} racing with the same fd added to {{toRemove}} . Maybe a stress test would hit it. bq. Maybe remove() should also return a boolean "success" value too, rather than just swallowing an unknown socket. It's not needed because if we try to remove something that doesn't exist, we hit a precondition check. bq. Should doc that we only support one Handler per fd, it overwrites on add. added this comment bq. Can add a Precondition check to make sure the lock is held in checkNotClosed added bq. Flag constants would be more readable as "1<<63" and "1<<62" rather than 15 zeroes (I did verify though ) ok bq. Comment in Slot constructor talks about incrementing a refcount, but that's no longer happening there. No need to throw IOException in Slot constructor fixed bq. Terminology: it seems like the "anchorable" flag means "is mlocked by DN and can increment the refcount" and "anchor" is a refcount for "using mlocked data" I like the current terminology. "lockable" just sounds vague-- especially because we already have an operation which is (m)locking the block on the datanode, so it gets confusing to use the same term for what the client is doing. bq. How do we communicate the slot index between the DN and client? I see we keep the slot address, but what we need to pass to the client is an index. 
Maybe this is coming. The DN will have to pass the slot index as part of the response to the REQUEST_SHORT_CIRCUIT_FDS operation. It will also pass the shared memory segment itself as part of that operation :) Actually, it's a bit more complex than that... if there is an outstanding shm segment, the DN will try to reuse it-- otherwise it will create a new one. But since all the slots are the same size and interchangeable, the allocation is not that complex. > add ShortCircuitSharedMemorySegment > --- > > Key: HDFS-5746 > URL: https://issues.apache.org/jira/browse/HDFS-5746 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, hdfs-client >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 3.0.0 > > Attachments: HDFS-5746.001.patch, HDFS-5746.002.patch, > HDFS-5746.003.patch > > > Add ShortCircuitSharedMemorySegment, which will be used to communicate > information between the datanode and the client about whether a replica is > mlocked. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (HDFS-4284) BlockReaderLocal not notified of failed disks
[ https://issues.apache.org/jira/browse/HDFS-4284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang reassigned HDFS-4284: - Assignee: Jimmy Xiang > BlockReaderLocal not notified of failed disks > - > > Key: HDFS-4284 > URL: https://issues.apache.org/jira/browse/HDFS-4284 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 3.0.0, 2.0.2-alpha >Reporter: Andy Isaacson >Assignee: Jimmy Xiang > > When a DN marks a disk as bad, it stops using replicas on that disk. > However a long-running {{BlockReaderLocal}} instance will continue to access > replicas on the failing disk. > Somehow we should let the in-client BlockReaderLocal know that a disk has > been marked as bad so that it can stop reading from the bad disk. > From HDFS-4239: > bq. To rephrase that, a long running BlockReaderLocal will ride over local DN > restarts and disk "ejections". We had to drain the RS of all its regions in > order to stop it from using the bad disk. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-4239) Means of telling the datanode to stop using a sick disk
[ https://issues.apache.org/jira/browse/HDFS-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884897#comment-13884897 ] Jimmy Xiang commented on HDFS-4239: --- Cool. I agree. Attached v2 that released all references to the volume marked down. In my test, I don't see any open file descriptor pointing to the volume marked down. > Means of telling the datanode to stop using a sick disk > --- > > Key: HDFS-4239 > URL: https://issues.apache.org/jira/browse/HDFS-4239 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: stack >Assignee: Jimmy Xiang > Attachments: hdfs-4239.patch, hdfs-4239_v2.patch > > > If a disk has been deemed 'sick' -- i.e. not dead but wounded, failing > occasionally, or just exhibiting high latency -- your choices are: > 1. Decommission the total datanode. If the datanode is carrying 6 or 12 > disks of data, especially on a cluster that is smallish -- 5 to 20 nodes -- > the rereplication of the downed datanode's data can be pretty disruptive, > especially if the cluster is doing low latency serving: e.g. hosting an hbase > cluster. > 2. Stop the datanode, unmount the bad disk, and restart the datanode (You > can't unmount the disk while it is in use). This latter is better in that > only the bad disk's data is rereplicated, not all datanode data. > Is it possible to do better, say, send the datanode a signal to tell it stop > using a disk an operator has designated 'bad'. This would be like option #2 > above minus the need to stop and restart the datanode. Ideally the disk > would become unmountable after a while. > Nice to have would be being able to tell the datanode to restart using a disk > after its been replaced. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-4239) Means of telling the datanode to stop using a sick disk
[ https://issues.apache.org/jira/browse/HDFS-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HDFS-4239: -- Attachment: hdfs-4239_v2.patch > Means of telling the datanode to stop using a sick disk > --- > > Key: HDFS-4239 > URL: https://issues.apache.org/jira/browse/HDFS-4239 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: stack >Assignee: Jimmy Xiang > Attachments: hdfs-4239.patch, hdfs-4239_v2.patch > > > If a disk has been deemed 'sick' -- i.e. not dead but wounded, failing > occasionally, or just exhibiting high latency -- your choices are: > 1. Decommission the total datanode. If the datanode is carrying 6 or 12 > disks of data, especially on a cluster that is smallish -- 5 to 20 nodes -- > the rereplication of the downed datanode's data can be pretty disruptive, > especially if the cluster is doing low latency serving: e.g. hosting an hbase > cluster. > 2. Stop the datanode, unmount the bad disk, and restart the datanode (You > can't unmount the disk while it is in use). This latter is better in that > only the bad disk's data is rereplicated, not all datanode data. > Is it possible to do better, say, send the datanode a signal to tell it stop > using a disk an operator has designated 'bad'. This would be like option #2 > above minus the need to stop and restart the datanode. Ideally the disk > would become unmountable after a while. > Nice to have would be being able to tell the datanode to restart using a disk > after its been replaced. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-4239) Means of telling the datanode to stop using a sick disk
[ https://issues.apache.org/jira/browse/HDFS-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HDFS-4239: -- Status: Patch Available (was: Open) > Means of telling the datanode to stop using a sick disk > --- > > Key: HDFS-4239 > URL: https://issues.apache.org/jira/browse/HDFS-4239 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: stack >Assignee: Jimmy Xiang > Attachments: hdfs-4239.patch, hdfs-4239_v2.patch > > > If a disk has been deemed 'sick' -- i.e. not dead but wounded, failing > occasionally, or just exhibiting high latency -- your choices are: > 1. Decommission the total datanode. If the datanode is carrying 6 or 12 > disks of data, especially on a cluster that is smallish -- 5 to 20 nodes -- > the rereplication of the downed datanode's data can be pretty disruptive, > especially if the cluster is doing low latency serving: e.g. hosting an hbase > cluster. > 2. Stop the datanode, unmount the bad disk, and restart the datanode (You > can't unmount the disk while it is in use). This latter is better in that > only the bad disk's data is rereplicated, not all datanode data. > Is it possible to do better, say, send the datanode a signal to tell it stop > using a disk an operator has designated 'bad'. This would be like option #2 > above minus the need to stop and restart the datanode. Ideally the disk > would become unmountable after a while. > Nice to have would be being able to tell the datanode to restart using a disk > after its been replaced. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5804) HDFS NFS Gateway fails to mount and proxy when using Kerberos
[ https://issues.apache.org/jira/browse/HDFS-5804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884892#comment-13884892 ] Hadoop QA commented on HDFS-5804: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12625685/HDFS-5804.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5966//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5966//console This message is automatically generated. 
> HDFS NFS Gateway fails to mount and proxy when using Kerberos > - > > Key: HDFS-5804 > URL: https://issues.apache.org/jira/browse/HDFS-5804 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: nfs >Affects Versions: 3.0.0, 2.2.0 >Reporter: Abin Shahab > Attachments: HDFS-5804.patch, HDFS-5804.patch, HDFS-5804.patch, > HDFS-5804.patch, HDFS-5804.patch, HDFS-5804.patch, HDFS-5804.patch, > exception-as-root.log, javadoc-after-patch.log, javadoc-before-patch.log > > > When using HDFS nfs gateway with secure hadoop > (hadoop.security.authentication: kerberos), mounting hdfs fails. > Additionally, there is no mechanism to support proxy user(nfs needs to proxy > as the user invoking commands on the hdfs mount). > Steps to reproduce: > 1) start a hadoop cluster with kerberos enabled. > 2) sudo su -l nfsserver and start an nfs server. This 'nfsserver' account has > a an account in kerberos. > 3) Get the keytab for nfsserver, and issue the following mount command: mount > -t nfs -o vers=3,proto=tcp,nolock $server:/ $mount_point > 4) You'll see in the nfsserver logs that Kerberos is complaining about not > having a TGT for root. 
> This is the stacktrace: > java.io.IOException: Failed on local exception: java.io.IOException: > org.apache.hadoop.security.AccessControlException: Client cannot authenticate > via:[TOKEN, KERBEROS]; Host Details : local host is: > "my-nfs-server-host.com/10.252.4.197"; destination host is: > "my-namenode-host.com":8020; > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764) > at org.apache.hadoop.ipc.Client.call(Client.java:1351) > at org.apache.hadoop.ipc.Client.call(Client.java:1300) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) > at com.sun.proxy.$Proxy9.getFileLinkInfo(Unknown Source) > at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) > at com.sun.proxy.$Proxy9.getFileLinkInfo(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileLinkInfo(ClientNamenodeProtocolTranslatorPB.java:664) > at org.apache.hadoop.hdfs.DFSClient.getFileLinkInfo(DFSClient.java:1713) > at > org.apache.hadoop.hdfs.nfs.nfs3.Nfs3Utils.getFileStatus(Nfs3Utils.java:58) > at > org.apache.hadoop.hdfs.nfs.nfs3.Nfs3Utils.getFileAttr(Nfs3Utils.java:79) > at > org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.fsinfo(RpcProgramNfs3.java:1643) > at > org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.handleInternal(RpcProgramNfs3.java:1891) > at > org.apache.hadoop.oncrpc.RpcProgram.messageReceived(RpcProgram.java:143) > at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) > at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) > at > org.jboss.ne
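For reference, Hadoop's standard proxy-user settings are the usual mechanism for letting a gateway principal impersonate the end users invoking commands on the mount. The snippet below is a sketch only: "nfsserver" matches the account in the reproduction steps above, and the wildcard values are for illustration and should be restricted in real deployments.

```xml
<!-- core-site.xml: allow the nfsserver principal to proxy other users. -->
<property>
  <name>hadoop.proxyuser.nfsserver.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.nfsserver.hosts</name>
  <value>*</value>
</property>
```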
[jira] [Commented] (HDFS-5746) add ShortCircuitSharedMemorySegment
[ https://issues.apache.org/jira/browse/HDFS-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884886#comment-13884886 ] Andrew Wang commented on HDFS-5746: --- Thanks Colin, more comments: * For the {{notificationSockets}} javadoc, I basically just wanted the explanation you gave: it's a socketpair, where the loop listens on 1, clients kick the loop by writing on 0. * Can we verify the fix for racing {{sendCallback}} and {{toRemove}}? I think we need to check that the fd being removed is in {{entries}} before doing {{sendCallback}}. {{firstEntry}} also doesn't remove the entry from {{toRemove}}, so it looks like this inf loops. {{pollFirstEntry}} instead? * Maybe {{remove()}} should also return a boolean "success" value too, rather than just swallowing an unknown socket. Were these comments addressed? {quote} * Should doc that we only support one Handler per fd, it overwrites on add. * Can add a Precondition check to make sure the lock is held in checkNotClosed {quote} ShortCircuitSharedMemorySegment: * Flag constants would be more readable as "1<<63" and "1<<62" rather than 15 zeroes (I did verify though :)) * Comment in Slot constructor talks about incrementing a refcount, but that's no longer happening there. * No need to throw IOException in Slot constructor. * Terminology: it seems like the "anchorable" flag means "is mlocked by DN and can increment the refcount" and "anchor" is a refcount for "using mlocked data". Renaming things would make this clearer, e.g. "lockable" for the flag, and then "lockcount" for the count. IMO, incrementing an anchor is not a great physical analogy :) * How do we communicate the slot index between the DN and client? I see we keep the slot address, but what we need to pass to the client is an index. Maybe this is coming. 
> add ShortCircuitSharedMemorySegment > --- > > Key: HDFS-5746 > URL: https://issues.apache.org/jira/browse/HDFS-5746 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, hdfs-client >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 3.0.0 > > Attachments: HDFS-5746.001.patch, HDFS-5746.002.patch, > HDFS-5746.003.patch > > > Add ShortCircuitSharedMemorySegment, which will be used to communicate > information between the datanode and the client about whether a replica is > mlocked. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
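The readability point about the flag constants can be illustrated with a small sketch. The names and bit layout below are hypothetical, not the committed patch; note that in Java the literal needs an {{L}} suffix so the shift happens at 64 bits (a plain int {{1 << 63}} wraps before widening).

```java
// Sketch of a slot's 64-bit header word: two flag bits at the top, with the
// low bits carrying the anchor (ref)count. Layout is illustrative only.
public class SlotFlagsSketch {
  // Shifted forms are easier to audit than 16-hex-digit literals.
  static final long VALID_FLAG = 1L << 63;
  static final long ANCHORABLE_FLAG = 1L << 62;
  static final long ANCHOR_COUNT_MASK = ANCHORABLE_FLAG - 1; // low 62 bits

  static long anchorCount(long headerWord) {
    return headerWord & ANCHOR_COUNT_MASK;
  }

  static boolean isAnchorable(long headerWord) {
    return (headerWord & ANCHORABLE_FLAG) != 0;
  }

  public static void main(String[] args) {
    long word = VALID_FLAG | ANCHORABLE_FLAG | 5; // valid, anchorable, 5 anchors
    System.out.println(isAnchorable(word) + " " + anchorCount(word)); // true 5
  }
}
```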
[jira] [Updated] (HDFS-5746) add ShortCircuitSharedMemorySegment
[ https://issues.apache.org/jira/browse/HDFS-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-5746: --- Attachment: HDFS-5746.003.patch Fix javadoc warnings. javac warnings are about the use of {{sun.misc.Unsafe}}, and are unavoidable. Findbugs warning should be fixed (hopefully) by making {{DomainSocketWatcher}} a final class. > add ShortCircuitSharedMemorySegment > --- > > Key: HDFS-5746 > URL: https://issues.apache.org/jira/browse/HDFS-5746 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, hdfs-client >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 3.0.0 > > Attachments: HDFS-5746.001.patch, HDFS-5746.002.patch, > HDFS-5746.003.patch > > > Add ShortCircuitSharedMemorySegment, which will be used to communicate > information between the datanode and the client about whether a replica is > mlocked. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5804) HDFS NFS Gateway fails to mount and proxy when using Kerberos
[ https://issues.apache.org/jira/browse/HDFS-5804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884827#comment-13884827 ] Daryn Sharp commented on HDFS-5804: --- Looks good! Just fix the javadoc and audit warnings. > HDFS NFS Gateway fails to mount and proxy when using Kerberos > - > > Key: HDFS-5804 > URL: https://issues.apache.org/jira/browse/HDFS-5804 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: nfs >Affects Versions: 3.0.0, 2.2.0 >Reporter: Abin Shahab > Attachments: HDFS-5804.patch, HDFS-5804.patch, HDFS-5804.patch, > HDFS-5804.patch, HDFS-5804.patch, HDFS-5804.patch, HDFS-5804.patch, > exception-as-root.log, javadoc-after-patch.log, javadoc-before-patch.log > > > When using HDFS nfs gateway with secure hadoop > (hadoop.security.authentication: kerberos), mounting hdfs fails. > Additionally, there is no mechanism to support proxy user(nfs needs to proxy > as the user invoking commands on the hdfs mount). > Steps to reproduce: > 1) start a hadoop cluster with kerberos enabled. > 2) sudo su -l nfsserver and start an nfs server. This 'nfsserver' account has > a an account in kerberos. > 3) Get the keytab for nfsserver, and issue the following mount command: mount > -t nfs -o vers=3,proto=tcp,nolock $server:/ $mount_point > 4) You'll see in the nfsserver logs that Kerberos is complaining about not > having a TGT for root. 
> This is the stacktrace: > java.io.IOException: Failed on local exception: java.io.IOException: > org.apache.hadoop.security.AccessControlException: Client cannot authenticate > via:[TOKEN, KERBEROS]; Host Details : local host is: > "my-nfs-server-host.com/10.252.4.197"; destination host is: > "my-namenode-host.com":8020; > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764) > at org.apache.hadoop.ipc.Client.call(Client.java:1351) > at org.apache.hadoop.ipc.Client.call(Client.java:1300) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) > at com.sun.proxy.$Proxy9.getFileLinkInfo(Unknown Source) > at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) > at com.sun.proxy.$Proxy9.getFileLinkInfo(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileLinkInfo(ClientNamenodeProtocolTranslatorPB.java:664) > at org.apache.hadoop.hdfs.DFSClient.getFileLinkInfo(DFSClient.java:1713) > at > org.apache.hadoop.hdfs.nfs.nfs3.Nfs3Utils.getFileStatus(Nfs3Utils.java:58) > at > org.apache.hadoop.hdfs.nfs.nfs3.Nfs3Utils.getFileAttr(Nfs3Utils.java:79) > at > org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.fsinfo(RpcProgramNfs3.java:1643) > at > org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.handleInternal(RpcProgramNfs3.java:1891) > at > org.apache.hadoop.oncrpc.RpcProgram.messageReceived(RpcProgram.java:143) > at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) > at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) > at > 
org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:281) > at > org.apache.hadoop.oncrpc.RpcUtil$RpcMessageParserStage.messageReceived(RpcUtil.java:132) > at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) > at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) > at > org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) > at > org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462) > at > org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443) > at > org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) > at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) > at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(
[jira] [Commented] (HDFS-5771) Track progress when loading fsimage
[ https://issues.apache.org/jira/browse/HDFS-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884810#comment-13884810 ] Chris Nauroth commented on HDFS-5771: - Hi, Haohui. A couple of notes: # I see there are multiple sections that do {{beginStep}}/{{endStep}} for {{StepType#INODES}}. Considering the way the {{StartupProgress}} class works, the effect of this will be that progress jumps to 100% complete the first time {{endStep}} gets called. After that, the subsequent calls to {{beginStep}}/{{endStep}} are no-ops. Are all of the various inode sections serialized sequentially in the new format? If so, then would it be possible to do the {{beginStep}} call for {{StepType#INODES}} before the first inode section, and then do the {{endStep}} after the last inode section? # There is a similar situation with {{saveInodes}} and {{saveSnapshots}} trying to begin/end the same step. > Track progress when loading fsimage > --- > > Key: HDFS-5771 > URL: https://issues.apache.org/jira/browse/HDFS-5771 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-5698 (FSImage in protobuf) >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-5771.000.patch, HDFS-5771.001.patch > > > The old code that loads the fsimage tracks the progress during loading. This > jira proposes to implement the same functionality in the new code which > serializes the fsimage using protobuf. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
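The begin-once/end-once ordering Chris suggests can be sketched with a toy model (hypothetical code that only loosely mirrors {{StartupProgress}} semantics, not the real API): the first {{endStep}} marks the step complete, so later begin/end pairs for the same step are no-ops, and begin should therefore run once before the first inode section and end once after the last.

```java
// Toy model of the no-op behavior described above. ToyStartupProgress is
// illustrative only and is not the real org.apache.hadoop StartupProgress.
import java.util.HashSet;
import java.util.Set;

class ToyStartupProgress {
  private final Set<String> active = new HashSet<>();
  private final Set<String> ended = new HashSet<>();

  /** Returns false (no-op) if the step has already ended. */
  boolean beginStep(String step) {
    if (ended.contains(step)) {
      return false;
    }
    return active.add(step);
  }

  /** Returns false (no-op) unless the step is currently active. */
  boolean endStep(String step) {
    if (!active.remove(step)) {
      return false;
    }
    return ended.add(step);
  }
}
```

With per-section begin/end pairs, only the first pair has any effect, which is exactly why the suggested fix hoists the pair around all inode sections.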
[jira] [Commented] (HDFS-5746) add ShortCircuitSharedMemorySegment
[ https://issues.apache.org/jira/browse/HDFS-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884795#comment-13884795 ] Hadoop QA commented on HDFS-5746: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12625650/HDFS-5746.002.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1550 javac compiler warnings (more than the trunk's current 1545 warnings). {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5962//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5962//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5962//artifact/trunk/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5962//console This message is automatically generated. 
> add ShortCircuitSharedMemorySegment > --- > > Key: HDFS-5746 > URL: https://issues.apache.org/jira/browse/HDFS-5746 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, hdfs-client >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 3.0.0 > > Attachments: HDFS-5746.001.patch, HDFS-5746.002.patch > > > Add ShortCircuitSharedMemorySegment, which will be used to communicate > information between the datanode and the client about whether a replica is > mlocked. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5804) HDFS NFS Gateway fails to mount and proxy when using Kerberos
[ https://issues.apache.org/jira/browse/HDFS-5804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abin Shahab updated HDFS-5804: -- Attachment: HDFS-5804.patch > HDFS NFS Gateway fails to mount and proxy when using Kerberos > - > > Key: HDFS-5804 > URL: https://issues.apache.org/jira/browse/HDFS-5804 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: nfs >Affects Versions: 3.0.0, 2.2.0 >Reporter: Abin Shahab > Attachments: HDFS-5804.patch, HDFS-5804.patch, HDFS-5804.patch, > HDFS-5804.patch, HDFS-5804.patch, HDFS-5804.patch, HDFS-5804.patch, > exception-as-root.log, javadoc-after-patch.log, javadoc-before-patch.log > > > When using HDFS nfs gateway with secure hadoop > (hadoop.security.authentication: kerberos), mounting hdfs fails. > Additionally, there is no mechanism to support proxy user (nfs needs to proxy > as the user invoking commands on the hdfs mount). > Steps to reproduce: > 1) start a hadoop cluster with kerberos enabled. > 2) sudo su -l nfsserver and start an nfs server. This 'nfsserver' account has > an account in Kerberos. > 3) Get the keytab for nfsserver, and issue the following mount command: mount > -t nfs -o vers=3,proto=tcp,nolock $server:/ $mount_point > 4) You'll see in the nfsserver logs that Kerberos is complaining about not > having a TGT for root. 
> This is the stacktrace: > java.io.IOException: Failed on local exception: java.io.IOException: > org.apache.hadoop.security.AccessControlException: Client cannot authenticate > via:[TOKEN, KERBEROS]; Host Details : local host is: > "my-nfs-server-host.com/10.252.4.197"; destination host is: > "my-namenode-host.com":8020; > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764) > at org.apache.hadoop.ipc.Client.call(Client.java:1351) > at org.apache.hadoop.ipc.Client.call(Client.java:1300) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) > at com.sun.proxy.$Proxy9.getFileLinkInfo(Unknown Source) > at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) > at com.sun.proxy.$Proxy9.getFileLinkInfo(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileLinkInfo(ClientNamenodeProtocolTranslatorPB.java:664) > at org.apache.hadoop.hdfs.DFSClient.getFileLinkInfo(DFSClient.java:1713) > at > org.apache.hadoop.hdfs.nfs.nfs3.Nfs3Utils.getFileStatus(Nfs3Utils.java:58) > at > org.apache.hadoop.hdfs.nfs.nfs3.Nfs3Utils.getFileAttr(Nfs3Utils.java:79) > at > org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.fsinfo(RpcProgramNfs3.java:1643) > at > org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.handleInternal(RpcProgramNfs3.java:1891) > at > org.apache.hadoop.oncrpc.RpcProgram.messageReceived(RpcProgram.java:143) > at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) > at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) > at > 
org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:281) > at > org.apache.hadoop.oncrpc.RpcUtil$RpcMessageParserStage.messageReceived(RpcUtil.java:132) > at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) > at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) > at > org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) > at > org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462) > at > org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443) > at > org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) > at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) > at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) > at > org.jboss.netty.channel.DefaultC
[jira] [Assigned] (HDFS-5780) TestRBWBlockInvalidation times out intermittently on branch-2
[ https://issues.apache.org/jira/browse/HDFS-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai reassigned HDFS-5780: --- Assignee: Mit Desai > TestRBWBlockInvalidation times out intermittently on branch-2 > > > Key: HDFS-5780 > URL: https://issues.apache.org/jira/browse/HDFS-5780 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Mit Desai >Assignee: Mit Desai > > I recently found out that the test > TestRBWBlockInvalidation#testBlockInvalidationWhenRBWReplicaMissedInDN times > out intermittently. > I am using Fedora, JDK7 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-4564) Webhdfs returns incorrect http response codes for denied operations
[ https://issues.apache.org/jira/browse/HDFS-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-4564: -- Attachment: HDFS-4564.patch HDFS-4564.branch-23.patch > Webhdfs returns incorrect http response codes for denied operations > --- > > Key: HDFS-4564 > URL: https://issues.apache.org/jira/browse/HDFS-4564 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: webhdfs >Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Blocker > Attachments: HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, > HDFS-4564.patch > > > Webhdfs is returning 401 (Unauthorized) instead of 403 (Forbidden) when it's > denying operations. Examples including rejecting invalid proxy user attempts > and renew/cancel with an invalid user. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5841) Update HDFS caching documentation with new changes
[ https://issues.apache.org/jira/browse/HDFS-5841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-5841: -- Attachment: hdfs-5841-3.patch Rebase, surprised this has gone stale already. > Update HDFS caching documentation with new changes > -- > > Key: HDFS-5841 > URL: https://issues.apache.org/jira/browse/HDFS-5841 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.4.0 >Reporter: Andrew Wang >Assignee: Andrew Wang > Labels: caching > Attachments: hdfs-5841-1.patch, hdfs-5841-2.patch, hdfs-5841-3.patch > > > The caching documentation is a little out of date, since it's missing > description of features like TTL and expiration. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5804) HDFS NFS Gateway fails to mount and proxy when using Kerberos
[ https://issues.apache.org/jira/browse/HDFS-5804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884715#comment-13884715 ] Hadoop QA commented on HDFS-5804: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12625663/HDFS-5804.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated -14 warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5963//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5963//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5963//console This message is automatically generated. 
[jira] [Commented] (HDFS-5776) Support 'hedged' reads in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884710#comment-13884710 ] Arpit Agarwal commented on HDFS-5776: - I've stated my concerns, but if there is broad consensus that we don't need caps, I won't hold up the check-in. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5841) Update HDFS caching documentation with new changes
[ https://issues.apache.org/jira/browse/HDFS-5841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884705#comment-13884705 ] Hadoop QA commented on HDFS-5841: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12625660/hdfs-5841-2.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5964//console This message is automatically generated. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5776) Support 'hedged' reads in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884694#comment-13884694 ] stack commented on HDFS-5776: - Thanks lads. We are almost there. [~xieliang007] It is better if we work through the issues here before the patch goes in, especially while you have the attention of quality reviewers. From your POV, I'm sure it is a little frustrating trying to drive the patch home between differing opinions (the time difference doesn't help either -- smile). Try to salve any annoyance with the thought that, though it may appear otherwise, folks here are trying to work together to help get the best patch in. Good on you Liang. [~xieliang007] I'd agree with the last few [~jingzhao] review comments. What do you think? [~arpitagarwal] Do you buy [~cmccabe]'s argument? It is good by me. If you agree, let's shift the focus to v10 and leave the v9 style behind. Good stuff -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5804) HDFS NFS Gateway fails to mount and proxy when using Kerberos
[ https://issues.apache.org/jira/browse/HDFS-5804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abin Shahab updated HDFS-5804: -- Attachment: HDFS-5804.patch Removed all the security checks.
[jira] [Updated] (HDFS-5841) Update HDFS caching documentation with new changes
[ https://issues.apache.org/jira/browse/HDFS-5841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-5841: -- Attachment: hdfs-5841-2.patch Thanks for the review Colin, patch attached. I also updated the help text in CacheAdmin to match your recommendation. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5776) Support 'hedged' reads in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884573#comment-13884573 ] Suresh Srinivas commented on HDFS-5776: --- bq. We do not check other configuration settings to see if they are "reasonable." [~cmccabe], I agree with the points you have made. Checking for a reasonable value for the new config does not seem necessary. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5841) Update HDFS caching documentation with new changes
[ https://issues.apache.org/jira/browse/HDFS-5841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884561#comment-13884561 ] Colin Patrick McCabe commented on HDFS-5841: {code} This can also be manually specified by "never". {code} This seems awkward. How about "'never' specifies that there is no limit." +1 once that's addressed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5776) Support 'hedged' reads in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884554#comment-13884554 ] Colin Patrick McCabe commented on HDFS-5776: [~arpitagarwal] : if I understand your comments correctly, you are concerned that hedged reads may spawn too many threads. But that's why {{dfs.client.hedged.read.threadpool.size}} exists. The {{DFSClient}} will not create more threads than this. We do not check other configuration settings to see if they are "reasonable." For example, if someone wants to set {{dfs.balancer.dispatcherThreads}}, {{dfs.balancer.moverThreads}}, or {{dfs.datanode.max.transfer.threads}} to a zillion, we don't complain. If we tried to set hard limits everywhere, people with different needs would have to recompile hadoop to meet those needs. Please remember that, if the client wants to, he/she can sit in a loop and call {{new Thread(...)}}. It's not like by giving users the ability to control the number of threads they use, we are opening up some new world of security vulnerabilities. The ability for the client to create any number of threads already exists. And it only inconveniences one person: the client themselves. [~sureshms]: I agree that we should figure out the configuration issues here rather than changing the configuration in an incompatible way later. Jing suggested adding "an Allow-Hedged-Reads configuration" boolean. That certainly seems to solve the problem of having different threads use different settings. Is there any objection, besides the inelegance of having two configs rather than one? 
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
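For readers following the thread, the hedged-read idea under discussion can be sketched outside of DFSClient with a plain bounded thread pool (the {{HedgedRead}} class, method names, and timings below are illustrative, not taken from the patch): the primary read gets a head start, and only if it misses the threshold does a second read race it, with the pool size capping the extra threads exactly as Colin describes.

```java
// Minimal hedged-read sketch: first result wins; the fixed-size pool plays
// the role of dfs.client.hedged.read.threadpool.size.
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class HedgedRead {
  private final ExecutorService pool;     // bounded, like the hedged-read pool
  private final long thresholdMillis;     // like the hedged-read threshold

  HedgedRead(int poolSize, long thresholdMillis) {
    this.pool = Executors.newFixedThreadPool(poolSize);
    this.thresholdMillis = thresholdMillis;
  }

  /** Runs primary; if it misses the threshold, races a backup read. */
  <T> T read(Supplier<T> primary, Supplier<T> backup) throws Exception {
    CompletionService<T> cs = new ExecutorCompletionService<>(pool);
    cs.submit(primary::get);
    Future<T> done = cs.poll(thresholdMillis, TimeUnit.MILLISECONDS);
    if (done != null) {
      return done.get();                  // primary finished in time
    }
    cs.submit(backup::get);               // hedge: race another replica
    return cs.take().get();               // whichever read finishes first wins
  }

  void shutdown() {
    pool.shutdownNow();
  }
}
```

In the common case only one read runs, so the extra cost appears only for outlier reads that exceed the threshold.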
[jira] [Commented] (HDFS-5698) Use protobuf to serialize / deserialize FSImage
[ https://issues.apache.org/jira/browse/HDFS-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884536#comment-13884536 ] Suresh Srinivas commented on HDFS-5698: --- bq. It'll be important to know if NN's with huge images will be unable to load their images w/o more heap allocation. All the objects created are short lived. Hence this should not affect NN heap allocation. However, it would be interesting to see the time spent in GC and the rate of garbage creation. > Use protobuf to serialize / deserialize FSImage > --- > > Key: HDFS-5698 > URL: https://issues.apache.org/jira/browse/HDFS-5698 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-5698.000.patch, HDFS-5698.001.patch > > > Currently, the code serializes the FSImage using in-house serialization > mechanisms. There are a couple of disadvantages to the current approach: > # Mixing the responsibility of reconstruction and serialization / > deserialization. The current code paths of serialization / deserialization > have spent a lot of effort on maintaining compatibility. What is worse is > that they are mixed with the complex logic of reconstructing the namespace, > making the code difficult to follow. > # Poor documentation of the current FSImage format. The format of the FSImage > is practically defined by the implementation. A bug in the implementation > means a bug in the specification. Furthermore, it also makes writing > third-party tools quite difficult. > # Changing schemas is non-trivial. Adding a field in the FSImage requires > bumping the layout version every time. Bumping the layout version requires (1) > the users to explicitly upgrade the clusters, and (2) putting in new code to > maintain backward compatibility. > This jira proposes to use protobuf to serialize the FSImage. Protobuf has > been used to serialize / deserialize RPC messages in Hadoop. > Protobuf addresses all the above problems. It clearly separates the > responsibility of serialization and reconstructing the namespace. The > protobuf files document the current format of the FSImage. Developers can > now add optional fields with ease, since the old code can always read the new > FSImage. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5746) add ShortCircuitSharedMemorySegment
[ https://issues.apache.org/jira/browse/HDFS-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-5746: --- Attachment: HDFS-5746.002.patch

bq. I didn't see anything named ShortCircuitSharedMemorySegment in the patch, should it be included?

It should be there...

bq. Javadoc for SharedFileDescriptorFactory constructor

added

bq. rand() isn't reentrant, potentially making it unsafe for createDescriptor0. Should we use rand_r instead, or slap a synchronized on it?

Apparently, on Linux rand is re-entrant because glibc puts a mutex around it. But you're right, we should be POSIX-compliant here. I added a mutex around rand. Using the reentrant versions would be awkward because of the need to pass around state somehow (probably a Java array).

bq. Also not sure why we concat two rand(). Seems like one should be enough with the collision detection code.

Fair enough.

bq. The open is done with mode 0777, wouldn't 0700 be safer? I thought we were passing these over a domain socket, so we can keep the permissions locked up.

Good point. We don't want random users to be able to open this file during the brief period it exists in the namespace.

bq. Paranoia, should we do a check in CloseableReferenceCount#reference for overflow to the closed bit? I know we have 30 bits, but who knows.

Well, this code was just moved from DomainSocket.java, not changed. The issue is that we want to use atomic addition, not compare-and-exchange, for speed. Given that, all we know is the state after the addition, not before. This is fairly performance-critical for UNIX domain sockets (it has to do this before every socket operation) so it has to be fast. The failure mode also seems fairly benign: the refcount overflows into the closed bit and causes the socket to appear closed. At some point we should evaluate a 64-bit counter. It might be just as fast on 64-bit machines.

bq. Unrelated nit: DomainSocket#write(byte[], int, int) boolean exec is indented wrong, mind fixing it?

ok

bq. \[DomainSocketWatcher\] javadoc is c+p from DomainSocket, I think it should be updated for DSW. Some high-level description of how the nested classes fit together would be nice.

added

bq. Some Java-isms. Runnable is preferred over Thread. It's also weird that DSW is a Thread subclass and it calls start on itself. An inner class implementing Runnable would be more idiomatic.

It's kind of annoying that using an inner Runnable class would increase the indentation of run(). Still, I suppose it does provide better isolation, making it impossible to invoke random Thread methods on the DomainSocketWatcher. So I will implement that.

bq. Explain use of loopSocks 0 versus loopSocks 1?

This is a crucial part of this class: we need to use a socketpair rather than a plain condition variable because of blocking on poll. It's arbitrary: both sockets are connected to one another and exactly alike. I chose to listen on 1 and write on 0, but I could easily have made the opposite choice.

bq. "loopSocks" is also not a very descriptive name, maybe "wakeupPair" or "eventPair" instead?

I changed it to {{notificationSockets}}. Can add a Precondition check to make sure the lock is held in checkNotClosed. If we fail to kick, add and remove could block until the poll timeout. Should doc that we only support one Handler per fd, it overwrites on add. Maybe Precondition this instead if we don't want to overwrite, I can't tell from context here.

bq. The repeated calls to sendCallback are worrisome. For instance, a sock could be EOF and closed, be removed by the first sendCallback, and then if there's a pending toRemove for the sock, the second sendCallback aborts on the Precondition check.

Good catch. Fixed.

bq. closeAll parameter in sendCallback is unused

removed

bq. This comment probably means to refer to loopSocks: // Close shutdownSocketPair\[0\], so that shutdownSocketPair\[1\] gets an EOF

ok

bq. This comment probably meant poll, not select: // were waiting in select().

ok

bq. Why are two of the @Test in TestDomainSocketWatcher commented out?

fixed

bq. Timeouts seem kind of long, these should be super fast tests right?

reduced. I didn't want to reduce too much to avoid flakiness.

> add ShortCircuitSharedMemorySegment > --- > > Key: HDFS-5746 > URL: https://issues.apache.org/jira/browse/HDFS-5746 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, hdfs-client >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 3.0.0 > > Attachments: HDFS-5746.001.patch, HDFS-5746.002.patch > > > Add ShortCircuitSharedMemorySegment, which will be used to communicate > information between the datanode and the client about whether a replica is > mlock
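The socketpair-as-wakeup idea discussed above — a thread blocked in a poll loop is woken by writing a byte to the paired descriptor — has a direct analogue in Java NIO. A minimal sketch of the pattern using a Pipe (illustrative only, not the DomainSocketWatcher code itself):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// A thread blocked waiting for I/O readiness is woken by writing one byte to
// the other end of an in-process pipe -- the same role the socketpair plays
// for the native poll() loop, where a plain condition variable cannot help.
public class WakeupSketch {
    public static int demo() throws IOException {
        Selector selector = Selector.open();
        Pipe notificationPipe = Pipe.open();          // stands in for the socketpair
        notificationPipe.source().configureBlocking(false);
        notificationPipe.source().register(selector, SelectionKey.OP_READ);

        // Another thread would normally do this write to kick the event loop:
        notificationPipe.sink().write(ByteBuffer.wrap(new byte[]{0}));

        // select() returns immediately because the notification byte is pending.
        return selector.select();
    }
}
```

As in the watcher, which end is read and which is written is arbitrary; what matters is that a write on one end makes the other end readable, unblocking the waiting thread.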
[jira] [Commented] (HDFS-5698) Use protobuf to serialize / deserialize FSImage
[ https://issues.apache.org/jira/browse/HDFS-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884531#comment-13884531 ] Daryn Sharp commented on HDFS-5698: --- You may want to investigate if the inodemap will perform better with a {{ConcurrentHashMap}} than a {{LightWeightGSet}}. That will increase the parallelism of the map insertion. I think the gset was chosen for memory concerns. Assuming you plan to parallelize the parent/child linkages, I think the {{addChild}} may need to be in a synchronized block unless the inodeMap is made concurrent. I'm not a snapshot expert, but I wonder how thread-safe the snapshot manager is. Are the directory diffs constructed "on the fly" during addition of the children, or are they stored separately in the fsimage? We just need to be certain it's actually feasible to offset a ~2X increase in load time. Also, did you happen to gather heap usage statistics? Is part of the load increase maybe due to increased GC? It'll be important to know if NN's with huge images will be unable to load their images w/o more heap allocation. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5698) Use protobuf to serialize / deserialize FSImage
[ https://issues.apache.org/jira/browse/HDFS-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884471#comment-13884471 ] Haohui Mai commented on HDFS-5698: -- Our profiling results show that parsing the bytes and constructing the protobuf objects take a significant amount of time. The work is parallelized as follows:
{code}
while (has data) {
  byte[] data = read();
  thread_pool.submit(parse_data(data));
}

parse_data_for_inode(data) {
  INode inode = construct(data);
  synchronized (inodemap) {
    inodemap.add(inode);
  }
  block_map_thread_pool.submit(update_block_map(data));
}

parse_data_for_inode_dir(data) {
  foreach (child : data.getChildren())
    inodemap.get(data.getParent()).addChild(inodemap.get(child));
}
{code}
Two things are worth noting. (1) The contention only happens when adding the inode into the inodemap. (2) Updating the block maps happens in parallel. Our profiling results show that updating the block maps can take up to 20% of the execution time. The latency can be hidden in the above implementation. I've only tested an early prototype on my laptop. With 4 threads it brings the load latency down to a level comparable to the old format. To report comparable numbers, however, I'll need to update the code and rerun the test on the machine that I ran my previous tests on. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
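The pseudocode above can be fleshed out with a standard thread pool and a lock-guarded map. A toy sketch under assumed names (not the actual FSImage loader — worker threads do the CPU-bound construction in parallel and contend only on the brief synchronized insertion):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model of the parallel load: each task plays the role of
// parse_data_for_inode -- construct the inode, then take the lock
// only for the map insertion, so contention stays small.
public class ParallelLoadSketch {
    private final Map<Long, String> inodeMap = new HashMap<>();

    public int load(List<long[]> records) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (long[] rec : records) {
            pool.submit(() -> {
                String inode = "inode-" + rec[0];  // "construct(data)": CPU-bound work
                synchronized (inodeMap) {          // contention limited to the insert
                    inodeMap.put(rec[0], inode);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return inodeMap.size();
    }
}
```

Replacing the synchronized HashMap with a ConcurrentHashMap, as suggested in a later comment, would trade the lock for finer-grained internal striping.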
[jira] [Updated] (HDFS-5828) BlockPlacementPolicyWithNodeGroup can place multiple replicas on the same node group when dfs.namenode.avoid.write.stale.datanode is true
[ https://issues.apache.org/jira/browse/HDFS-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Buddy updated HDFS-5828: Attachment: HDFS-5828.patch The problem was that BlockPlacementPolicyDefault.chooseTarget was manually adding the nodes in the results list to the excluded nodes list instead of using the addToExcludedNodes method. The addToExcludedNodes method is overridden by BlockPlacementPolicyWithNodeGroup to also exclude other nodes in the same node group. > BlockPlacementPolicyWithNodeGroup can place multiple replicas on the same > node group when dfs.namenode.avoid.write.stale.datanode is true > - > > Key: HDFS-5828 > URL: https://issues.apache.org/jira/browse/HDFS-5828 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.0 >Reporter: Buddy > Attachments: HDFS-5828.patch > > > When placing replicas using the replica placement policy > BlockPlacementPolicyWithNodeGroup, the number of targets returned should be > less than or equal to the number of node groups, and no node group should get > two replicas of the same block. The JUnit test > TestReplicationPolicyWithNodeGroup.testChooseMoreTargetsThanNodeGroups > verifies this. > However, if the conf property "dfs.namenode.avoid.write.stale.datanode" is > set to true, then the block placement policy will return more targets than node > groups when the number of replicas requested exceeds the number of node > groups. > This can be seen by putting: >CONF.setBoolean(DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_WRITE_KEY, true); > in the setup method for TestReplicationPolicyWithNodeGroup. This will cause > testChooseMoreTargetsThanNodeGroups to fail. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
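The fix relies on routing every exclusion through the overridable addToExcludedNodes hook rather than mutating the excluded set directly. A simplified illustration of that pattern with toy class and method names (not the real placement-policy API):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// The base policy excludes only the chosen node; the node-group policy
// overrides the hook to exclude every node in the same group. Bypassing
// the hook (the original bug) silently loses the subclass's exclusions.
class DefaultPolicySketch {
    void addToExcludedNodes(String chosen, Set<String> excluded) {
        excluded.add(chosen);
    }
}

class NodeGroupPolicySketch extends DefaultPolicySketch {
    final Map<String, List<String>> groupOf = new HashMap<>();

    @Override
    void addToExcludedNodes(String chosen, Set<String> excluded) {
        // Exclude the whole node group, not just the chosen node.
        excluded.addAll(groupOf.getOrDefault(chosen, List.of(chosen)));
    }
}
```

The design lesson is general: when a base class offers a hook precisely so subclasses can widen its behavior, open-coding the hook's default body reintroduces the narrow behavior everywhere the subclass is in play.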
[jira] [Commented] (HDFS-5804) HDFS NFS Gateway fails to mount and proxy when using Kerberos
[ https://issues.apache.org/jira/browse/HDFS-5804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884455#comment-13884455 ] Daryn Sharp commented on HDFS-5804: --- Are the other {{isSecurityEnabled}} checks still required? > HDFS NFS Gateway fails to mount and proxy when using Kerberos > - > > Key: HDFS-5804 > URL: https://issues.apache.org/jira/browse/HDFS-5804 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: nfs >Affects Versions: 3.0.0, 2.2.0 >Reporter: Abin Shahab > Attachments: HDFS-5804.patch, HDFS-5804.patch, HDFS-5804.patch, > HDFS-5804.patch, HDFS-5804.patch, exception-as-root.log, > javadoc-after-patch.log, javadoc-before-patch.log > > > When using HDFS nfs gateway with secure hadoop > (hadoop.security.authentication: kerberos), mounting hdfs fails. > Additionally, there is no mechanism to support a proxy user (nfs needs to proxy > as the user invoking commands on the hdfs mount). > Steps to reproduce: > 1) start a hadoop cluster with kerberos enabled. > 2) sudo su -l nfsserver and start an nfs server. This 'nfsserver' account has > an account in kerberos. > 3) Get the keytab for nfsserver, and issue the following mount command: mount > -t nfs -o vers=3,proto=tcp,nolock $server:/ $mount_point > 4) You'll see in the nfsserver logs that Kerberos is complaining about not > having a TGT for root. 
> This is the stacktrace: > java.io.IOException: Failed on local exception: java.io.IOException: > org.apache.hadoop.security.AccessControlException: Client cannot authenticate > via:[TOKEN, KERBEROS]; Host Details : local host is: > "my-nfs-server-host.com/10.252.4.197"; destination host is: > "my-namenode-host.com":8020; > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764) > at org.apache.hadoop.ipc.Client.call(Client.java:1351) > at org.apache.hadoop.ipc.Client.call(Client.java:1300) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) > at com.sun.proxy.$Proxy9.getFileLinkInfo(Unknown Source) > at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) > at com.sun.proxy.$Proxy9.getFileLinkInfo(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileLinkInfo(ClientNamenodeProtocolTranslatorPB.java:664) > at org.apache.hadoop.hdfs.DFSClient.getFileLinkInfo(DFSClient.java:1713) > at > org.apache.hadoop.hdfs.nfs.nfs3.Nfs3Utils.getFileStatus(Nfs3Utils.java:58) > at > org.apache.hadoop.hdfs.nfs.nfs3.Nfs3Utils.getFileAttr(Nfs3Utils.java:79) > at > org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.fsinfo(RpcProgramNfs3.java:1643) > at > org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.handleInternal(RpcProgramNfs3.java:1891) > at > org.apache.hadoop.oncrpc.RpcProgram.messageReceived(RpcProgram.java:143) > at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) > at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) > at > 
org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:281) > at > org.apache.hadoop.oncrpc.RpcUtil$RpcMessageParserStage.messageReceived(RpcUtil.java:132) > at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) > at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) > at > org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) > at > org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462) > at > org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443) > at > org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) > at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) > at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:5
[jira] [Commented] (HDFS-5776) Support 'hedged' reads in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884361#comment-13884361 ] Suresh Srinivas commented on HDFS-5776: --- bq. Could we create another JIRA to track those disagreements? I have said more than three times: the default pool size is 0, so there is no harm to any existing application by default. The fact that the issue is brought up many times means that there is an issue that needs to be discussed and resolved. bq. I guess it's possible cost one week, one month even one year to argue them... If it takes more time, so be it. There are many committers who have spent time reviewing and commenting. I understand this is an important feature and the need to get it done sooner. But the core issues must be solved in this jira instead of pushing them to another jira. > Support 'hedged' reads in DFSClient > --- > > Key: HDFS-5776 > URL: https://issues.apache.org/jira/browse/HDFS-5776 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 3.0.0 >Reporter: Liang Xie >Assignee: Liang Xie > Attachments: HDFS-5776-v10.txt, HDFS-5776-v2.txt, HDFS-5776-v3.txt, > HDFS-5776-v4.txt, HDFS-5776-v5.txt, HDFS-5776-v6.txt, HDFS-5776-v7.txt, > HDFS-5776-v8.txt, HDFS-5776-v9.txt, HDFS-5776.txt > > > This is a placeholder for hdfs-related stuff backported from > https://issues.apache.org/jira/browse/HBASE-7509. > The quorum read ability should be helpful, especially for optimizing read outliers. > We can utilize "dfs.dfsclient.quorum.read.threshold.millis" & > "dfs.dfsclient.quorum.read.threadpool.size" to enable/disable the hedged read > ability from the client side (e.g. HBase), and by using DFSQuorumReadMetrics, we > could export the metrics of interest into the client system (e.g. HBase's > regionserver metrics). > The core logic is in the pread code path: we decide whether to go to the original > fetchBlockByteRange or the newly introduced fetchBlockByteRangeSpeculative per > the above config items. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5776) Support 'hedged' reads in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884351#comment-13884351 ] Arpit Agarwal commented on HDFS-5776: - [~stack] I am basically +1 on the v9 patch at this point but v10 is a step back. We need a throttle on unbounded thread growth, and the threadpool size is the most trivial one to add. We can file a separate Jira to replace the thread pool limit with something more sophisticated, e.g. the client can keep a dynamic estimate of the 95th percentile latency and use that instead of a fixed value from configuration. Jing mentioned some issues that look fairly easy to address. {quote} In the old impl, the refetchToken/refetchEncryptionKey are shared by all nodes from chooseDataNode once a key/token exception happened. That means if the first node consumed this retry quota, then if the second or third node hit the key/token exception, the clearDataEncryptionKey/fetchBlockAt operations will not be called; it's a little unfair {quote} [~xieliang007] That makes sense, thanks for the clarification. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
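The hedged-read idea under discussion — fire a speculative second request if the first replica has not answered within a threshold — can be sketched with a CompletionService. This is an illustration only, with hypothetical names and structure, not the DFSClient implementation:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Start the primary read; if it has not completed within the threshold,
// submit a hedge against another replica and take whichever finishes first.
public class HedgedReadSketch {
    public static <T> T hedgedFetch(Callable<T> primary, Callable<T> hedge,
                                    long thresholdMillis, ExecutorService pool)
            throws Exception {
        CompletionService<T> cs = new ExecutorCompletionService<>(pool);
        cs.submit(primary);
        Future<T> first = cs.poll(thresholdMillis, TimeUnit.MILLISECONDS);
        if (first != null) {
            return first.get();   // primary was fast enough; no hedge needed
        }
        cs.submit(hedge);         // threshold exceeded: race a second replica
        return cs.take().get();   // first result wins; the slow read is ignored
    }
}
```

The pool-size throttle debated above corresponds to bounding the ExecutorService here: with a fixed pool, a burst of slow reads cannot spawn unbounded hedge threads.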
[jira] [Commented] (HDFS-5698) Use protobuf to serialize / deserialize FSImage
[ https://issues.apache.org/jira/browse/HDFS-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884307#comment-13884307 ] Kihwal Lee commented on HDFS-5698: -- Thanks for running tests and sharing the numbers. I did some testing in the past and the loading speed was about 30MB/sec at best. I/O wasn't the bottleneck. THP and CompressedOOPS help a bit, but in the end the bottleneck was Java object creation. Due to the way things are serialized, multi-threaded loading wasn't feasible. Now that we have the inode section and the inode directory section separated, parallelism can be added for loading each section. Please share your implementation ideas. The parallelism gains may come out far less than expected due to internal locks, so it will be great if a rough prototype & testing is done to show what's attainable. Do you already have numbers for how long it took to load each section? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5828) BlockPlacementPolicyWithNodeGroup can place multiple replicas on the same node group when dfs.namenode.avoid.write.stale.datanode is true
[ https://issues.apache.org/jira/browse/HDFS-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884251#comment-13884251 ] Buddy commented on HDFS-5828: - The reason that it was sometimes succeeding for me is that I was in the debugger and the node was sometimes going stale (30 seconds). If the node is not stale, then it always fails. Also note that logNodeIsNotChosen does not actually log anything; it just builds the message. The message is not logged in this case. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5828) BlockPlacementPolicyWithNodeGroup can place multiple replicas on the same node group when dfs.namenode.avoid.write.stale.datanode is true
[ https://issues.apache.org/jira/browse/HDFS-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884236#comment-13884236 ] Buddy commented on HDFS-5828: - The failure appears to be non-deterministic. In some cases the first chooseLocalStorage throws an exception and we get the message: 2014-01-28 10:12:25,981 WARN blockmanagement.BlockPlacementPolicy (BlockPlacementPolicyDefault.java:chooseTarget(309)) - Failed to place enough replicas, still in need of 10 to reach 10. For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy When that happens, the unit test succeeds. If chooseLocalStorage finds a local storage and does not throw an exception, then the above message is not logged and the unit test fails. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5825) Use FileUtils.copyFile() to implement DFSTestUtils.copyFile()
[ https://issues.apache.org/jira/browse/HDFS-5825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884132#comment-13884132 ] Hudson commented on HDFS-5825: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1656 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1656/]) HDFS-5825. Use FileUtils.copyFile() to implement DFSTestUtils.copyFile(). (Contributed by Haohui Mai) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561792) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java > Use FileUtils.copyFile() to implement DFSTestUtils.copyFile() > - > > Key: HDFS-5825 > URL: https://issues.apache.org/jira/browse/HDFS-5825 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Fix For: 2.3.0 > > Attachments: HDFS-5825.000.patch > > > {{DFSTestUtils.copyFile()}} is implemented by copying data through > FileInputStream / FileOutputStream. Apache Common IO provides > {{FileUtils.copyFile()}}. It uses FileChannel which is more efficient. > This jira proposes to implement {{DFSTestUtils.copyFile()}} using > {{FileUtils.copyFile()}}. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
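The FileChannel mechanism the jira credits for the speedup looks roughly like this — a standalone sketch of a channel-based copy, not the Commons IO source:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Channel-to-channel copy: transferTo can hand the work to the OS
// (e.g. sendfile on Linux) instead of shuttling bytes through
// user-space buffers the way stream-based copies do.
public class ChannelCopySketch {
    public static void copy(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long size = in.size();
            while (position < size) {   // transferTo may copy less than requested
                position += in.transferTo(position, size - position, out);
            }
        }
    }
}
```

The loop matters: transferTo is allowed to transfer fewer bytes than asked, so a single call is not a complete copy.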
[jira] [Commented] (HDFS-5830) WebHdfsFileSystem.getFileBlockLocations throws IllegalArgumentException when accessing another cluster.
[ https://issues.apache.org/jira/browse/HDFS-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884136#comment-13884136 ] Hudson commented on HDFS-5830: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1656 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1656/]) HDFS-5830. WebHdfsFileSystem.getFileBlockLocations throws IllegalArgumentException when accessing another cluster. (Yongjun Zhang via Colin Patrick McCabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561885) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LocatedBlock.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSUtil.java > WebHdfsFileSystem.getFileBlockLocations throws IllegalArgumentException when > accessing another cluster. > > > Key: HDFS-5830 > URL: https://issues.apache.org/jira/browse/HDFS-5830 > Project: Hadoop HDFS > Issue Type: Bug > Components: caching, hdfs-client >Affects Versions: 2.3.0 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang >Priority: Blocker > Fix For: 2.3.0 > > Attachments: HDFS-5830.001.patch > > > WebHdfsFileSystem.getFileBlockLocations throws IllegalArgumentException when > accessing another cluster (that doesn't have caching support). 
> java.lang.IllegalArgumentException: cachedLocs should not be null, use a > different constructor > at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) > at org.apache.hadoop.hdfs.protocol.LocatedBlock.&lt;init&gt;(LocatedBlock.java:79) > at org.apache.hadoop.hdfs.web.JsonUtil.toLocatedBlock(JsonUtil.java:414) > at org.apache.hadoop.hdfs.web.JsonUtil.toLocatedBlockList(JsonUtil.java:446) > at org.apache.hadoop.hdfs.web.JsonUtil.toLocatedBlocks(JsonUtil.java:479) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileBlockLocations(WebHdfsFileSystem.java:1067) > at org.apache.hadoop.fs.FileSystem$4.next(FileSystem.java:1812) > at org.apache.hadoop.fs.FileSystem$4.next(FileSystem.java:1797) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
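The stack trace above shows a Guava precondition rejecting a null {{cachedLocs}} in the {{LocatedBlock}} constructor when the JSON from a non-caching cluster omits that field. A minimal, standalone sketch of the null-tolerant behavior the fix calls for follows; the names mirror LocatedBlock for readability, but this is not the actual HDFS-5830 patch:

```java
public class LocatedBlockSketch {
    private static final String[] EMPTY_LOCS = {};
    private final String[] cachedLocs;

    // Before the fix, a precondition threw IllegalArgumentException when
    // cachedLocs was null. Treating null as "no cached replicas" lets a
    // client talk to an older cluster that never reports cached locations.
    public LocatedBlockSketch(String[] cachedLocs) {
        this.cachedLocs = (cachedLocs == null) ? EMPTY_LOCS : cachedLocs;
    }

    public String[] getCachedLocations() {
        return cachedLocs;
    }
}
```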
[jira] [Commented] (HDFS-5781) Use an array to record the mapping between FSEditLogOpCode and the corresponding byte value
[ https://issues.apache.org/jira/browse/HDFS-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884134#comment-13884134 ] Hudson commented on HDFS-5781: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1656 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1656/]) HDFS-5781. Use an array to record the mapping between FSEditLogOpCode and the corresponding byte value. Contributed by Jing Zhao. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561788) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOpCodes.java > Use an array to record the mapping between FSEditLogOpCode and the > corresponding byte value > --- > > Key: HDFS-5781 > URL: https://issues.apache.org/jira/browse/HDFS-5781 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.4.0 >Reporter: Jing Zhao >Assignee: Jing Zhao >Priority: Minor > Fix For: 2.4.0 > > Attachments: HDFS-5781.000.patch, HDFS-5781.001.patch, > HDFS-5781.002.patch, HDFS-5781.002.patch > > > HDFS-5674 uses Enum.values and enum.ordinal to identify an editlog op for a > given byte value. While improving the efficiency, it may cause issue. E.g., > when several new editlog ops are added to trunk around the same time (for > several different new features), it is hard to backport the editlog ops with > larger byte values to branch-2 before those with smaller values, since there > will be gaps in the byte values of the enum. > This jira plans to still use an array to record the mapping between editlog > ops and their byte values, and allow gap between valid ops. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
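The array-with-gaps scheme described in HDFS-5781 can be sketched as below. The enum values and byte codes are illustrative, not the real FSEditLogOpCodes; the point is that the lookup array tolerates unused byte values, so an op with a larger code can be backported before ops with smaller ones:

```java
public class OpCodeMapSketch {
    enum Op {
        ADD((byte) 0), RENAME((byte) 1), FUTURE_OP((byte) 5); // codes 2..4 left as a gap
        final byte code;
        Op(byte code) { this.code = code; }
    }

    // Indexed by unsigned byte value; gap entries simply stay null, unlike
    // Enum.values()/ordinal() which requires contiguous values.
    private static final Op[] BY_CODE = new Op[256];
    static {
        for (Op op : Op.values()) {
            BY_CODE[op.code & 0xFF] = op;
        }
    }

    /** Returns the op for a byte value, or null for an unknown/gap value. */
    static Op fromByte(byte b) {
        return BY_CODE[b & 0xFF];
    }
}
```

Lookup stays O(1) like the ordinal-based approach, while unknown byte values come back as null instead of causing an array-bounds or wrong-op error.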
[jira] [Commented] (HDFS-5833) SecondaryNameNode have an incorrect java doc
[ https://issues.apache.org/jira/browse/HDFS-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884135#comment-13884135 ] Hudson commented on HDFS-5833: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1656 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1656/]) HDFS-5833. Fix incorrect javadoc in SecondaryNameNode. (Contributed by Bangtao Zhou) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561938) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java > SecondaryNameNode have an incorrect java doc > > > Key: HDFS-5833 > URL: https://issues.apache.org/jira/browse/HDFS-5833 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.0.0 >Reporter: Bangtao Zhou >Priority: Trivial > Fix For: 3.0.0, 2.3.0 > > Attachments: HDFS-5833-1.patch > > > SecondaryNameNode have an incorrect java doc, actually the SecondaryNameNode > uses the *NamenodeProtocol* to talk to the primary NameNode, not the > *ClientProtocol* -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5297) Fix dead links in HDFS site documents
[ https://issues.apache.org/jira/browse/HDFS-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884130#comment-13884130 ] Hudson commented on HDFS-5297: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1656 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1656/]) HDFS-5297. Fix dead links in HDFS site documents. (Contributed by Akira Ajisaka) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561849) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/Federation.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithNFS.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithQJM.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsEditsViewer.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsImageViewer.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsPermissionsGuide.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsQuotaAdminGuide.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsUserGuide.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/Hftp.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/ShortCircuitLocalReads.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/WebHDFS.apt.vm > Fix dead links in HDFS site documents > - > > Key: HDFS-5297 > URL: https://issues.apache.org/jira/browse/HDFS-5297 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 2.2.0 >Reporter: Akira AJISAKA >Assignee: Akira AJISAKA > Fix For: 3.0.0, 2.3.0 > > Attachments: HDFS-5297.patch > > > I found a lot of broken hyperlinks in HDFS document to be fixed. > Ex.) 
> In HdfsUserGuide.apt.vm, there is a broken hyperlink, as below: > {noformat} >For command usage, see {{{dfsadmin}}}. > {noformat} > It should be fixed to > {noformat} >For command usage, see > {{{../hadoop-common/CommandsManual.html#dfsadmin}dfsadmin}}. > {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5830) WebHdfsFileSystem.getFileBlockLocations throws IllegalArgumentException when accessing another cluster.
[ https://issues.apache.org/jira/browse/HDFS-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884119#comment-13884119 ] Hudson commented on HDFS-5830: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1681 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1681/]) HDFS-5830. WebHdfsFileSystem.getFileBlockLocations throws IllegalArgumentException when accessing another cluster. (Yongjun Zhang via Colin Patrick McCabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561885) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LocatedBlock.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSUtil.java > WebHdfsFileSystem.getFileBlockLocations throws IllegalArgumentException when > accessing another cluster. > > > Key: HDFS-5830 > URL: https://issues.apache.org/jira/browse/HDFS-5830 > Project: Hadoop HDFS > Issue Type: Bug > Components: caching, hdfs-client >Affects Versions: 2.3.0 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang >Priority: Blocker > Fix For: 2.3.0 > > Attachments: HDFS-5830.001.patch > > > WebHdfsFileSystem.getFileBlockLocations throws IllegalArgumentException when > accessing another cluster (that doesn't have caching support). 
> java.lang.IllegalArgumentException: cachedLocs should not be null, use a > different constructor > at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) > at org.apache.hadoop.hdfs.protocol.LocatedBlock.&lt;init&gt;(LocatedBlock.java:79) > at org.apache.hadoop.hdfs.web.JsonUtil.toLocatedBlock(JsonUtil.java:414) > at org.apache.hadoop.hdfs.web.JsonUtil.toLocatedBlockList(JsonUtil.java:446) > at org.apache.hadoop.hdfs.web.JsonUtil.toLocatedBlocks(JsonUtil.java:479) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileBlockLocations(WebHdfsFileSystem.java:1067) > at org.apache.hadoop.fs.FileSystem$4.next(FileSystem.java:1812) > at org.apache.hadoop.fs.FileSystem$4.next(FileSystem.java:1797) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5781) Use an array to record the mapping between FSEditLogOpCode and the corresponding byte value
[ https://issues.apache.org/jira/browse/HDFS-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884117#comment-13884117 ] Hudson commented on HDFS-5781: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1681 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1681/]) HDFS-5781. Use an array to record the mapping between FSEditLogOpCode and the corresponding byte value. Contributed by Jing Zhao. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561788) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOpCodes.java > Use an array to record the mapping between FSEditLogOpCode and the > corresponding byte value > --- > > Key: HDFS-5781 > URL: https://issues.apache.org/jira/browse/HDFS-5781 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.4.0 >Reporter: Jing Zhao >Assignee: Jing Zhao >Priority: Minor > Fix For: 2.4.0 > > Attachments: HDFS-5781.000.patch, HDFS-5781.001.patch, > HDFS-5781.002.patch, HDFS-5781.002.patch > > > HDFS-5674 uses Enum.values and enum.ordinal to identify an editlog op for a > given byte value. While improving the efficiency, it may cause issue. E.g., > when several new editlog ops are added to trunk around the same time (for > several different new features), it is hard to backport the editlog ops with > larger byte values to branch-2 before those with smaller values, since there > will be gaps in the byte values of the enum. > This jira plans to still use an array to record the mapping between editlog > ops and their byte values, and allow gap between valid ops. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5297) Fix dead links in HDFS site documents
[ https://issues.apache.org/jira/browse/HDFS-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884113#comment-13884113 ] Hudson commented on HDFS-5297: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1681 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1681/]) HDFS-5297. Fix dead links in HDFS site documents. (Contributed by Akira Ajisaka) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561849) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/Federation.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithNFS.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithQJM.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsEditsViewer.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsImageViewer.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsPermissionsGuide.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsQuotaAdminGuide.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsUserGuide.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/Hftp.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/ShortCircuitLocalReads.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/WebHDFS.apt.vm > Fix dead links in HDFS site documents > - > > Key: HDFS-5297 > URL: https://issues.apache.org/jira/browse/HDFS-5297 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 2.2.0 >Reporter: Akira AJISAKA >Assignee: Akira AJISAKA > Fix For: 3.0.0, 2.3.0 > > Attachments: HDFS-5297.patch > > > I found a lot of broken hyperlinks in HDFS document to be fixed. > Ex.) 
> In HdfsUserGuide.apt.vm, there is a broken hyperlink, as below: > {noformat} >For command usage, see {{{dfsadmin}}}. > {noformat} > It should be fixed to > {noformat} >For command usage, see > {{{../hadoop-common/CommandsManual.html#dfsadmin}dfsadmin}}. > {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5825) Use FileUtils.copyFile() to implement DFSTestUtils.copyFile()
[ https://issues.apache.org/jira/browse/HDFS-5825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884115#comment-13884115 ] Hudson commented on HDFS-5825: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1681 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1681/]) HDFS-5825. Use FileUtils.copyFile() to implement DFSTestUtils.copyFile(). (Contributed by Haohui Mai) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561792) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java > Use FileUtils.copyFile() to implement DFSTestUtils.copyFile() > - > > Key: HDFS-5825 > URL: https://issues.apache.org/jira/browse/HDFS-5825 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Fix For: 2.3.0 > > Attachments: HDFS-5825.000.patch > > > {{DFSTestUtils.copyFile()}} is implemented by copying data through > FileInputStream / FileOutputStream. Apache Common IO provides > {{FileUtils.copyFile()}}. It uses FileChannel which is more efficient. > This jira proposes to implement {{DFSTestUtils.copyFile()}} using > {{FileUtils.copyFile()}}. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5833) SecondaryNameNode have an incorrect java doc
[ https://issues.apache.org/jira/browse/HDFS-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884118#comment-13884118 ] Hudson commented on HDFS-5833: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1681 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1681/]) HDFS-5833. Fix incorrect javadoc in SecondaryNameNode. (Contributed by Bangtao Zhou) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561938) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java > SecondaryNameNode have an incorrect java doc > > > Key: HDFS-5833 > URL: https://issues.apache.org/jira/browse/HDFS-5833 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.0.0 >Reporter: Bangtao Zhou >Priority: Trivial > Fix For: 3.0.0, 2.3.0 > > Attachments: HDFS-5833-1.patch > > > SecondaryNameNode have an incorrect java doc, actually the SecondaryNameNode > uses the *NamenodeProtocol* to talk to the primary NameNode, not the > *ClientProtocol* -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5730) Inconsistent Audit logging for HDFS APIs
[ https://issues.apache.org/jira/browse/HDFS-5730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884103#comment-13884103 ] Uma Maheswara Rao G commented on HDFS-5730: --- Thanks a lot, Colin, for taking a look. More reviews are welcome. > Inconsistent Audit logging for HDFS APIs > > > Key: HDFS-5730 > URL: https://issues.apache.org/jira/browse/HDFS-5730 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.0.0, 2.2.0 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G > Attachments: HDFS-5730.patch, HDFS-5730.patch > > > When looking at the audit logs in HDFS, I am seeing some inconsistencies > between what was logged with audit earlier and what has been added recently. > For more details please check the comments. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5830) WebHdfsFileSystem.getFileBlockLocations throws IllegalArgumentException when accessing another cluster.
[ https://issues.apache.org/jira/browse/HDFS-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884013#comment-13884013 ] Hudson commented on HDFS-5830: -- FAILURE: Integrated in Hadoop-Yarn-trunk #464 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/464/]) HDFS-5830. WebHdfsFileSystem.getFileBlockLocations throws IllegalArgumentException when accessing another cluster. (Yongjun Zhang via Colin Patrick McCabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561885) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LocatedBlock.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSUtil.java > WebHdfsFileSystem.getFileBlockLocations throws IllegalArgumentException when > accessing another cluster. > > > Key: HDFS-5830 > URL: https://issues.apache.org/jira/browse/HDFS-5830 > Project: Hadoop HDFS > Issue Type: Bug > Components: caching, hdfs-client >Affects Versions: 2.3.0 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang >Priority: Blocker > Fix For: 2.3.0 > > Attachments: HDFS-5830.001.patch > > > WebHdfsFileSystem.getFileBlockLocations throws IllegalArgumentException when > accessing another cluster (that doesn't have caching support). 
> java.lang.IllegalArgumentException: cachedLocs should not be null, use a > different constructor > at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) > at org.apache.hadoop.hdfs.protocol.LocatedBlock.&lt;init&gt;(LocatedBlock.java:79) > at org.apache.hadoop.hdfs.web.JsonUtil.toLocatedBlock(JsonUtil.java:414) > at org.apache.hadoop.hdfs.web.JsonUtil.toLocatedBlockList(JsonUtil.java:446) > at org.apache.hadoop.hdfs.web.JsonUtil.toLocatedBlocks(JsonUtil.java:479) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileBlockLocations(WebHdfsFileSystem.java:1067) > at org.apache.hadoop.fs.FileSystem$4.next(FileSystem.java:1812) > at org.apache.hadoop.fs.FileSystem$4.next(FileSystem.java:1797) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5781) Use an array to record the mapping between FSEditLogOpCode and the corresponding byte value
[ https://issues.apache.org/jira/browse/HDFS-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884011#comment-13884011 ] Hudson commented on HDFS-5781: -- FAILURE: Integrated in Hadoop-Yarn-trunk #464 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/464/]) HDFS-5781. Use an array to record the mapping between FSEditLogOpCode and the corresponding byte value. Contributed by Jing Zhao. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561788) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOpCodes.java > Use an array to record the mapping between FSEditLogOpCode and the > corresponding byte value > --- > > Key: HDFS-5781 > URL: https://issues.apache.org/jira/browse/HDFS-5781 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.4.0 >Reporter: Jing Zhao >Assignee: Jing Zhao >Priority: Minor > Fix For: 2.4.0 > > Attachments: HDFS-5781.000.patch, HDFS-5781.001.patch, > HDFS-5781.002.patch, HDFS-5781.002.patch > > > HDFS-5674 uses Enum.values and enum.ordinal to identify an editlog op for a > given byte value. While improving the efficiency, it may cause issue. E.g., > when several new editlog ops are added to trunk around the same time (for > several different new features), it is hard to backport the editlog ops with > larger byte values to branch-2 before those with smaller values, since there > will be gaps in the byte values of the enum. > This jira plans to still use an array to record the mapping between editlog > ops and their byte values, and allow gap between valid ops. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5825) Use FileUtils.copyFile() to implement DFSTestUtils.copyFile()
[ https://issues.apache.org/jira/browse/HDFS-5825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884009#comment-13884009 ] Hudson commented on HDFS-5825: -- FAILURE: Integrated in Hadoop-Yarn-trunk #464 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/464/]) HDFS-5825. Use FileUtils.copyFile() to implement DFSTestUtils.copyFile(). (Contributed by Haohui Mai) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561792) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java > Use FileUtils.copyFile() to implement DFSTestUtils.copyFile() > - > > Key: HDFS-5825 > URL: https://issues.apache.org/jira/browse/HDFS-5825 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Fix For: 2.3.0 > > Attachments: HDFS-5825.000.patch > > > {{DFSTestUtils.copyFile()}} is implemented by copying data through > FileInputStream / FileOutputStream. Apache Common IO provides > {{FileUtils.copyFile()}}. It uses FileChannel which is more efficient. > This jira proposes to implement {{DFSTestUtils.copyFile()}} using > {{FileUtils.copyFile()}}. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5833) SecondaryNameNode have an incorrect java doc
[ https://issues.apache.org/jira/browse/HDFS-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884012#comment-13884012 ] Hudson commented on HDFS-5833: -- FAILURE: Integrated in Hadoop-Yarn-trunk #464 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/464/]) HDFS-5833. Fix incorrect javadoc in SecondaryNameNode. (Contributed by Bangtao Zhou) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561938) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java > SecondaryNameNode have an incorrect java doc > > > Key: HDFS-5833 > URL: https://issues.apache.org/jira/browse/HDFS-5833 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.0.0 >Reporter: Bangtao Zhou >Priority: Trivial > Fix For: 3.0.0, 2.3.0 > > Attachments: HDFS-5833-1.patch > > > SecondaryNameNode have an incorrect java doc, actually the SecondaryNameNode > uses the *NamenodeProtocol* to talk to the primary NameNode, not the > *ClientProtocol* -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5297) Fix dead links in HDFS site documents
[ https://issues.apache.org/jira/browse/HDFS-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884007#comment-13884007 ] Hudson commented on HDFS-5297: -- FAILURE: Integrated in Hadoop-Yarn-trunk #464 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/464/]) HDFS-5297. Fix dead links in HDFS site documents. (Contributed by Akira Ajisaka) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561849) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/Federation.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithNFS.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithQJM.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsEditsViewer.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsImageViewer.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsPermissionsGuide.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsQuotaAdminGuide.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsUserGuide.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/Hftp.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/ShortCircuitLocalReads.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/WebHDFS.apt.vm > Fix dead links in HDFS site documents > - > > Key: HDFS-5297 > URL: https://issues.apache.org/jira/browse/HDFS-5297 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 2.2.0 >Reporter: Akira AJISAKA >Assignee: Akira AJISAKA > Fix For: 3.0.0, 2.3.0 > > Attachments: HDFS-5297.patch > > > I found a lot of broken hyperlinks in HDFS document to be fixed. > Ex.) 
> In HdfsUserGuide.apt.vm, there is a broken hyperlink, as below: > {noformat} >For command usage, see {{{dfsadmin}}}. > {noformat} > It should be fixed to > {noformat} >For command usage, see > {{{../hadoop-common/CommandsManual.html#dfsadmin}dfsadmin}}. > {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5844) Fix broken link in WebHDFS.apt.vm
[ https://issues.apache.org/jira/browse/HDFS-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883974#comment-13883974 ] Hadoop QA commented on HDFS-5844: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12625540/HDFS-5844.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestNameNodeHttpServer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5961//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5961//console This message is automatically generated. > Fix broken link in WebHDFS.apt.vm > - > > Key: HDFS-5844 > URL: https://issues.apache.org/jira/browse/HDFS-5844 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 2.2.0 >Reporter: Akira AJISAKA >Assignee: Akira AJISAKA >Priority: Minor > Labels: newbie > Attachments: HDFS-5844.patch > > > There is one broken link in WebHDFS.apt.vm. 
> {code} > {{{RemoteException JSON Schema}}} > {code} > should be > {code} > {{RemoteException JSON Schema}} > {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5776) Support 'hedged' reads in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883944#comment-13883944 ] Liang Xie commented on HDFS-5776: - Could we create another JIRA to track those disagreements? I have said more than three times: the default pool size is 0, so by default there is no harm to any existing applications. I guess it could take one week, one month, or even one year to argue them out... Thanks > Support 'hedged' reads in DFSClient > --- > > Key: HDFS-5776 > URL: https://issues.apache.org/jira/browse/HDFS-5776 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 3.0.0 >Reporter: Liang Xie >Assignee: Liang Xie > Attachments: HDFS-5776-v10.txt, HDFS-5776-v2.txt, HDFS-5776-v3.txt, > HDFS-5776-v4.txt, HDFS-5776-v5.txt, HDFS-5776-v6.txt, HDFS-5776-v7.txt, > HDFS-5776-v8.txt, HDFS-5776-v9.txt, HDFS-5776.txt > > > This is a placeholder of hdfs related stuff backport from > https://issues.apache.org/jira/browse/HBASE-7509 > The quorum read ability should be helpful especially to optimize read outliers > we can utilize "dfs.dfsclient.quorum.read.threshold.millis" & > "dfs.dfsclient.quorum.read.threadpool.size" to enable/disable the hedged read > ability from client side(e.g. HBase), and by using DFSQuorumReadMetrics, we > could export the interested metric valus into client system(e.g. HBase's > regionserver metric). > The core logic is in pread code path, we decide to goto the original > fetchBlockByteRange or the new introduced fetchBlockByteRangeSpeculative per > the above config items. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5776) Support 'hedged' reads in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883940#comment-13883940 ] Jing Zhao commented on HDFS-5776: - Another thing about enoughNodesForHedgedRead. The current patch checks enoughNodesForHedgedRead before calling hedgedFetchBlockByteRange. Since the set of dead nodes keeps being updated while reading, we may still hit the issue where we cannot easily find a second DN for reading. I think a better way is to add this check in chooseDataNode: if chooseDataNode finds that it is seeking the second DN (i.e., ignored is not null) and cannot immediately/easily find one, it should skip retrying, and we may want to fall back to the normal read. > Support 'hedged' reads in DFSClient > --- > > Key: HDFS-5776 > URL: https://issues.apache.org/jira/browse/HDFS-5776 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 3.0.0 >Reporter: Liang Xie >Assignee: Liang Xie > Attachments: HDFS-5776-v10.txt, HDFS-5776-v2.txt, HDFS-5776-v3.txt, > HDFS-5776-v4.txt, HDFS-5776-v5.txt, HDFS-5776-v6.txt, HDFS-5776-v7.txt, > HDFS-5776-v8.txt, HDFS-5776-v9.txt, HDFS-5776.txt > > > This is a placeholder of hdfs related stuff backport from > https://issues.apache.org/jira/browse/HBASE-7509 > The quorum read ability should be helpful especially to optimize read outliers > we can utilize "dfs.dfsclient.quorum.read.threshold.millis" & > "dfs.dfsclient.quorum.read.threadpool.size" to enable/disable the hedged read > ability from client side(e.g. HBase), and by using DFSQuorumReadMetrics, we > could export the interested metric valus into client system(e.g. HBase's > regionserver metric). > The core logic is in pread code path, we decide to goto the original > fetchBlockByteRange or the new introduced fetchBlockByteRangeSpeculative per > the above config items. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
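The fallback Jing Zhao suggests can be sketched roughly as follows. All names here are illustrative simplifications, not the real DFSInputStream API: when the caller is picking the second (hedge) datanode, signaled by a non-null ignored list, and no candidate is immediately available, the method gives up instead of entering the retry loop, so the caller can fall back to a normal read:

```java
import java.util.List;

public class HedgedChooseSketch {
    /** Returns a node, or null to signal "fall back to the normal read". */
    static String chooseDataNode(List<String> live, List<String> ignored) {
        for (String dn : live) {
            if (ignored == null || !ignored.contains(dn)) {
                return dn;                 // a usable replica was found right away
            }
        }
        // Seeking the hedge target (ignored != null) with nothing available:
        // skip retrying so the hedged path is not blocked.
        return (ignored != null) ? null : retryUntilFound(live);
    }

    // Placeholder for the normal retry/refetch-block-locations loop that a
    // first (non-hedged) read attempt would go through.
    private static String retryUntilFound(List<String> live) {
        return live.isEmpty() ? null : live.get(0);
    }
}
```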
[jira] [Commented] (HDFS-5776) Support 'hedged' reads in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883928#comment-13883928 ] Jing Zhao commented on HDFS-5776: - bq. it's more flexible if we provide instance-level disable/enable APIs, so we can use the HBase shell script to control the switch per DFS client instance; that'll be cooler I still have some concerns about the current implementation: 1) We do not check the thread pool in enableHedgedReads. This makes it possible for isHedgedReadsEnabled() to return true while hedged reads are actually not enabled. 2) DFSClient#setThreadsNumForHedgedReads allows users to keep changing the size of the thread pool. To provide instance-level disable/enable APIs, I think maybe we can do the following: 1) Read the thread pool size from configuration only when initializing the thread pool; the size should be > 0 and cannot be changed afterwards. 2) Add an "Allow-Hedged-Reads" configuration. Each DFSClient instance reads this configuration and, if it is true, checks and initializes the thread pool if necessary. Users can turn the switch on/off using the enable/disable methods; in the enable method, we check and initialize the thread pool if necessary. What do you think [~xieliang007]? 
[jira] [Commented] (HDFS-5843) DFSClient.getFileChecksum() throws IOException if checksum is disabled
[ https://issues.apache.org/jira/browse/HDFS-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883894#comment-13883894 ]

Hadoop QA commented on HDFS-5843:
---------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12625529/hdfs-5843.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.
    {color:green}+1 tests included{color}.  The patch appears to include 1 new or modified test files.
    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.
    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.
    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.
    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.
    {color:red}-1 core tests{color}.  The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
                  org.apache.hadoop.hdfs.TestPersistBlocks
    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5960//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5960//console

This message is automatically generated.
> DFSClient.getFileChecksum() throws IOException if checksum is disabled
> ----------------------------------------------------------------------
>
>                 Key: HDFS-5843
>                 URL: https://issues.apache.org/jira/browse/HDFS-5843
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Laurent Goujon
>         Attachments: hdfs-5843.patch
>
>
> If a file is created with checksum disabled (using {{ChecksumOpt.disabled()}} for example), calling {{FileSystem.getFileChecksum()}} throws the following IOException:
> {noformat}
> java.io.IOException: Fail to get block MD5 for BP-341493254-192.168.1.10-1390888724459:blk_1073741825_1001
> 	at org.apache.hadoop.hdfs.DFSClient.getFileChecksum(DFSClient.java:1965)
> 	at org.apache.hadoop.hdfs.DFSClient.getFileChecksum(DFSClient.java:1771)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1186)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1)
> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1194)
> [...]
> {noformat}
> From the logs, the datanode performs incorrect arithmetic because of crcPerBlock:
> {noformat}
> 2014-01-27 21:58:46,329 ERROR datanode.DataNode (DataXceiver.java:run(225)) - 127.0.0.1:52398:DataXceiver error processing BLOCK_CHECKSUM operation  src: /127.0.0.1:52407 dest: /127.0.0.1:52398
> java.lang.ArithmeticException: / by zero
> 	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.blockChecksum(DataXceiver.java:658)
> 	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opBlockChecksum(Receiver.java:169)
> 	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:77)
> 	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
> 	at java.lang.Thread.run(Thread.java:695)
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
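The failure mode in the stack trace above can be illustrated with a small guard. This is a hypothetical demonstration, not the real DataXceiver#blockChecksum code: with checksums disabled the datanode ends up with a per-block CRC count of zero, and any division by it raises the ArithmeticException shown in the log. The class and method names here are made up for illustration.

```java
// Hypothetical illustration of the crcPerBlock division-by-zero; the
// real DataXceiver arithmetic differs, this only shows the guard idea.
public class CrcPerBlockGuard {

    /**
     * Number of full blocks covered by crcTotal checksum entries,
     * guarding the checksum-disabled case where crcPerBlock == 0.
     */
    public static long blocksCovered(long crcTotal, long crcPerBlock) {
        if (crcPerBlock == 0) {
            // Checksums disabled: there is no meaningful per-block CRC
            // count, so return 0 instead of dividing by zero.
            return 0;
        }
        return crcTotal / crcPerBlock;
    }
}
```

An alternative fix, of course, is for the datanode to detect the disabled-checksum case earlier and return a well-defined "no checksum" response rather than computing block checksums at all.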
[jira] [Commented] (HDFS-5297) Fix dead links in HDFS site documents
[ https://issues.apache.org/jira/browse/HDFS-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883878#comment-13883878 ]

Akira AJISAKA commented on HDFS-5297:
-------------------------------------

Thank you for reviewing and committing, [~arpitagarwal]!

bq. There is one broken link in WebHDFS.apt.vm.

Filed HDFS-5844 and attached a patch. Would you review it?

> Fix dead links in HDFS site documents
> -------------------------------------
>
>                 Key: HDFS-5297
>                 URL: https://issues.apache.org/jira/browse/HDFS-5297
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 2.2.0
>            Reporter: Akira AJISAKA
>            Assignee: Akira AJISAKA
>             Fix For: 3.0.0, 2.3.0
>
>         Attachments: HDFS-5297.patch
>
>
> I found a lot of broken hyperlinks in the HDFS documents to be fixed.
> Ex.) In HdfsUserGuide.apt.vm, there is a broken hyperlink, as below:
> {noformat}
>    For command usage, see {{{dfsadmin}}}.
> {noformat}
> It should be fixed to
> {noformat}
>    For command usage, see {{{../hadoop-common/CommandsManual.html#dfsadmin}dfsadmin}}.
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Updated] (HDFS-5844) Fix broken link in WebHDFS.apt.vm
[ https://issues.apache.org/jira/browse/HDFS-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira AJISAKA updated HDFS-5844:
--------------------------------
    Status: Patch Available  (was: Open)

> Fix broken link in WebHDFS.apt.vm
> ---------------------------------
>
>                 Key: HDFS-5844
>                 URL: https://issues.apache.org/jira/browse/HDFS-5844
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 2.2.0
>            Reporter: Akira AJISAKA
>            Assignee: Akira AJISAKA
>            Priority: Minor
>              Labels: newbie
>         Attachments: HDFS-5844.patch
>
>
> There is one broken link in WebHDFS.apt.vm.
> {code}
> {{{RemoteException JSON Schema}}}
> {code}
> should be
> {code}
> {{RemoteException JSON Schema}}
> {code}

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Updated] (HDFS-5844) Fix broken link in WebHDFS.apt.vm
[ https://issues.apache.org/jira/browse/HDFS-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira AJISAKA updated HDFS-5844:
--------------------------------
    Attachment: HDFS-5844.patch

Attaching a patch.