[jira] [Updated] (HDFS-4360) multiple BlockFixer should be supported in order to improve scalability and reduce too much work on single BlockFixer

2013-01-04 Thread Jun Jin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Jin updated HDFS-4360:
--

Status: Open  (was: Patch Available)

> multiple BlockFixer should be supported in order to improve scalability and 
> reduce too much work on single BlockFixer
> -
>
> Key: HDFS-4360
> URL: https://issues.apache.org/jira/browse/HDFS-4360
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Jun Jin
>  Labels: patch
> Attachments: HDFS-4360.patch
>
>
> The current implementation can only run a single BlockFixer, since the fsck (in 
> RaidDFSUtil.getCorruptFiles) only checks the whole DFS file system. Multiple 
> BlockFixers would all do the same work and try to fix the same files if more 
> than one BlockFixer were launched. 
> The change/fix will be mainly in BlockFixer.java and 
> RaidDFSUtil.getCorruptFiles(), to enable fsck to check only the paths 
> defined in a separate Raid.xml for each RaidNode/BlockFixer.
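
A minimal sketch of the idea, not the attached patch: each BlockFixer instance only processes corrupt files that fall under the paths configured in its own Raid.xml, so two instances never pick up the same file. The class and field names below (ScopedCorruptFileFilter, monitoredPaths) are hypothetical.

{code}
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: restricts corrupt-file handling to the paths one BlockFixer owns. */
public class ScopedCorruptFileFilter {
  private final List<String> monitoredPaths; // path prefixes read from this instance's Raid.xml

  public ScopedCorruptFileFilter(List<String> monitoredPaths) {
    this.monitoredPaths = monitoredPaths;
  }

  /** Keep only the corrupt files that live under one of this instance's configured prefixes. */
  public List<String> filter(List<String> corruptFiles) {
    List<String> mine = new ArrayList<>();
    for (String file : corruptFiles) {
      for (String prefix : monitoredPaths) {
        if (file.startsWith(prefix)) {
          mine.add(file);
          break;
        }
      }
    }
    return mine;
  }
}
{code}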

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4360) multiple BlockFixer should be supported in order to improve scalability and reduce too much work on single BlockFixer

2013-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544615#comment-13544615
 ] 

Hadoop QA commented on HDFS-4360:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563403/HDFS-4360.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3744//console

This message is automatically generated.

> multiple BlockFixer should be supported in order to improve scalability and 
> reduce too much work on single BlockFixer
> -
>
> Key: HDFS-4360
> URL: https://issues.apache.org/jira/browse/HDFS-4360
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Jun Jin
>  Labels: patch
> Attachments: HDFS-4360.patch
>
>
> The current implementation can only run a single BlockFixer, since the fsck (in 
> RaidDFSUtil.getCorruptFiles) only checks the whole DFS file system. Multiple 
> BlockFixers would all do the same work and try to fix the same files if more 
> than one BlockFixer were launched. 
> The change/fix will be mainly in BlockFixer.java and 
> RaidDFSUtil.getCorruptFiles(), to enable fsck to check only the paths 
> defined in a separate Raid.xml for each RaidNode/BlockFixer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4360) multiple BlockFixer should be supported in order to improve scalability and reduce too much work on single BlockFixer

2013-01-04 Thread Jun Jin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Jin updated HDFS-4360:
--

Description: 
The current implementation can only run a single BlockFixer, since the fsck (in 
RaidDFSUtil.getCorruptFiles) only checks the whole DFS file system. Multiple 
BlockFixers would all do the same work and try to fix the same files if more 
than one BlockFixer were launched. 

The change/fix will be mainly in BlockFixer.java and 
RaidDFSUtil.getCorruptFiles(), to enable fsck to check only the paths 
defined in a separate Raid.xml for each RaidNode/BlockFixer.

  was:
The current implementation can only run a single BlockFixer, since the fsck (in 
RaidDFSUtil.getCorruptFiles) only checks the whole DFS file system. Multiple 
BlockFixers would all do the same work and try to fix the same files if more 
than one BlockFixer were launched. 

The change/fix will be mainly in RaidNode.java and 
RaidDFSUtil.getCorruptFiles(), to enable fsck to check only the paths 
defined in a separate Raid.xml for each RaidNode/BlockFixer.


> multiple BlockFixer should be supported in order to improve scalability and 
> reduce too much work on single BlockFixer
> -
>
> Key: HDFS-4360
> URL: https://issues.apache.org/jira/browse/HDFS-4360
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Jun Jin
>  Labels: patch
> Attachments: HDFS-4360.patch
>
>
> The current implementation can only run a single BlockFixer, since the fsck (in 
> RaidDFSUtil.getCorruptFiles) only checks the whole DFS file system. Multiple 
> BlockFixers would all do the same work and try to fix the same files if more 
> than one BlockFixer were launched. 
> The change/fix will be mainly in BlockFixer.java and 
> RaidDFSUtil.getCorruptFiles(), to enable fsck to check only the paths 
> defined in a separate Raid.xml for each RaidNode/BlockFixer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4360) multiple BlockFixer should be supported in order to improve scalability and reduce too much work on single BlockFixer

2013-01-04 Thread Jun Jin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Jin updated HDFS-4360:
--

Status: Patch Available  (was: Open)

> multiple BlockFixer should be supported in order to improve scalability and 
> reduce too much work on single BlockFixer
> -
>
> Key: HDFS-4360
> URL: https://issues.apache.org/jira/browse/HDFS-4360
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Jun Jin
>  Labels: patch
> Attachments: HDFS-4360.patch
>
>
> The current implementation can only run a single BlockFixer, since the fsck (in 
> RaidDFSUtil.getCorruptFiles) only checks the whole DFS file system. Multiple 
> BlockFixers would all do the same work and try to fix the same files if more 
> than one BlockFixer were launched. 
> The change/fix will be mainly in RaidNode.java and 
> RaidDFSUtil.getCorruptFiles(), to enable fsck to check only the paths 
> defined in a separate Raid.xml for each RaidNode/BlockFixer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4360) multiple BlockFixer should be supported in order to improve scalability and reduce too much work on single BlockFixer

2013-01-04 Thread Jun Jin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Jin updated HDFS-4360:
--

Attachment: HDFS-4360.patch

First version of the patch.

> multiple BlockFixer should be supported in order to improve scalability and 
> reduce too much work on single BlockFixer
> -
>
> Key: HDFS-4360
> URL: https://issues.apache.org/jira/browse/HDFS-4360
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Jun Jin
>  Labels: patch
> Attachments: HDFS-4360.patch
>
>
> The current implementation can only run a single BlockFixer, since the fsck (in 
> RaidDFSUtil.getCorruptFiles) only checks the whole DFS file system. Multiple 
> BlockFixers would all do the same work and try to fix the same files if more 
> than one BlockFixer were launched. 
> The change/fix will be mainly in RaidNode.java and 
> RaidDFSUtil.getCorruptFiles(), to enable fsck to check only the paths 
> defined in a separate Raid.xml for each RaidNode/BlockFixer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4360) multiple BlockFixer should be supported in order to improve scalability and reduce too much work on single BlockFixer

2013-01-04 Thread Jun Jin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Jin updated HDFS-4360:
--

Status: Open  (was: Patch Available)

> multiple BlockFixer should be supported in order to improve scalability and 
> reduce too much work on single BlockFixer
> -
>
> Key: HDFS-4360
> URL: https://issues.apache.org/jira/browse/HDFS-4360
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Jun Jin
>  Labels: patch
> Attachments: HDFS-4360.patch
>
>
> The current implementation can only run a single BlockFixer, since the fsck (in 
> RaidDFSUtil.getCorruptFiles) only checks the whole DFS file system. Multiple 
> BlockFixers would all do the same work and try to fix the same files if more 
> than one BlockFixer were launched. 
> The change/fix will be mainly in RaidNode.java and 
> RaidDFSUtil.getCorruptFiles(), to enable fsck to check only the paths 
> defined in a separate Raid.xml for each RaidNode/BlockFixer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4360) multiple BlockFixer should be supported in order to improve scalability and reduce too much work on single BlockFixer

2013-01-04 Thread Jun Jin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Jin updated HDFS-4360:
--

Status: Patch Available  (was: Open)

> multiple BlockFixer should be supported in order to improve scalability and 
> reduce too much work on single BlockFixer
> -
>
> Key: HDFS-4360
> URL: https://issues.apache.org/jira/browse/HDFS-4360
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Jun Jin
>  Labels: patch
>
> The current implementation can only run a single BlockFixer, since the fsck (in 
> RaidDFSUtil.getCorruptFiles) only checks the whole DFS file system. Multiple 
> BlockFixers would all do the same work and try to fix the same files if more 
> than one BlockFixer were launched. 
> The change/fix will be mainly in RaidNode.java and 
> RaidDFSUtil.getCorruptFiles(), to enable fsck to check only the paths 
> defined in a separate Raid.xml for each RaidNode/BlockFixer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4306) PBHelper.convertLocatedBlock miss convert BlockToken

2013-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544577#comment-13544577
 ] 

Hadoop QA commented on HDFS-4306:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563391/HDFS-4306.v4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3741//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3741//console

This message is automatically generated.

> PBHelper.convertLocatedBlock miss convert BlockToken
> 
>
> Key: HDFS-4306
> URL: https://issues.apache.org/jira/browse/HDFS-4306
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.2-alpha
>Reporter: Binglin Chang
>Assignee: Binglin Chang
> Attachments: HDFS-4306.patch, HDFS-4306.v2.patch, HDFS-4306.v3.patch, 
> HDFS-4306.v4.patch
>
>
> PBHelper.convertLocatedBlock (from protobuf array to primitive array) misses 
> converting the BlockToken.
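
For illustration only, a hedged sketch of the element-wise conversion the jira describes, using stand-in types rather than the real PBHelper/protobuf classes: the point is simply that the token field must be copied along with the rest of the block.

{code}
import java.util.ArrayList;
import java.util.List;

class ProtoBlock { String blockId; String token; }   // stand-in for the protobuf type
class NativeBlock { String blockId; String token; }  // stand-in for the native LocatedBlock

public class LocatedBlockConverter {
  /** Convert a protobuf-style list element by element, carrying over every field. */
  public static List<NativeBlock> convert(List<ProtoBlock> protos) {
    List<NativeBlock> out = new ArrayList<>(protos.size());
    for (ProtoBlock p : protos) {
      NativeBlock b = new NativeBlock();
      b.blockId = p.blockId;
      b.token = p.token;  // without this line the BlockToken is silently dropped
      out.add(b);
    }
    return out;
  }
}
{code}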

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4353) Encapsulate connections to peers in Peer and PeerServer classes

2013-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544568#comment-13544568
 ] 

Hadoop QA commented on HDFS-4353:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563386/02e.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3740//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3740//console

This message is automatically generated.

> Encapsulate connections to peers in Peer and PeerServer classes
> ---
>
> Key: HDFS-4353
> URL: https://issues.apache.org/jira/browse/HDFS-4353
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, hdfs-client
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: 02b-cumulative.patch, 02c.patch, 02c.patch, 
> 02-cumulative.patch, 02d.patch, 02e.patch
>
>
> Encapsulate connections to peers into the {{Peer}} and {{PeerServer}} 
> classes.  Since many Java classes may be involved with these connections, it 
> makes sense to create a container for them.  For example, a connection to a 
> peer may have an input stream, output stream, ReadableByteChannel, encrypted 
> output stream, and encrypted input stream associated with it.
> This makes us less dependent on the {{NetUtils}} methods which use 
> {{instanceof}} to manipulate socket and stream states based on the runtime 
> type.  It also paves the way to introduce UNIX domain sockets which don't 
> inherit from {{java.net.Socket}}.
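
A hedged sketch of the shape of such an encapsulation (the interface and method names here are illustrative, not necessarily those in the patch): one object owns every stream tied to a connection, and a server hands out those objects.

{code}
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** One object per connection, owning all of its streams/channels. */
interface Peer extends Closeable {
  InputStream getInputStream() throws IOException;
  OutputStream getOutputStream() throws IOException;
  boolean hasSecureChannel();          // e.g. wrapped in encrypted streams
  String getRemoteAddressString();
}

/** Accepts new connections and returns each one as a Peer. */
interface PeerServer extends Closeable {
  Peer accept() throws IOException;
  String getListeningString();
}
{code}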

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4362) GetDelegationTokenResponseProto does not handle null token

2013-01-04 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4362:
--

Status: Patch Available  (was: Open)

> GetDelegationTokenResponseProto does not handle null token
> --
>
> Key: HDFS-4362
> URL: https://issues.apache.org/jira/browse/HDFS-4362
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.2-alpha
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>Priority: Critical
> Attachments: HDFS-4362.patch
>
>
> While working on HADOOP-9173, I noticed that 
> GetDelegationTokenResponseProto declares the token field as required. However, 
> a null token is to be expected, both as defined in 
> FileSystem#getDelegationToken() and based on the HDFS implementation. This 
> jira intends to make the field optional.
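
A small sketch of what making the field optional buys, using an illustrative holder class rather than the generated GetDelegationTokenResponseProto: the server only sets the token when one exists, and the client checks for presence before decoding, so a null token round-trips cleanly.

{code}
/** Illustrative stand-in for the generated response message with an optional token field. */
class TokenResponse {
  private String encodedToken;                        // absent when no token was issued
  boolean hasToken() { return encodedToken != null; }
  String getToken()  { return encodedToken; }
  void setToken(String t) { encodedToken = t; }
}

public class DelegationTokenExample {
  /** Server side: populate the optional field only when a token exists. */
  static TokenResponse buildResponse(String tokenOrNull) {
    TokenResponse resp = new TokenResponse();
    if (tokenOrNull != null) {
      resp.setToken(tokenOrNull);
    }
    return resp;
  }

  /** Client side: map an absent field back to the null that FileSystem#getDelegationToken allows. */
  static String decode(TokenResponse resp) {
    return resp.hasToken() ? resp.getToken() : null;
  }
}
{code}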

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4362) GetDelegationTokenResponseProto does not handle null token

2013-01-04 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4362:
--

Attachment: HDFS-4362.patch

> GetDelegationTokenResponseProto does not handle null token
> --
>
> Key: HDFS-4362
> URL: https://issues.apache.org/jira/browse/HDFS-4362
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.2-alpha
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>Priority: Critical
> Attachments: HDFS-4362.patch
>
>
> While working on HADOOP-9173, I noticed that 
> GetDelegationTokenResponseProto declares the token field as required. However, 
> a null token is to be expected, both as defined in 
> FileSystem#getDelegationToken() and based on the HDFS implementation. This 
> jira intends to make the field optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HDFS-4362) GetDelegationTokenResponseProto does not handle null token

2013-01-04 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas reassigned HDFS-4362:
-

Assignee: Suresh Srinivas

> GetDelegationTokenResponseProto does not handle null token
> --
>
> Key: HDFS-4362
> URL: https://issues.apache.org/jira/browse/HDFS-4362
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.2-alpha
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>Priority: Critical
> Attachments: HDFS-4362.patch
>
>
> While working on HADOOP-9173, I noticed that 
> GetDelegationTokenResponseProto declares the token field as required. However, 
> a null token is to be expected, both as defined in 
> FileSystem#getDelegationToken() and based on the HDFS implementation. This 
> jira intends to make the field optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4295) Using port 1023 should be valid when starting Secure DataNode

2013-01-04 Thread liuyang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544564#comment-13544564
 ] 

liuyang commented on HDFS-4295:
---

Thanks, Aaron, for this workaround.

1. Call the SecureDataNodeStarter.init() method while running as root;
2. then call the SecureDataNodeStarter.start() method while running as hdfs.

How should the script for starting the datanode be executed?

> Using port 1023 should be valid when starting Secure DataNode
> -
>
> Key: HDFS-4295
> URL: https://issues.apache.org/jira/browse/HDFS-4295
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.0.0-alpha
>Reporter: Stephen Chu
>Assignee: Stephen Chu
>  Labels: trivial
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: HDFS-4295.patch
>
>
> In SecureDataNodeStarter:
> {code}
> if ((ss.getLocalPort() >= 1023 || listener.getPort() >= 1023) &&
> UserGroupInformation.isSecurityEnabled()) {
>   throw new RuntimeException("Cannot start secure datanode with 
> unprivileged ports");
> }
> {code}
> This prohibits using port 1023, but this should be okay because only root can 
> listen to ports below 1024.
> We can change the >= to >.
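
For reference, the corrected check would look roughly like the following sketch (not the committed patch): strictly greater than 1023, so that port 1023, which is still a privileged port, is accepted.

{code}
public class SecurePortCheck {
  /** Sketch of the corrected condition: ports up to and including 1023 count as privileged. */
  static void checkPrivilegedPorts(int streamingPort, int infoPort, boolean securityEnabled) {
    if ((streamingPort > 1023 || infoPort > 1023) && securityEnabled) {
      throw new RuntimeException("Cannot start secure datanode with unprivileged ports");
    }
  }
}
{code}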

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4362) GetDelegationTokenResponseProto does not handle null token

2013-01-04 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4362:
--

Affects Version/s: 2.0.2-alpha

> GetDelegationTokenResponseProto does not handle null token
> --
>
> Key: HDFS-4362
> URL: https://issues.apache.org/jira/browse/HDFS-4362
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.2-alpha
>Reporter: Suresh Srinivas
>Priority: Critical
>
> While working on HADOOP-9173, I noticed that 
> GetDelegationTokenResponseProto declares the token field as required. However, 
> a null token is to be expected, both as defined in 
> FileSystem#getDelegationToken() and based on the HDFS implementation. This 
> jira intends to make the field optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4362) GetDelegationTokenResponseProto does not handle null token

2013-01-04 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4362:
--

Hadoop Flags: Incompatible change

> GetDelegationTokenResponseProto does not handle null token
> --
>
> Key: HDFS-4362
> URL: https://issues.apache.org/jira/browse/HDFS-4362
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Suresh Srinivas
>Priority: Critical
>
> While working on HADOOP-9173, I noticed that 
> GetDelegationTokenResponseProto declares the token field as required. However, 
> a null token is to be expected, both as defined in 
> FileSystem#getDelegationToken() and based on the HDFS implementation. This 
> jira intends to make the field optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Moved] (HDFS-4362) GetDelegationTokenResponseProto does not handle null token

2013-01-04 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas moved YARN-317 to HDFS-4362:


Issue Type: Bug  (was: Improvement)
   Key: HDFS-4362  (was: YARN-317)
   Project: Hadoop HDFS  (was: Hadoop YARN)

> GetDelegationTokenResponseProto does not handle null token
> --
>
> Key: HDFS-4362
> URL: https://issues.apache.org/jira/browse/HDFS-4362
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Suresh Srinivas
>Priority: Critical
>
> While working on HADOOP-9173, I noticed that 
> GetDelegationTokenResponseProto declares the token field as required. However, 
> a null token is to be expected, both as defined in 
> FileSystem#getDelegationToken() and based on the HDFS implementation. This 
> jira intends to make the field optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4306) PBHelper.convertLocatedBlock miss convert BlockToken

2013-01-04 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544548#comment-13544548
 ] 

Aaron T. Myers commented on HDFS-4306:
--

The latest patch looks good to me. +1 pending Jenkins.

> PBHelper.convertLocatedBlock miss convert BlockToken
> 
>
> Key: HDFS-4306
> URL: https://issues.apache.org/jira/browse/HDFS-4306
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.2-alpha
>Reporter: Binglin Chang
>Assignee: Binglin Chang
> Attachments: HDFS-4306.patch, HDFS-4306.v2.patch, HDFS-4306.v3.patch, 
> HDFS-4306.v4.patch
>
>
> PBHelper.convertLocatedBlock (from protobuf array to primitive array) misses 
> converting the BlockToken.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4306) PBHelper.convertLocatedBlock miss convert BlockToken

2013-01-04 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HDFS-4306:


Attachment: HDFS-4306.v4.patch

Thanks for noticing this. Corrected the bug mentioned in the last comment.


> PBHelper.convertLocatedBlock miss convert BlockToken
> 
>
> Key: HDFS-4306
> URL: https://issues.apache.org/jira/browse/HDFS-4306
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.2-alpha
>Reporter: Binglin Chang
>Assignee: Binglin Chang
> Attachments: HDFS-4306.patch, HDFS-4306.v2.patch, HDFS-4306.v3.patch, 
> HDFS-4306.v4.patch
>
>
> PBHelper.convertLocatedBlock (from protobuf array to primitive array) misses 
> converting the BlockToken.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4333) Using right default value for creating files in HDFS

2013-01-04 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544540#comment-13544540
 ] 

Binglin Chang commented on HDFS-4333:
-

bq. I'm re-classifying this as an improvement, since it seems to be purely a 
style improvement.
This makes sense; I will submit a patch after HADOOP-9155 is done. 


> Using right default value for creating files in HDFS
> 
>
> Key: HDFS-4333
> URL: https://issues.apache.org/jira/browse/HDFS-4333
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.2-alpha
>Reporter: Binglin Chang
>Assignee: Binglin Chang
>Priority: Minor
>
> The default permission for creating a file should be 0666 rather than 0777. 
> HADOOP-9155 adds a default permission for files and changes 
> LocalFileSystem.create to use this default value; this jira makes the similar 
> change for HDFS.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4295) Using port 1023 should be valid when starting Secure DataNode

2013-01-04 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544539#comment-13544539
 ] 

Aaron T. Myers commented on HDFS-4295:
--

You need to be root in order to bind to low ports, jsvc doesn't have anything 
to do with that. The DN uses jsvc so that it can start as root, bind to the low 
port, and then switch users to hdfs for the rest of its run. So, you need to 
start the DN as root when enabling security and running with jsvc - no way 
around that.

> Using port 1023 should be valid when starting Secure DataNode
> -
>
> Key: HDFS-4295
> URL: https://issues.apache.org/jira/browse/HDFS-4295
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.0.0-alpha
>Reporter: Stephen Chu
>Assignee: Stephen Chu
>  Labels: trivial
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: HDFS-4295.patch
>
>
> In SecureDataNodeStarter:
> {code}
> if ((ss.getLocalPort() >= 1023 || listener.getPort() >= 1023) &&
> UserGroupInformation.isSecurityEnabled()) {
>   throw new RuntimeException("Cannot start secure datanode with 
> unprivileged ports");
> }
> {code}
> This prohibits using port 1023, but this should be okay because only root can 
> listen to ports below 1024.
> We can change the >= to >.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4295) Using port 1023 should be valid when starting Secure DataNode

2013-01-04 Thread liuyang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544538#comment-13544538
 ] 

liuyang commented on HDFS-4295:
---

The jsvc program is used to start the DataNode listening on low port numbers, 
but the DataNode cannot be started while running as a non-root user.
The exception is as follows:
  Initializing secure datanode resources
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:164)
Caused by: java.net.SocketException: Permission denied
at sun.nio.ch.Net.bind(Native Method)
at 
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
at 
org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.init(SecureDataNodeStarter.java:76)
... 5 more
Cannot load daemon

Is there anything I missed? 

> Using port 1023 should be valid when starting Secure DataNode
> -
>
> Key: HDFS-4295
> URL: https://issues.apache.org/jira/browse/HDFS-4295
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.0.0-alpha
>Reporter: Stephen Chu
>Assignee: Stephen Chu
>  Labels: trivial
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: HDFS-4295.patch
>
>
> In SecureDataNodeStarter:
> {code}
> if ((ss.getLocalPort() >= 1023 || listener.getPort() >= 1023) &&
> UserGroupInformation.isSecurityEnabled()) {
>   throw new RuntimeException("Cannot start secure datanode with 
> unprivileged ports");
> }
> {code}
> This prohibits using port 1023, but this should be okay because only root can 
> listen to ports below 1024.
> We can change the >= to >.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4353) Encapsulate connections to peers in Peer and PeerServer classes

2013-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544511#comment-13544511
 ] 

Hadoop QA commented on HDFS-4353:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563377/02d.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3738//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3738//console

This message is automatically generated.

> Encapsulate connections to peers in Peer and PeerServer classes
> ---
>
> Key: HDFS-4353
> URL: https://issues.apache.org/jira/browse/HDFS-4353
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, hdfs-client
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: 02b-cumulative.patch, 02c.patch, 02c.patch, 
> 02-cumulative.patch, 02d.patch, 02e.patch
>
>
> Encapsulate connections to peers into the {{Peer}} and {{PeerServer}} 
> classes.  Since many Java classes may be involved with these connections, it 
> makes sense to create a container for them.  For example, a connection to a 
> peer may have an input stream, output stream, ReadableByteChannel, encrypted 
> output stream, and encrypted input stream associated with it.
> This makes us less dependent on the {{NetUtils}} methods which use 
> {{instanceof}} to manipulate socket and stream states based on the runtime 
> type.  It also paves the way to introduce UNIX domain sockets which don't 
> inherit from {{java.net.Socket}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4353) Encapsulate connections to peers in Peer and PeerServer classes

2013-01-04 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-4353:
---

Attachment: 02e.patch

Removed some parameters that are no longer necessary now that we aren't tracking Peers 
in PeerServer.

> Encapsulate connections to peers in Peer and PeerServer classes
> ---
>
> Key: HDFS-4353
> URL: https://issues.apache.org/jira/browse/HDFS-4353
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, hdfs-client
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: 02b-cumulative.patch, 02c.patch, 02c.patch, 
> 02-cumulative.patch, 02d.patch, 02e.patch
>
>
> Encapsulate connections to peers into the {{Peer}} and {{PeerServer}} 
> classes.  Since many Java classes may be involved with these connections, it 
> makes sense to create a container for them.  For example, a connection to a 
> peer may have an input stream, output stream, ReadableByteChannel, encrypted 
> output stream, and encrypted input stream associated with it.
> This makes us less dependent on the {{NetUtils}} methods which use 
> {{instanceof}} to manipulate socket and stream states based on the runtime 
> type.  It also paves the way to introduce UNIX domain sockets which don't 
> inherit from {{java.net.Socket}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4304) Make FSEditLogOp.MAX_OP_SIZE configurable

2013-01-04 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544489#comment-13544489
 ] 

Colin Patrick McCabe commented on HDFS-4304:


Last comment should read "the JN that is actually going to be writing the bytes 
to disk."

> Make FSEditLogOp.MAX_OP_SIZE configurable
> -
>
> Key: HDFS-4304
> URL: https://issues.apache.org/jira/browse/HDFS-4304
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: Todd Lipcon
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-4304.001.patch, HDFS-4304.002.patch, 
> HDFS-4304.003.patch, HDFS-4304.004.patch, HDFS-4304.005.patch
>
>
> Today we ran into an issue where a NN had logged a very large op, greater 
> than the 1.5MB MAX_OP_SIZE constant. In order to successfully load the edits, 
> we had to patch with a larger constant. This constant should be configurable 
> so that we wouldn't have to recompile in these odd cases. Additionally, I 
> think the default should be bumped a bit higher, since it's only a safeguard 
> against OOME, and people tend to run NNs with multi-GB heaps.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-3371) EditLogFileInputStream: be more careful about closing streams when we're done with them.

2013-01-04 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe resolved HDFS-3371.


Resolution: Won't Fix

I guess there's no point in doing this one as the outer stream will close all 
the inner ones.

> EditLogFileInputStream: be more careful about closing streams when we're done 
> with them.
> 
>
> Key: HDFS-3371
> URL: https://issues.apache.org/jira/browse/HDFS-3371
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-3371.001.patch, HDFS-3371.002.patch
>
>
> EditLogFileInputStream#EditLogFileInputStream should be more careful about 
> closing streams when there is an exception thrown.  Also, 
> EditLogFileInputStream#close should close all of the streams we opened in the 
> constructor, not just one of them (although the file-backed one is probably 
> the most important).
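
A generic illustration of the pattern described above, with made-up stream fields rather than the actual EditLogFileInputStream members: if the constructor opens several streams, a failure part-way through should release whatever was already opened, and close() should close all of them.

{code}
import java.io.Closeable;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

/** Illustrative only: open layered streams, release all of them on failure and on close(). */
public class LayeredStreams implements Closeable {
  private FileInputStream fileIn;
  private DataInputStream dataIn;

  public LayeredStreams(String path) throws IOException {
    boolean ok = false;
    try {
      fileIn = new FileInputStream(path);
      dataIn = new DataInputStream(fileIn);
      ok = true;
    } finally {
      if (!ok) {
        close();          // constructor failed part-way: close whatever was opened
      }
    }
  }

  @Override
  public void close() throws IOException {
    // Close every stream opened in the constructor, not just one of them.
    if (dataIn != null) { dataIn.close(); }
    if (fileIn != null) { fileIn.close(); }
  }
}
{code}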

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4304) Make FSEditLogOp.MAX_OP_SIZE configurable

2013-01-04 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-4304:
---

Attachment: HDFS-4304.005.patch

This version of the patch makes MAX_OP_SIZE configurable in production and not 
just in recovery mode.

I didn't implement the warning when writing an over-long opcode.  It would be 
really tricky to do this "right"-- for example, if you're using QJM, the 
maximum op size on your local NameNode may not be the same as on the NN that is 
actually going to be writing the bytes to disk.  I think that would get messy.  
This is just a minimal change to make something which wasn't configurable 
before, configurable.
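
A rough sketch of the shape of such a change; the configuration key and default below are assumed for illustration, not necessarily what the patch uses: the limit is read from the Configuration with the old constant as the fallback instead of being hard-coded.

{code}
import org.apache.hadoop.conf.Configuration;

public class MaxOpSizeExample {
  // Hypothetical key name; the old 1.5 MB constant is kept as the fallback here,
  // though the jira suggests a larger default may be appropriate.
  static final String MAX_OP_SIZE_KEY = "dfs.namenode.max.op.size";
  static final int MAX_OP_SIZE_DEFAULT = (int) (1.5 * 1024 * 1024);

  /** Read the op-size limit from the configuration instead of a compile-time constant. */
  static int getMaxOpSize(Configuration conf) {
    return conf.getInt(MAX_OP_SIZE_KEY, MAX_OP_SIZE_DEFAULT);
  }
}
{code}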

> Make FSEditLogOp.MAX_OP_SIZE configurable
> -
>
> Key: HDFS-4304
> URL: https://issues.apache.org/jira/browse/HDFS-4304
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: Todd Lipcon
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-4304.001.patch, HDFS-4304.002.patch, 
> HDFS-4304.003.patch, HDFS-4304.004.patch, HDFS-4304.005.patch
>
>
> Today we ran into an issue where a NN had logged a very large op, greater 
> than the 1.5MB MAX_OP_SIZE constant. In order to successfully load the edits, 
> we had to patch with a larger constant. This constant should be configurable 
> so that we wouldn't have to recompile in these odd cases. Additionally, I 
> think the default should be bumped a bit higher, since it's only a safeguard 
> against OOME, and people tend to run NNs with multi-GB heaps.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4352) Encapsulate arguments to BlockReaderFactory in a class

2013-01-04 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544347#comment-13544347
 ] 

Colin Patrick McCabe commented on HDFS-4352:


Oops... hit enter too early.  To continue, I added these improvements in the 
patch for HDFS-4353.  Let me know if this addresses your post-commit comment.

> Encapsulate arguments to BlockReaderFactory in a class
> --
>
> Key: HDFS-4352
> URL: https://issues.apache.org/jira/browse/HDFS-4352
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0
>
> Attachments: 01b.patch, 01.patch
>
>
> Encapsulate the arguments to BlockReaderFactory in a class to avoid having to 
> pass around 10+ arguments to a few different functions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4352) Encapsulate arguments to BlockReaderFactory in a class

2013-01-04 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544346#comment-13544346
 ] 

Colin Patrick McCabe commented on HDFS-4352:


Hi Nicholas,

I added asserts to let you know if you did not set a mandatory parameter, and 
also JavaDoc documentation.
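
A hedged sketch of that pattern (the field and class names are illustrative, not the actual BlockReaderFactory code): a parameter holder with builder-style setters, where asserts catch a caller that forgot a mandatory value and optional values simply stay unset.

{code}
/** Illustrative parameter holder with builder-style setters. */
public class BlockReaderParams {
  private String fileName;           // mandatory
  private long startOffset = -1;     // mandatory
  private Object ioStreamPair;       // optional: used only if present

  public BlockReaderParams setFileName(String fileName) {
    this.fileName = fileName;
    return this;
  }

  public BlockReaderParams setStartOffset(long startOffset) {
    this.startOffset = startOffset;
    return this;
  }

  public BlockReaderParams setIoStreamPair(Object ioStreamPair) {
    this.ioStreamPair = ioStreamPair;
    return this;
  }

  /** Fail fast (when assertions are enabled) if a mandatory parameter was not set. */
  void checkMandatory() {
    assert fileName != null : "fileName is mandatory";
    assert startOffset >= 0 : "startOffset is mandatory";
  }
}
{code}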

> Encapsulate arguments to BlockReaderFactory in a class
> --
>
> Key: HDFS-4352
> URL: https://issues.apache.org/jira/browse/HDFS-4352
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0
>
> Attachments: 01b.patch, 01.patch
>
>
> Encapsulate the arguments to BlockReaderFactory in a class to avoid having to 
> pass around 10+ arguments to a few different functions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4353) Encapsulate connections to peers in Peer and PeerServer classes

2013-01-04 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-4353:
---

Attachment: 02d.patch

This version addresses Todd's comment about removing tracking of {{Peer}} 
objects from {{PeerServer}} (since we might as well continue doing it in 
{{DataXceiver}}).

It also addresses some comments Nicholas made in HDFS-4352 about making it 
clear which parameters to {{BlockReaderFactory#newBlockReader}} are mandatory 
or not.  The function now {{asserts}} if a mandatory parameter is not set.  
Also added a bunch of documentation about what the parameters do.

> Encapsulate connections to peers in Peer and PeerServer classes
> ---
>
> Key: HDFS-4353
> URL: https://issues.apache.org/jira/browse/HDFS-4353
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, hdfs-client
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: 02b-cumulative.patch, 02c.patch, 02c.patch, 
> 02-cumulative.patch, 02d.patch
>
>
> Encapsulate connections to peers into the {{Peer}} and {{PeerServer}} 
> classes.  Since many Java classes may be involved with these connections, it 
> makes sense to create a container for them.  For example, a connection to a 
> peer may have an input stream, output stream, ReadableByteChannel, encrypted 
> output stream, and encrypted input stream associated with it.
> This makes us less dependent on the {{NetUtils}} methods which use 
> {{instanceof}} to manipulate socket and stream states based on the runtime 
> type.  It also paves the way to introduce UNIX domain sockets which don't 
> inherit from {{java.net.Socket}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4353) Encapsulate connections to peers in Peer and PeerServer classes

2013-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544292#comment-13544292
 ] 

Hadoop QA commented on HDFS-4353:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563368/02c.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified test files.

{color:red}-1 javac{color:red}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3737//console

This message is automatically generated.

> Encapsulate connections to peers in Peer and PeerServer classes
> ---
>
> Key: HDFS-4353
> URL: https://issues.apache.org/jira/browse/HDFS-4353
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, hdfs-client
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: 02b-cumulative.patch, 02c.patch, 02c.patch, 
> 02-cumulative.patch
>
>
> Encapsulate connections to peers into the {{Peer}} and {{PeerServer}} 
> classes.  Since many Java classes may be involved with these connections, it 
> makes sense to create a container for them.  For example, a connection to a 
> peer may have an input stream, output stream, ReadableByteChannel, encrypted 
> output stream, and encrypted input stream associated with it.
> This makes us less dependent on the {{NetUtils}} methods which use 
> {{instanceof}} to manipulate socket and stream states based on the runtime 
> type.  It also paves the way to introduce UNIX domain sockets which don't 
> inherit from {{java.net.Socket}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4353) Encapsulate connections to peers in Peer and PeerServer classes

2013-01-04 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-4353:
---

Attachment: 02c.patch

Resubmitting because the previous build failed for an unrelated reason.  The 
error message was:

{code}
[exec] /usr/bin/ld: cannot find -lstdc++
{code}

which is not a problem with this patch.

> Encapsulate connections to peers in Peer and PeerServer classes
> ---
>
> Key: HDFS-4353
> URL: https://issues.apache.org/jira/browse/HDFS-4353
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, hdfs-client
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: 02b-cumulative.patch, 02c.patch, 02c.patch, 
> 02-cumulative.patch
>
>
> Encapsulate connections to peers into the {{Peer}} and {{PeerServer}} 
> classes.  Since many Java classes may be involved with these connections, it 
> makes sense to create a container for them.  For example, a connection to a 
> peer may have an input stream, output stream, ReadableByteChannel, encrypted 
> output stream, and encrypted input stream associated with it.
> This makes us less dependent on the {{NetUtils}} methods which use 
> {{instanceof}} to manipulate socket and stream states based on the runtime 
> type.  It also paves the way to introduce UNIX domain sockets which don't 
> inherit from {{java.net.Socket}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3970) BlockPoolSliceStorage#doRollback(..) should use BlockPoolSliceStorage instead of DataStorage to read prev version file.

2013-01-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544267#comment-13544267
 ] 

Todd Lipcon commented on HDFS-3970:
---

thanks for the test, Andrew. +1 pending Jenkins

> BlockPoolSliceStorage#doRollback(..) should use BlockPoolSliceStorage instead 
> of DataStorage to read prev version file.
> ---
>
> Key: HDFS-3970
> URL: https://issues.apache.org/jira/browse/HDFS-3970
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Vinay
>Assignee: Vinay
> Attachments: hdfs-3970-1.patch, HDFS-3970.patch
>
>
> {code}// read attributes out of the VERSION file of previous directory
> DataStorage prevInfo = new DataStorage();
> prevInfo.readPreviousVersionProperties(bpSd);{code}
> In the above code snippet a BlockPoolSliceStorage instance should be used. 
> Otherwise, rollback results in the 'storageType' property missing, since it is 
> not present in the initial VERSION file.
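
The corrected snippet would look roughly as follows (a sketch, assuming BlockPoolSliceStorage can be constructed at this point in doRollback):

{code}
// Use the block-pool-level storage class so its own properties are read back correctly.
BlockPoolSliceStorage prevInfo = new BlockPoolSliceStorage();
prevInfo.readPreviousVersionProperties(bpSd);
{code}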

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4230) Listing all the current snapshottable directories

2013-01-04 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-4230:


Attachment: HDFS-4230.004.patch

New patch uploaded which puts HdfsFileStatus as a field of 
SnapshottableDirectoryStatus to address Nicholas's comments.

> Listing all the current snapshottable directories
> -
>
> Key: HDFS-4230
> URL: https://issues.apache.org/jira/browse/HDFS-4230
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-4230.001.patch, HDFS-4230.001.patch, 
> HDFS-4230.002.patch, HDFS-4230.003.patch, HDFS-4230.004.patch
>
>
> Provide functionality to give the user metadata about all the 
> snapshottable directories.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4353) Encapsulate connections to peers in Peer and PeerServer classes

2013-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544189#comment-13544189
 ] 

Hadoop QA commented on HDFS-4353:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563247/02c.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified test files.

{color:red}-1 javac{color:red}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3734//console

This message is automatically generated.

> Encapsulate connections to peers in Peer and PeerServer classes
> ---
>
> Key: HDFS-4353
> URL: https://issues.apache.org/jira/browse/HDFS-4353
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, hdfs-client
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: 02b-cumulative.patch, 02c.patch, 02-cumulative.patch
>
>
> Encapsulate connections to peers into the {{Peer}} and {{PeerServer}} 
> classes.  Since many Java classes may be involved with these connections, it 
> makes sense to create a container for them.  For example, a connection to a 
> peer may have an input stream, output stream, readablebytechannel, encrypted 
> output stream, and encrypted input stream associated with it.
> This makes us less dependent on the {{NetUtils}} methods which use 
> {{instanceof}} to manipulate socket and stream states based on the runtime 
> type.  it also paves the way to introduce UNIX domain sockets which don't 
> inherit from {{java.net.Socket}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4253) block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance

2013-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544178#comment-13544178
 ] 

Hadoop QA commented on HDFS-4253:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563303/hdfs4253-5.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color:red}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3733//console

This message is automatically generated.

> block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance
> -
>
> Key: HDFS-4253
> URL: https://issues.apache.org/jira/browse/HDFS-4253
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hdfs4253-1.txt, hdfs4253-2.txt, hdfs4253-3.txt, 
> hdfs4253-4.txt, hdfs4253-5.txt, hdfs4253.txt
>
>
> When many nodes (10) read from the same block simultaneously, we get 
> asymmetric distribution of read load.  This can result in slow block reads 
> when one replica is serving most of the readers and the other replicas are 
> idle.  The busy DN bottlenecks on its network link.
> This is especially visible with large block sizes and high replica counts (I 
> reproduced the problem with {{-Ddfs.block.size=4294967296}} and replication 
> 5), but the same behavior happens on a small scale with normal-sized blocks 
> and replication=3.
> The root of the problem is in {{NetworkTopology#pseudoSortByDistance}} which 
> explicitly does not try to spread traffic among replicas in a given rack -- 
> it only randomizes usage for off-rack replicas.
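
A minimal, plain-Java illustration of the kind of change implied (not the actual NetworkTopology code): after the replica list is sorted by distance, shuffle each run of equally distant nodes so that readers in the same rack spread their load across the local replicas too.

{code}
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class ReplicaSpread {
  /**
   * Shuffle contiguous runs of nodes that are equally distant from the reader.
   * distances[i] is the already-sorted (non-decreasing) distance of nodes.get(i).
   */
  static <T> void shuffleEqualDistanceRuns(List<T> nodes, int[] distances, Random rand) {
    int runStart = 0;
    for (int i = 1; i <= nodes.size(); i++) {
      if (i == nodes.size() || distances[i] != distances[runStart]) {
        Collections.shuffle(nodes.subList(runStart, i), rand);
        runStart = i;
      }
    }
  }
}
{code}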

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4350) Make enabling of stale marking on read and write paths independent

2013-01-04 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-4350:
--

Attachment: hdfs-4350-1.patch

> Make enabling of stale marking on read and write paths independent
> --
>
> Key: HDFS-4350
> URL: https://issues.apache.org/jira/browse/HDFS-4350
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-4350-1.patch
>
>
> Marking of datanodes as stale for the read and write path was introduced in 
> HDFS-3703 and HDFS-3912 respectively. This is enabled using two new keys, 
> {{DFS_NAMENODE_CHECK_STALE_DATANODE_KEY}} and 
> {{DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_WRITE_KEY}}. However, there is currently 
> a dependency: you cannot enable write marking without also enabling read 
> marking, because the first key enables both checking of staleness and read 
> marking.
> I propose renaming the first key to 
> {{DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_READ_KEY}}, and making checking enabled 
> if either of the keys is set. This will allow read and write marking to be 
> enabled independently.
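As a rough sketch of the proposal (the key strings below are assumptions for this note; the real constants live in {{DFSConfigKeys}}), staleness checking would be derived from either flag rather than from a separate check key:

{code}
import org.apache.hadoop.conf.Configuration;

public class StaleMarkingConfigSketch {
  // Hypothetical key names used only for this illustration.
  static final String AVOID_STALE_FOR_READ  = "dfs.namenode.avoid.read.stale.datanode";
  static final String AVOID_STALE_FOR_WRITE = "dfs.namenode.avoid.write.stale.datanode";

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    boolean avoidStaleForRead  = conf.getBoolean(AVOID_STALE_FOR_READ, false);
    boolean avoidStaleForWrite = conf.getBoolean(AVOID_STALE_FOR_WRITE, false);
    // Staleness needs to be tracked if either the read or the write path acts on it.
    boolean checkStaleness = avoidStaleForRead || avoidStaleForWrite;
    System.out.println("check=" + checkStaleness
        + " read=" + avoidStaleForRead + " write=" + avoidStaleForWrite);
  }
}
{code}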

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4350) Make enabling of stale marking on read and write paths independent

2013-01-04 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-4350:
--

Attachment: (was: hdfs-4350-1.patch)

> Make enabling of stale marking on read and write paths independent
> --
>
> Key: HDFS-4350
> URL: https://issues.apache.org/jira/browse/HDFS-4350
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-4350-1.patch
>
>
> Marking of datanodes as stale for the read and write path was introduced in 
> HDFS-3703 and HDFS-3912 respectively. This is enabled using two new keys, 
> {{DFS_NAMENODE_CHECK_STALE_DATANODE_KEY}} and 
> {{DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_WRITE_KEY}}. However, there is currently 
> a dependency: you cannot enable write marking without also enabling read 
> marking, because the first key enables both checking of staleness and read 
> marking.
> I propose renaming the first key to 
> {{DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_READ_KEY}}, and making checking enabled 
> if either of the keys is set. This will allow read and write marking to be 
> enabled independently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4352) Encapsulate arguments to BlockReaderFactory in a class

2013-01-04 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544162#comment-13544162
 ] 

Colin Patrick McCabe commented on HDFS-4352:


bq. I may miss something. This seems making the code much more confusing: how 
does the caller determine which parameters to set before passing 
BlockReaderFactory.Params? For example, which methods require ioStreamPair and 
which methods do not?

{{ioStreamPair}} is never required, but if it is present it will be used.  I 
agree that this is confusing, but it was like that before-- you just couldn't 
see it because the parameter list was so long!  Most callers passed null for 
this-- by not setting it, they get that automatically now.

By the way, HDFS-4353 gets rid of some {{ioStreamPair}} and combines together 
the {{Socket}} and the {{ioStreamPair}} in a class called {{Peer}}.  So 
hopefully this will make things less confusing by reducing the number of 
parameters that have to be passed.

The general idea behind {{Params}} is that if you don't set a parameter, it 
just becomes some reasonable default.  The only mandatory members are 
{{Socket}}, {{Block}}, and {{Conf}}.  I suppose we could add those 3 parameters 
to the {{Params}} constructor if that seemed clearer.  We also should document 
parameters more fully.
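A standalone sketch of the parameter-object pattern being described, with the three mandatory members taken in the constructor and everything else defaulted; the field names are illustrative, not the actual {{BlockReaderFactory.Params}} members:

{code}
/** Illustrative parameter object: unset fields keep reasonable defaults. */
class BlockReaderParamsSketch {
  // Mandatory members, supplied at construction time.
  private final String peerAddress;
  private final String blockId;
  private final String confName;
  // Optional members with defaults; setters return this so calls can be chained.
  private String clientName = "";
  private boolean verifyChecksum = true;
  private Object ioStreams = null;   // only used if explicitly provided

  BlockReaderParamsSketch(String peerAddress, String blockId, String confName) {
    this.peerAddress = peerAddress;
    this.blockId = blockId;
    this.confName = confName;
  }
  BlockReaderParamsSketch setClientName(String n) { clientName = n; return this; }
  BlockReaderParamsSketch setVerifyChecksum(boolean v) { verifyChecksum = v; return this; }
  BlockReaderParamsSketch setIoStreams(Object s) { ioStreams = s; return this; }

  public static void main(String[] args) {
    // A caller only sets what it cares about; the rest keeps its default.
    BlockReaderParamsSketch p =
        new BlockReaderParamsSketch("127.0.0.1:50010", "blk_1", "hdfs-site");
    p.setClientName("example-client");
    System.out.println(p.clientName + " verify=" + p.verifyChecksum
        + " streams=" + (p.ioStreams != null));
  }
}
{code}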

> Encapsulate arguments to BlockReaderFactory in a class
> --
>
> Key: HDFS-4352
> URL: https://issues.apache.org/jira/browse/HDFS-4352
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0
>
> Attachments: 01b.patch, 01.patch
>
>
> Encapsulate the arguments to BlockReaderFactory in a class to avoid having to 
> pass around 10+ arguments to a few different functions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4244) Support deleting snapshots

2013-01-04 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544158#comment-13544158
 ] 

Aaron T. Myers commented on HDFS-4244:
--

bq. The new patch addressed your comments 1, 2, and 4.

Thanks, Jing. The updated patch looks good in this regard.

bq. I will file a separate jira to handle all the DFSAdmin command related 
issues.

Cool. Please let me know when you do.

The latest patch looks good from my perspective.

> Support deleting snapshots
> --
>
> Key: HDFS-4244
> URL: https://issues.apache.org/jira/browse/HDFS-4244
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-4244.001.patch, HDFS-4244.002.patch, 
> HDFS-4244.003.patch, HDFS-4244.004.patch
>
>
> Provide functionality to delete a snapshot, given the name of the snapshot 
> and the path to the directory where the snapshot was taken.
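For context, a hypothetical client-side call shape for this functionality; the method name and signature below are assumptions about the snapshot branch, not a confirmed API:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DeleteSnapshotSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Hypothetical: delete the snapshot named "s1" taken on /user/data
    // (assumes a deleteSnapshot(Path, String) method on FileSystem).
    fs.deleteSnapshot(new Path("/user/data"), "s1");
  }
}
{code}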

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4351) Fix BlockPlacementPolicyDefault#chooseTarget when avoiding stale nodes

2013-01-04 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544147#comment-13544147
 ] 

Jing Zhao commented on HDFS-4351:
-

I also ran test-patch for Andrew, and the result looks good:

{noformat}
-1 overall.  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 225 new Findbugs (version 
2.0.1) warnings.
{noformat}

> Fix BlockPlacementPolicyDefault#chooseTarget when avoiding stale nodes
> --
>
> Key: HDFS-4351
> URL: https://issues.apache.org/jira/browse/HDFS-4351
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 1.2.0, 3.0.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-4351-2.patch, hdfs-4351-3.patch, hdfs-4351-4.patch, 
> hdfs-4351-branch-1-1.patch, hdfs-4351.patch
>
>
> There's a bug in {{BlockPlacementPolicyDefault#chooseTarget}} with stale node 
> avoidance enabled (HDFS-3912). If a NotEnoughReplicasException is thrown in 
> the call to {{chooseRandom()}}, {{numOfReplicas}} is not updated together 
> with the partial result in {{result}} since it is pass by value. The retry 
> call to {{chooseTarget}} then uses this incorrect value.
> This can be seen if you enable stale node detection for 
> {{TestReplicationPolicy#testChooseTargetWithMoreThanAvaiableNodes()}}.
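Standalone Java illustrating the pass-by-value pitfall described above and one way out (deriving the remaining count from the partial result); the names are illustrative, not the actual {{BlockPlacementPolicyDefault}} code:

{code}
import java.util.ArrayList;
import java.util.List;

public class PassByValueSketch {
  /** Pretends to pick targets but stops part-way, like chooseRandom() throwing. */
  static void choosePartially(int numOfReplicas, List<String> result) {
    result.add("dn1");   // one target chosen before the failure
    numOfReplicas--;     // updates only the local copy; the caller never sees this
  }

  public static void main(String[] args) {
    int numOfReplicas = 3;
    List<String> result = new ArrayList<String>();
    choosePartially(numOfReplicas, result);

    // Buggy retry: numOfReplicas is still 3 even though one target was chosen.
    int buggyRemaining = numOfReplicas;
    // Fixed retry: recompute what is still needed from the partial result.
    int fixedRemaining = 3 - result.size();
    System.out.println("buggy=" + buggyRemaining + " fixed=" + fixedRemaining);
  }
}
{code}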

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4306) PBHelper.convertLocatedBlock miss convert BlockToken

2013-01-04 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544137#comment-13544137
 ] 

Aaron T. Myers commented on HDFS-4306:
--

Thanks for refactoring it, Binglin, but I think there's a little bug in your 
new compare method. Seems like this loop should be comparing the values from 
the two different arrays, {{ei}} and {{ai}}:
{code}
+for (int i = 0; i < ei.length ; i++) {
+  compare(ai[i], ai[i]);
+}
{code}

Otherwise the patch looks good to me.
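For clarity, the corrected loop being suggested would compare corresponding elements of the two arrays:

{code}
+for (int i = 0; i < ei.length ; i++) {
+  compare(ei[i], ai[i]);   // expected vs. actual, not actual vs. itself
+}
{code}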

> PBHelper.convertLocatedBlock miss convert BlockToken
> 
>
> Key: HDFS-4306
> URL: https://issues.apache.org/jira/browse/HDFS-4306
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.2-alpha
>Reporter: Binglin Chang
>Assignee: Binglin Chang
> Attachments: HDFS-4306.patch, HDFS-4306.v2.patch, HDFS-4306.v3.patch
>
>
> PBHelper.convertLocatedBlock (from protobuf array to primitive array) misses 
> converting the BlockToken.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4256) Support for concatenation of files into a single file in branch-1

2013-01-04 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544131#comment-13544131
 ] 

Suresh Srinivas commented on HDFS-4256:
---

+1 for the patch.

> Support for concatenation of files into a single file in branch-1
> -
>
> Key: HDFS-4256
> URL: https://issues.apache.org/jira/browse/HDFS-4256
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: namenode
>Affects Versions: 1.2.0
>Reporter: Suresh Srinivas
>Assignee: Sanjay Radia
> Attachments: HDFS-4256-2.patch, HDFS-4256.patch
>
>
> HDFS-222 added support for concatenation of multiple files in a directory into 
> a single file. This helps several use cases where writes can be parallelized, 
> and several folks have expressed interest in this functionality.
> This jira intends to port the equivalent changes from HDFS-222 into branch-1 so 
> they are available in release 1.2.0.
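For reference, a usage sketch assuming the HDFS-222 style API, i.e. a {{concat(target, sources)}} method on {{DistributedFileSystem}} (treat the exact signature in branch-1 as an assumption):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class ConcatSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    DistributedFileSystem dfs = (DistributedFileSystem) fs;
    // Moves the blocks of the source files onto the end of the target file;
    // all files are expected to live in the same directory.
    Path target = new Path("/data/part-00000");
    Path[] sources = { new Path("/data/part-00001"), new Path("/data/part-00002") };
    dfs.concat(target, sources);
  }
}
{code}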

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4230) Listing all the current snapshottable directories

2013-01-04 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544092#comment-13544092
 ] 

Jing Zhao commented on HDFS-4230:
-

bq. Sure, here or separate JIRA doesn't matter to me. Would you like to file it 
or shall I?

I filed HDFS-4361 for this.

> Listing all the current snapshottable directories
> -
>
> Key: HDFS-4230
> URL: https://issues.apache.org/jira/browse/HDFS-4230
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-4230.001.patch, HDFS-4230.001.patch, 
> HDFS-4230.002.patch, HDFS-4230.003.patch
>
>
> Provide functionality to give the user metadata about all the snapshottable 
> directories.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4361) When listing snapshottable directories, only return those where the user has permission to take snapshots

2013-01-04 Thread Jing Zhao (JIRA)
Jing Zhao created HDFS-4361:
---

 Summary: When listing snapshottable directories, only return those 
where the user has permission to take snapshots
 Key: HDFS-4361
 URL: https://issues.apache.org/jira/browse/HDFS-4361
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4351) Fix BlockPlacementPolicyDefault#chooseTarget when avoiding stale nodes

2013-01-04 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544090#comment-13544090
 ] 

Andrew Wang commented on HDFS-4351:
---

Just wanted to note that I fixed my branch-1 env and the test passes.

> Fix BlockPlacementPolicyDefault#chooseTarget when avoiding stale nodes
> --
>
> Key: HDFS-4351
> URL: https://issues.apache.org/jira/browse/HDFS-4351
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 1.2.0, 3.0.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-4351-2.patch, hdfs-4351-3.patch, hdfs-4351-4.patch, 
> hdfs-4351-branch-1-1.patch, hdfs-4351.patch
>
>
> There's a bug in {{BlockPlacementPolicyDefault#chooseTarget}} with stale node 
> avoidance enabled (HDFS-3912). If a NotEnoughReplicasException is thrown in 
> the call to {{chooseRandom()}}, {{numOfReplicas}} is not updated together 
> with the partial result in {{result}} since it is pass by value. The retry 
> call to {{chooseTarget}} then uses this incorrect value.
> This can be seen if you enable stale node detection for 
> {{TestReplicationPolicy#testChooseTargetWithMoreThanAvaiableNodes()}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4230) Listing all the current snapshottable directories

2013-01-04 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544083#comment-13544083
 ] 

Aaron T. Myers commented on HDFS-4230:
--

bq. It may be even better to list only the snapshottable directories that the 
user has permission to take snapshots of. 

That sounds fine to me as well.

bq. Let's think about it carefully and work on it separately.

Sure, here or separate JIRA doesn't matter to me. Would you like to file it or 
shall I?

> Listing all the current snapshottable directories
> -
>
> Key: HDFS-4230
> URL: https://issues.apache.org/jira/browse/HDFS-4230
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-4230.001.patch, HDFS-4230.001.patch, 
> HDFS-4230.002.patch, HDFS-4230.003.patch
>
>
> Provide functionality to give the user metadata about all the snapshottable 
> directories.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3970) BlockPoolSliceStorage#doRollback(..) should use BlockPoolSliceStorage instead of DataStorage to read prev version file.

2013-01-04 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-3970:
--

Attachment: hdfs-3970-1.patch

I added a test to Vinay's patch which attempts to roll back a block pool. Tested 
by running the test before and after the fix to confirm.

> BlockPoolSliceStorage#doRollback(..) should use BlockPoolSliceStorage instead 
> of DataStorage to read prev version file.
> ---
>
> Key: HDFS-3970
> URL: https://issues.apache.org/jira/browse/HDFS-3970
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Vinay
>Assignee: Vinay
> Attachments: hdfs-3970-1.patch, HDFS-3970.patch
>
>
> {code}// read attributes out of the VERSION file of previous directory
> DataStorage prevInfo = new DataStorage();
> prevInfo.readPreviousVersionProperties(bpSd);{code}
> In the above code snippet a BlockPoolSliceStorage instance should be used; 
> otherwise rollback results in the 'storageType' property missing, since it is 
> not present in the initial VERSION file.
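A sketch of the change the description calls for; the constructor call is an assumption (the real {{BlockPoolSliceStorage}} constructor may take IDs and other parameters):

{code}
// read attributes out of the VERSION file of the previous directory.
// Use the block-pool level storage class rather than DataStorage, so that
// properties absent at this level (such as 'storageType') are handled correctly.
BlockPoolSliceStorage prevInfo = new BlockPoolSliceStorage();  // hypothetical no-arg form
prevInfo.readPreviousVersionProperties(bpSd);
{code}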

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4278) The DFS_BLOCK_ACCESS_TOKEN_ENABLE config should be automatically turned on when security is enabled.

2013-01-04 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544062#comment-13544062
 ] 

Eli Collins commented on HDFS-4278:
---

Or we should ERROR loudly if security is enabled and this config is not set if 
that's easier.

> The DFS_BLOCK_ACCESS_TOKEN_ENABLE config should be automatically turned on 
> when security is enabled.
> 
>
> Key: HDFS-4278
> URL: https://issues.apache.org/jira/browse/HDFS-4278
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>  Labels: newbie
>
> When enabling security, one has to manually enable the config 
> DFS_BLOCK_ACCESS_TOKEN_ENABLE (dfs.block.access.token.enable). Since these 
> two are coupled, we could make it turn itself on automatically if we find 
> security to be enabled.
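A standalone sketch of the coupling being proposed, along with the alternative Eli mentions above; the security check shown here is only a stand-in for whatever the NameNode startup code actually uses:

{code}
import org.apache.hadoop.conf.Configuration;

public class BlockTokenAutoEnableSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    boolean securityEnabled =
        !"simple".equals(conf.get("hadoop.security.authentication", "simple"));
    boolean tokensEnabled = conf.getBoolean("dfs.block.access.token.enable", false);

    if (securityEnabled && !tokensEnabled) {
      // Option A (this issue): turn the setting on automatically.
      conf.setBoolean("dfs.block.access.token.enable", true);
      // Option B (comment above): log an ERROR loudly instead of silently proceeding.
    }
  }
}
{code}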

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4278) The DFS_BLOCK_ACCESS_TOKEN_ENABLE config should be automatically turned on when security is enabled.

2013-01-04 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-4278:
--

Labels: newbie  (was: )

> The DFS_BLOCK_ACCESS_TOKEN_ENABLE config should be automatically turned on 
> when security is enabled.
> 
>
> Key: HDFS-4278
> URL: https://issues.apache.org/jira/browse/HDFS-4278
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>  Labels: newbie
>
> When enabling security, one has to manually enable the config 
> DFS_BLOCK_ACCESS_TOKEN_ENABLE (dfs.block.access.token.enable). Since these 
> two are coupled, we could make it turn itself on automatically if we find 
> security to be enabled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4253) block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance

2013-01-04 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HDFS-4253:


Attachment: hdfs4253-5.txt

New patch implementing cmccabe's suggestion to rely on a stable sort and 
shuffle first.

> block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance
> -
>
> Key: HDFS-4253
> URL: https://issues.apache.org/jira/browse/HDFS-4253
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hdfs4253-1.txt, hdfs4253-2.txt, hdfs4253-3.txt, 
> hdfs4253-4.txt, hdfs4253-5.txt, hdfs4253.txt
>
>
> When many nodes (10) read from the same block simultaneously, we get 
> asymmetric distribution of read load.  This can result in slow block reads 
> when one replica is serving most of the readers and the other replicas are 
> idle.  The busy DN bottlenecks on its network link.
> This is especially visible with large block sizes and high replica counts (I 
> reproduced the problem with {{-Ddfs.block.size=4294967296}} and replication 
> 5), but the same behavior happens on a small scale with normal-sized blocks 
> and replication=3.
> The root of the problem is in {{NetworkTopology#pseudoSortByDistance}} which 
> explicitly does not try to spread traffic among replicas in a given rack -- 
> it only randomizes usage for off-rack replicas.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4253) block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance

2013-01-04 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543950#comment-13543950
 ] 

Andy Isaacson commented on HDFS-4253:
-

bq. Just shuffle the array and then call Arrays.sort with your custom 
comparator. Since Arrays.sort is a stable sort (doesn't re-order equal 
elements) the randomization will still be there.

OK, I think I see your point after squinting at it for a while.  I would find 
that code quite impenetrable and surprising without a comment explaining why, 
though.  And I've taken a shot at writing a good comment and not succeeded.  
Could you suggest wording to explain the subtle approach you've suggested, so 
that others can understand this code in the future?
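A standalone illustration of the suggestion: shuffle first, then apply a stable sort keyed only on distance, so replicas at equal distance keep their randomized relative order (this is a sketch, not the HDFS-4253 patch itself):

{code}
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class ShuffleThenStableSortSketch {
  static class Replica {
    final String node; final int distance;  // 0 = local, 1 = same rack, 2 = off rack
    Replica(String node, int distance) { this.node = node; this.distance = distance; }
    public String toString() { return node + "(d=" + distance + ")"; }
  }

  public static void main(String[] args) {
    List<Replica> replicas = new ArrayList<Replica>();
    replicas.add(new Replica("dn1", 1));
    replicas.add(new Replica("dn2", 1));
    replicas.add(new Replica("dn3", 2));

    // 1) Randomize once.  2) Stable sort by distance: Collections.sort does not
    // reorder equal elements, so equally-close replicas stay in their shuffled
    // order, spreading read load among them while still preferring closer ones.
    Collections.shuffle(replicas);
    Collections.sort(replicas, new Comparator<Replica>() {
      public int compare(Replica a, Replica b) {
        return a.distance - b.distance;   // distances are tiny, so subtraction is safe
      }
    });
    System.out.println(replicas);
  }
}
{code}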

> block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance
> -
>
> Key: HDFS-4253
> URL: https://issues.apache.org/jira/browse/HDFS-4253
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hdfs4253-1.txt, hdfs4253-2.txt, hdfs4253-3.txt, 
> hdfs4253-4.txt, hdfs4253.txt
>
>
> When many nodes (10) read from the same block simultaneously, we get 
> asymmetric distribution of read load.  This can result in slow block reads 
> when one replica is serving most of the readers and the other replicas are 
> idle.  The busy DN bottlenecks on its network link.
> This is especially visible with large block sizes and high replica counts (I 
> reproduced the problem with {{-Ddfs.block.size=4294967296}} and replication 
> 5), but the same behavior happens on a small scale with normal-sized blocks 
> and replication=3.
> The root of the problem is in {{NetworkTopology#pseudoSortByDistance}} which 
> explicitly does not try to spread traffic among replicas in a given rack -- 
> it only randomizes usage for off-rack replicas.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4270) Replications of the highest priority should be allowed to choose a source datanode that has reached its max replication limit

2013-01-04 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated HDFS-4270:


Fix Version/s: 0.23.6

I pulled this into branch-0.23

> Replications of the highest priority should be allowed to choose a source 
> datanode that has reached its max replication limit
> -
>
> Key: HDFS-4270
> URL: https://issues.apache.org/jira/browse/HDFS-4270
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0, 0.23.5
>Reporter: Derek Dagit
>Assignee: Derek Dagit
>Priority: Minor
> Fix For: 3.0.0, 2.0.3-alpha, 0.23.6
>
> Attachments: HDFS-4270-branch-0.23.patch, 
> HDFS-4270-branch-0.23.patch, HDFS-4270.patch, HDFS-4270.patch, 
> HDFS-4270.patch, HDFS-4270.patch
>
>
> Blocks that have been identified as under-replicated are placed on one of 
> several priority queues.  The highest priority queue is essentially reserved 
> for situations in which only one replica of the block exists, meaning it 
> should be replicated ASAP.
> The ReplicationMonitor periodically computes replication work, and a call to 
> BlockManager#chooseUnderReplicatedBlocks selects a given number of 
> under-replicated blocks, choosing blocks from the highest-priority queue 
> first and working down to the lowest priority queue.
> In the subsequent call to BlockManager#computeReplicationWorkForBlocks, a 
> source for the replication is chosen from among datanodes that have an 
> available copy of the block needed.  This is done in 
> BlockManager#chooseSourceDatanode.
> chooseSourceDatanode's job is to choose the datanode for replication.  It 
> chooses a random datanode from the available datanodes that has not reached 
> its replication limit (preferring datanodes that are currently 
> decommissioning).
> However, the priority queue of the block does not inform the logic.  If a 
> datanode holds the last remaining replica of a block and has already reached 
> its replication limit, the node is dismissed outright and the replication is 
> not scheduled.
> In some situations, this could lead to data loss, as the last remaining 
> replica could disappear if an opportunity is not taken to schedule a 
> replication.  It would be better to waive the max replication limit in cases 
> of highest-priority block replication.
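The commit for this issue (quoted further down) describes soft and hard limits; a standalone sketch of that idea, with illustrative limit values rather than the real configuration defaults:

{code}
public class ReplicationSourceLimitSketch {
  static final int SOFT_LIMIT = 2;  // normal per-datanode replication streams
  static final int HARD_LIMIT = 4;  // absolute cap, even for urgent work

  /** Whether a datanode may be used as the replication source. */
  static boolean mayUseAsSource(int activeStreams, boolean highestPriority) {
    if (highestPriority) {
      // Last-replica (highest priority) work may exceed the soft limit...
      return activeStreams < HARD_LIMIT;
    }
    // ...but ordinary replication work must stay under the soft limit.
    return activeStreams < SOFT_LIMIT;
  }

  public static void main(String[] args) {
    System.out.println(mayUseAsSource(3, true));   // true  -- urgent, under the hard limit
    System.out.println(mayUseAsSource(3, false));  // false -- over the soft limit
  }
}
{code}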

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4270) Replications of the highest priority should be allowed to choose a source datanode that has reached its max replication limit

2013-01-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543905#comment-13543905
 ] 

Hudson commented on HDFS-4270:
--

Integrated in Hadoop-Mapreduce-trunk #1305 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1305/])
HDFS-4270. Introduce soft and hard limits for max replication so that 
replications of the highest priority are allowed to choose a source datanode 
that has reached its soft limit but not the hard limit.  Contributed by Derek 
Dagit (Revision 1428739)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428739
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java


> Replications of the highest priority should be allowed to choose a source 
> datanode that has reached its max replication limit
> -
>
> Key: HDFS-4270
> URL: https://issues.apache.org/jira/browse/HDFS-4270
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0, 0.23.5
>Reporter: Derek Dagit
>Assignee: Derek Dagit
>Priority: Minor
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: HDFS-4270-branch-0.23.patch, 
> HDFS-4270-branch-0.23.patch, HDFS-4270.patch, HDFS-4270.patch, 
> HDFS-4270.patch, HDFS-4270.patch
>
>
> Blocks that have been identified as under-replicated are placed on one of 
> several priority queues.  The highest priority queue is essentially reserved 
> for situations in which only one replica of the block exists, meaning it 
> should be replicated ASAP.
> The ReplicationMonitor periodically computes replication work, and a call to 
> BlockManager#chooseUnderReplicatedBlocks selects a given number of 
> under-replicated blocks, choosing blocks from the highest-priority queue 
> first and working down to the lowest priority queue.
> In the subsequent call to BlockManager#computeReplicationWorkForBlocks, a 
> source for the replication is chosen from among datanodes that have an 
> available copy of the block needed.  This is done in 
> BlockManager#chooseSourceDatanode.
> chooseSourceDatanode's job is to choose the datanode for replication.  It 
> chooses a random datanode from the available datanodes that has not reached 
> its replication limit (preferring datanodes that are currently 
> decommissioning).
> However, the priority queue of the block does not inform the logic.  If a 
> datanode holds the last remaining replica of a block and has already reached 
> its replication limit, the node is dismissed outright and the replication is 
> not scheduled.
> In some situations, this could lead to data loss, as the last remaining 
> replica could disappear if an opportunity is not taken to schedule a 
> replication.  It would be better to waive the max replication limit in cases 
> of highest-priority block replication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4302) Precondition in EditLogFileInputStream's length() method is checked too early in NameNode startup, causing fatal exception

2013-01-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543904#comment-13543904
 ] 

Hudson commented on HDFS-4302:
--

Integrated in Hadoop-Mapreduce-trunk #1305 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1305/])
HDFS-4302. Fix fatal exception when starting NameNode with DEBUG logs. 
Contributed by Eugene Koontz. (Revision 1428590)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428590
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java


> Precondition in EditLogFileInputStream's length() method is checked too early 
> in NameNode startup, causing fatal exception
> --
>
> Key: HDFS-4302
> URL: https://issues.apache.org/jira/browse/HDFS-4302
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode
>Reporter: Eugene Koontz
>Assignee: Eugene Koontz
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: HDFS-4302.patch, HDFS-4302.patch
>
>
> When bringing up a namenode in standby mode, where DEBUG is enabled for 
> namenode, the namenode will hit the following code in 
> {{hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java}}:
> {code}
>  if (LOG.isDebugEnabled()) {
>   LOG.debug("edit log length: " + in.length() + ", start txid: "
>   + expectedStartingTxId + ", last txid: " + lastTxId);
> }
> {code}.
> However, if {{in}} has an {{EditLogFileInputStream}} as its {{streams[0]}}, 
> this code is hit before the {{EditLogFileInputStream}}'s {{advertisedSize}} 
> is initialized (before the HTTP client connects to the remote edit log server 
> (i.e. the journal node)). This causes the following precondition to fail in 
> {{EditLogFileInputStream:length()}}:
> {code}
>   Preconditions.checkState(advertisedSize != -1,
>   "must get input stream before length is available");
> {code}
> which shuts down the namenode with the following log messages and stack trace:
> {code}
> 2012-12-11 10:45:33,319 DEBUG ipc.ProtobufRpcEngine 
> (ProtobufRpcEngine.java:invoke(217)) - Call: getEditLogManifest took 88ms
> 2012-12-11 10:45:33,336 DEBUG client.QuorumJournalManager 
> (QuorumJournalManager.java:selectInputStreams(459)) - selectInputStream 
> manifests:
> 172.16.175.1:8485: [[1,3]]
> 2012-12-11 10:45:33,351 DEBUG namenode.FSImage 
> (FSImage.java:loadFSImage(605)) - Planning to load image :
> FSImageFile(file=/tmp/hadoop-data/dfs/name/current/fsimage_000,
>  cpktTxId=000)
> 2012-12-11 10:45:33,351 DEBUG namenode.FSImage 
> (FSImage.java:loadFSImage(607)) - Planning to load edit log stream: 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9
> 2012-12-11 10:45:33,355 INFO  namenode.FSImage (FSImageFormat.java:load(168)) 
> - Loading image file 
> /tmp/hadoop-data/dfs/name/current/fsimage_000 using no 
> compression
> 2012-12-11 10:45:33,355 INFO  namenode.FSImage (FSImageFormat.java:load(171)) 
> - Number of files = 1
> 2012-12-11 10:45:33,356 INFO  namenode.FSImage 
> (FSImageFormat.java:loadFilesUnderConstruction(383)) - Number of files under 
> construction = 0
> 2012-12-11 10:45:33,357 INFO  namenode.FSImage (FSImageFormat.java:load(193)) 
> - Image file of size 119 loaded in 0 seconds.
> 2012-12-11 10:45:33,357 INFO  namenode.FSImage 
> (FSImage.java:loadFSImage(753)) - Loaded image for txid 0 from 
> /tmp/hadoop-data/dfs/name/current/fsimage_000
> 2012-12-11 10:45:33,357 DEBUG namenode.FSImage (FSImage.java:loadEdits(686)) 
> - About to load edits:
>   org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9
> 2012-12-11 10:45:33,359 INFO  namenode.FSImage (FSImage.java:loadEdits(694)) 
> - Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9 
> expecting start txid #1
> 2012-12-11 10:45:33,361 DEBUG ipc.Client (Client.java:stop(1060)) - Stopping 
> client
> 2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:close(1016)) - IPC 
> Client (1462017562) connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 
> from hdfs/eugenes-macbook-pro.lo...@example.com: closed
> 2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:run(848)) - IPC Client 
> (1462017562) connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 from 
> hdfs/eugenes-macbook-pro.lo...@example.com: stopped, remaining connections 0
> 2012-12-11 10:45:33,464 FATAL namenode.NameNode (NameNode.java:main(1224)) - 
> Exception in namenode join
> java.la
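A self-contained illustration of the failure mode described above, not the actual fix committed to FSEditLogLoader: the advertised length only becomes valid after the stream is opened, so logging it earlier trips the precondition:

{code}
public class LazyLengthSketch {
  private long advertisedSize = -1;      // unknown until the stream is opened

  void open() { advertisedSize = 119; }  // e.g. learned from the remote journal node

  long length() {
    if (advertisedSize == -1) {
      throw new IllegalStateException("must get input stream before length is available");
    }
    return advertisedSize;
  }

  public static void main(String[] args) {
    LazyLengthSketch in = new LazyLengthSketch();
    try {
      // Calling length() in a log statement before open() reproduces the crash.
      System.out.println("edit log length: " + in.length());
    } catch (IllegalStateException e) {
      System.out.println("too early: " + e.getMessage());
    }
    in.open();
    System.out.println("edit log length: " + in.length());  // safe after opening
  }
}
{code}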

[jira] [Commented] (HDFS-4346) Refactor INodeId and GenerationStamp

2013-01-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543900#comment-13543900
 ] 

Hudson commented on HDFS-4346:
--

Integrated in Hadoop-Mapreduce-trunk #1305 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1305/])
Add file which was accidentally missed during commit of HDFS-4346. 
(Revision 1428560)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428560
Files : 
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/SequentialNumber.java


> Refactor INodeId and GenerationStamp
> 
>
> Key: HDFS-4346
> URL: https://issues.apache.org/jira/browse/HDFS-4346
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: h4346_20121231.patch, h4346_20130101.patch, 
> h4346_20130102.patch
>
>
> The INodeId and GenerationStamp classes are very similar.  It is better to 
> refactor them for code sharing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4352) Encapsulate arguments to BlockReaderFactory in a class

2013-01-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543896#comment-13543896
 ] 

Hudson commented on HDFS-4352:
--

Integrated in Hadoop-Mapreduce-trunk #1305 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1305/])
HDFS-4352. Encapsulate arguments to BlockReaderFactory in a class. 
Contributed by Colin Patrick McCabe. (Revision 1428729)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428729
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/JspHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/BlockReaderTestUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockTokenWithDFS.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java


> Encapsulate arguments to BlockReaderFactory in a class
> --
>
> Key: HDFS-4352
> URL: https://issues.apache.org/jira/browse/HDFS-4352
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0
>
> Attachments: 01b.patch, 01.patch
>
>
> Encapsulate the arguments to BlockReaderFactory in a class to avoid having to 
> pass around 10+ arguments to a few different functions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4357) After calling replaceSelf, further operations that should be applied on the new INode may be wrongly applied to the original INode

2013-01-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543871#comment-13543871
 ] 

Hudson commented on HDFS-4357:
--

Integrated in Hadoop-Hdfs-Snapshots-Branch-build #60 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-Snapshots-Branch-build/60/])
HDFS-4357. Fix a bug that if an inode is replaced, further INode operations 
should apply to the new inode. Contributed by Jing Zhao (Revision 1428780)

 Result = FAILURE
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428780
Files : 
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-2802.txt
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectoryWithSnapshot.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotTestHelper.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshot.java


> After calling replaceSelf, further operations that should be applied on the 
> new INode may be wrongly applied to the original INode
> --
>
> Key: HDFS-4357
> URL: https://issues.apache.org/jira/browse/HDFS-4357
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: Snapshot (HDFS-2802)
>
> Attachments: HDFS-4357.001.patch, HDFS-4357.002.patch, 
> HDFS-4357.003.patch
>
>
> An example is INode#setModificationTime: if the INode is an instance of 
> INodeDirectory, then after replacing itself with a new 
> INodeDirectoryWithSnapshot, the change of the modification time should happen 
> in the new INodeDirectoryWithSnapshot instead of the original INodeDirectory.
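A standalone sketch of the pattern implied here: an operation that may replace the inode must keep working with the instance it gets back, not with the original; the names are illustrative, not the actual INode code:

{code}
public class ReplaceSelfSketch {
  static class Node {
    long mtime;
    /** May return a replacement instance; callers must continue on the return value. */
    Node recordModification() { return new Node(); }  // e.g. swap in a snapshot-aware copy

    Node setModificationTime(long t) {
      Node n = recordModification();  // possibly replaces "this"
      n.mtime = t;                    // apply the change to the (possibly new) node
      return n;                       // the bug pattern would set this.mtime instead
    }
  }

  public static void main(String[] args) {
    Node original = new Node();
    Node current = original.setModificationTime(42L);
    System.out.println(original.mtime + " vs " + current.mtime);  // 0 vs 42
  }
}
{code}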

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4270) Replications of the highest priority should be allowed to choose a source datanode that has reached its max replication limit

2013-01-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543862#comment-13543862
 ] 

Hudson commented on HDFS-4270:
--

Integrated in Hadoop-Hdfs-trunk #1275 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1275/])
HDFS-4270. Introduce soft and hard limits for max replication so that 
replications of the highest priority are allowed to choose a source datanode 
that has reached its soft limit but not the hard limit.  Contributed by Derek 
Dagit (Revision 1428739)

 Result = FAILURE
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428739
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java


> Replications of the highest priority should be allowed to choose a source 
> datanode that has reached its max replication limit
> -
>
> Key: HDFS-4270
> URL: https://issues.apache.org/jira/browse/HDFS-4270
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0, 0.23.5
>Reporter: Derek Dagit
>Assignee: Derek Dagit
>Priority: Minor
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: HDFS-4270-branch-0.23.patch, 
> HDFS-4270-branch-0.23.patch, HDFS-4270.patch, HDFS-4270.patch, 
> HDFS-4270.patch, HDFS-4270.patch
>
>
> Blocks that have been identified as under-replicated are placed on one of 
> several priority queues.  The highest priority queue is essentially reserved 
> for situations in which only one replica of the block exists, meaning it 
> should be replicated ASAP.
> The ReplicationMonitor periodically computes replication work, and a call to 
> BlockManager#chooseUnderReplicatedBlocks selects a given number of 
> under-replicated blocks, choosing blocks from the highest-priority queue 
> first and working down to the lowest priority queue.
> In the subsequent call to BlockManager#computeReplicationWorkForBlocks, a 
> source for the replication is chosen from among datanodes that have an 
> available copy of the block needed.  This is done in 
> BlockManager#chooseSourceDatanode.
> chooseSourceDatanode's job is to choose the datanode for replication.  It 
> chooses a random datanode from the available datanodes that has not reached 
> its replication limit (preferring datanodes that are currently 
> decommissioning).
> However, the priority queue of the block does not inform the logic.  If a 
> datanode holds the last remaining replica of a block and has already reached 
> its replication limit, the node is dismissed outright and the replication is 
> not scheduled.
> In some situations, this could lead to data loss, as the last remaining 
> replica could disappear if an opportunity is not taken to schedule a 
> replication.  It would be better to waive the max replication limit in cases 
> of highest-priority block replication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4302) Precondition in EditLogFileInputStream's length() method is checked too early in NameNode startup, causing fatal exception

2013-01-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543861#comment-13543861
 ] 

Hudson commented on HDFS-4302:
--

Integrated in Hadoop-Hdfs-trunk #1275 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1275/])
HDFS-4302. Fix fatal exception when starting NameNode with DEBUG logs. 
Contributed by Eugene Koontz. (Revision 1428590)

 Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428590
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java


> Precondition in EditLogFileInputStream's length() method is checked too early 
> in NameNode startup, causing fatal exception
> --
>
> Key: HDFS-4302
> URL: https://issues.apache.org/jira/browse/HDFS-4302
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode
>Reporter: Eugene Koontz
>Assignee: Eugene Koontz
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: HDFS-4302.patch, HDFS-4302.patch
>
>
> When bringing up a namenode in standby mode, where DEBUG is enabled for 
> namenode, the namenode will hit the following code in 
> {{hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java}}:
> {code}
>  if (LOG.isDebugEnabled()) {
>   LOG.debug("edit log length: " + in.length() + ", start txid: "
>   + expectedStartingTxId + ", last txid: " + lastTxId);
> }
> {code}.
> However, if {{in}} has an {{EditLogFileInputStream}} as its {{streams[0]}}, 
> this code is hit before the {{EditLogFileInputStream}}'s {{advertisedSize}} 
> is initialized (before the HTTP client connects to the remote edit log server 
> (i.e. the journal node)). This causes the following precondition to fail in 
> {{EditLogFileInputStream:length()}}:
> {code}
>   Preconditions.checkState(advertisedSize != -1,
>   "must get input stream before length is available");
> {code}
> which shuts down the namenode with the following log messages and stack trace:
> {code}
> 2012-12-11 10:45:33,319 DEBUG ipc.ProtobufRpcEngine 
> (ProtobufRpcEngine.java:invoke(217)) - Call: getEditLogManifest took 88ms
> 2012-12-11 10:45:33,336 DEBUG client.QuorumJournalManager 
> (QuorumJournalManager.java:selectInputStreams(459)) - selectInputStream 
> manifests:
> 172.16.175.1:8485: [[1,3]]
> 2012-12-11 10:45:33,351 DEBUG namenode.FSImage 
> (FSImage.java:loadFSImage(605)) - Planning to load image :
> FSImageFile(file=/tmp/hadoop-data/dfs/name/current/fsimage_000,
>  cpktTxId=000)
> 2012-12-11 10:45:33,351 DEBUG namenode.FSImage 
> (FSImage.java:loadFSImage(607)) - Planning to load edit log stream: 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9
> 2012-12-11 10:45:33,355 INFO  namenode.FSImage (FSImageFormat.java:load(168)) 
> - Loading image file 
> /tmp/hadoop-data/dfs/name/current/fsimage_000 using no 
> compression
> 2012-12-11 10:45:33,355 INFO  namenode.FSImage (FSImageFormat.java:load(171)) 
> - Number of files = 1
> 2012-12-11 10:45:33,356 INFO  namenode.FSImage 
> (FSImageFormat.java:loadFilesUnderConstruction(383)) - Number of files under 
> construction = 0
> 2012-12-11 10:45:33,357 INFO  namenode.FSImage (FSImageFormat.java:load(193)) 
> - Image file of size 119 loaded in 0 seconds.
> 2012-12-11 10:45:33,357 INFO  namenode.FSImage 
> (FSImage.java:loadFSImage(753)) - Loaded image for txid 0 from 
> /tmp/hadoop-data/dfs/name/current/fsimage_000
> 2012-12-11 10:45:33,357 DEBUG namenode.FSImage (FSImage.java:loadEdits(686)) 
> - About to load edits:
>   org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9
> 2012-12-11 10:45:33,359 INFO  namenode.FSImage (FSImage.java:loadEdits(694)) 
> - Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9 
> expecting start txid #1
> 2012-12-11 10:45:33,361 DEBUG ipc.Client (Client.java:stop(1060)) - Stopping 
> client
> 2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:close(1016)) - IPC 
> Client (1462017562) connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 
> from hdfs/eugenes-macbook-pro.lo...@example.com: closed
> 2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:run(848)) - IPC Client 
> (1462017562) connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 from 
> hdfs/eugenes-macbook-pro.lo...@example.com: stopped, remaining connections 0
> 2012-12-11 10:45:33,464 FATAL namenode.NameNode (NameNode.java:main(1224)) - 
> Exception in namenode join
> java.lang.Illegal

[jira] [Commented] (HDFS-4346) Refactor INodeId and GenerationStamp

2013-01-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543856#comment-13543856
 ] 

Hudson commented on HDFS-4346:
--

Integrated in Hadoop-Hdfs-trunk #1275 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1275/])
Add file which was accidentally missed during commit of HDFS-4346. 
(Revision 1428560)

 Result = FAILURE
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428560
Files : 
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/SequentialNumber.java


> Refactor INodeId and GenerationStamp
> 
>
> Key: HDFS-4346
> URL: https://issues.apache.org/jira/browse/HDFS-4346
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: h4346_20121231.patch, h4346_20130101.patch, 
> h4346_20130102.patch
>
>
> The INodeId and GenerationStamp classes are very similar.  It is better to 
> refactor them for code sharing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4352) Encapsulate arguments to BlockReaderFactory in a class

2013-01-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543852#comment-13543852
 ] 

Hudson commented on HDFS-4352:
--

Integrated in Hadoop-Hdfs-trunk #1275 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1275/])
HDFS-4352. Encapsulate arguments to BlockReaderFactory in a class. 
Contributed by Colin Patrick McCabe. (Revision 1428729)

 Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428729
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/JspHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/BlockReaderTestUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockTokenWithDFS.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java


> Encapsulate arguments to BlockReaderFactory in a class
> --
>
> Key: HDFS-4352
> URL: https://issues.apache.org/jira/browse/HDFS-4352
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0
>
> Attachments: 01b.patch, 01.patch
>
>
> Encapsulate the arguments to BlockReaderFactory in a class to avoid having to 
> pass around 10+ arguments to a few different functions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4227) Document dfs.namenode.resource.*

2013-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543814#comment-13543814
 ] 

Hadoop QA commented on HDFS-4227:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563251/hdfs-4227-2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestLeaseRecovery2
  org.apache.hadoop.hdfs.TestFileAppend2
  org.apache.hadoop.hdfs.TestFileAppend3
  org.apache.hadoop.hdfs.TestDFSRemove
  org.apache.hadoop.hdfs.security.TestDelegationToken
  org.apache.hadoop.hdfs.qjournal.server.TestJournalNode

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3730//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3730//console

This message is automatically generated.

> Document dfs.namenode.resource.*  
> --
>
> Key: HDFS-4227
> URL: https://issues.apache.org/jira/browse/HDFS-4227
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 1.0.0, 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Daisuke Kobayashi
>  Labels: newbie
> Attachments: hdfs-4227-1.patch, hdfs-4227-2.patch, HDFS-4227.patch
>
>
> Let's document {{dfs.namenode.resource.*}} in hdfs-default.xml and add a section 
> in the HDFS docs that covers local directories.
> {{dfs.namenode.resource.check.interval}} - the interval in ms at which the 
> NameNode resource checker runs (default is 5000)
> {{dfs.namenode.resource.du.reserved}} - the amount of space to 
> reserve/require for a NN storage directory (default is 100mb)
> {{dfs.namenode.resource.checked.volumes}} - a list of local directories for 
> the NN resource checker to check in addition to the local edits directories 
> (default is empty).
> {{dfs.namenode.resource.checked.volumes.minimum}} - the minimum number of 
> redundant NN storage volumes required (default is 1). If no redundant 
> resources are available we don't enter SM if there are sufficient required 
> resources.
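
As an illustration only (the extra directory path below is a hypothetical example, and in practice these keys belong in the NameNode's hdfs-site.xml rather than client code), a minimal sketch of setting the four keys through the {{Configuration}} API with the defaults listed above:

{code}
import org.apache.hadoop.conf.Configuration;

public class NameNodeResourceConfigSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Run the NameNode resource checker every 5 seconds (the stated default).
    conf.setLong("dfs.namenode.resource.check.interval", 5000L);
    // Require 100 MB of free space per NN storage directory (the stated default).
    conf.setLong("dfs.namenode.resource.du.reserved", 100L * 1024 * 1024);
    // Hypothetical extra local directory for the checker, beyond the edits dirs.
    conf.set("dfs.namenode.resource.checked.volumes", "/data/1/nn-extra");
    // Require at least one redundant NN storage volume (the stated default).
    conf.setInt("dfs.namenode.resource.checked.volumes.minimum", 1);

    System.out.println("resource check interval = "
        + conf.getLong("dfs.namenode.resource.check.interval", 5000L) + " ms");
  }
}
{code}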

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-4357) After calling replaceSelf, further operations that should be applied on the new INode may be wrongly applied to the original INode

2013-01-04 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE resolved HDFS-4357.
--

   Resolution: Fixed
Fix Version/s: Snapshot (HDFS-2802)

I have committed this.  Thanks, Jing!

> After calling replaceSelf, further operations that should be applied on the 
> new INode may be wrongly applied to the original INode
> --
>
> Key: HDFS-4357
> URL: https://issues.apache.org/jira/browse/HDFS-4357
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: Snapshot (HDFS-2802)
>
> Attachments: HDFS-4357.001.patch, HDFS-4357.002.patch, 
> HDFS-4357.003.patch
>
>
> An example is in INode#setModificationTime, if the INode is an instance of 
> INodeDirectory, after replacing itself with a new INodeDirectoryWithSnapshot, 
> the change of the modification time should happen in the new 
> INodeDirectoryWithSnapshot instead of the original INodeDirectory.
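
A minimal sketch of the bug pattern and the fix, using hypothetical Parent/Child classes rather than the real INode hierarchy: once a node has replaced itself, any follow-up mutation must be applied to the object returned by the replacement step, because that is the object the parent now references.

{code}
import java.util.HashMap;
import java.util.Map;

class Parent {
  final Map<String, Child> children = new HashMap<>();
}

class Child {
  final String name;
  final Parent parent;
  long mtime;

  Child(String name, Parent parent) {
    this.name = name;
    this.parent = parent;
    parent.children.put(name, this);   // register (or re-register) under the same name
  }

  /** Swap this child for a fresh instance in the parent and return the replacement. */
  Child replaceSelf() {
    Child replacement = new Child(name, parent);
    replacement.mtime = this.mtime;
    return replacement;
  }

  void setModificationTime(long newMtime) {
    Child target = replaceSelf();   // e.g. upgrading to a snapshot-aware node
    target.mtime = newMtime;        // correct: mutate the replacement the parent now holds
    // this.mtime = newMtime;       // bug pattern: the update never reaches the parent's node
  }
}

public class ReplaceSelfSketch {
  public static void main(String[] args) {
    Parent dir = new Parent();
    new Child("f", dir).setModificationTime(42L);
    System.out.println("mtime seen via parent: " + dir.children.get("f").mtime);  // 42
  }
}
{code}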

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4357) After calling replaceSelf, further operations that should be applied on the new INode may be wrongly applied to the original INode

2013-01-04 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-4357:
-

 Component/s: (was: datanode)
Hadoop Flags: Reviewed

+1 patch looks good.

> After calling replaceSelf, further operations that should be applied on the 
> new INode may be wrongly applied to the original INode
> --
>
> Key: HDFS-4357
> URL: https://issues.apache.org/jira/browse/HDFS-4357
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-4357.001.patch, HDFS-4357.002.patch, 
> HDFS-4357.003.patch
>
>
> An example is in INode#setModificationTime, if the INode is an instance of 
> INodeDirectory, after replacing itself with a new INodeDirectoryWithSnapshot, 
> the change of the modification time should happen in the new 
> INodeDirectoryWithSnapshot instead of the original INodeDirectory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4270) Replications of the highest priority should be allowed to choose a source datanode that has reached its max replication limit

2013-01-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543771#comment-13543771
 ] 

Hudson commented on HDFS-4270:
--

Integrated in Hadoop-Yarn-trunk #86 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/86/])
HDFS-4270. Introduce soft and hard limits for max replication so that 
replications of the highest priority are allowed to choose a source datanode 
that has reached its soft limit but not the hard limit.  Contributed by Derek 
Dagit (Revision 1428739)

 Result = FAILURE
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428739
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java


> Replications of the highest priority should be allowed to choose a source 
> datanode that has reached its max replication limit
> -
>
> Key: HDFS-4270
> URL: https://issues.apache.org/jira/browse/HDFS-4270
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0, 0.23.5
>Reporter: Derek Dagit
>Assignee: Derek Dagit
>Priority: Minor
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: HDFS-4270-branch-0.23.patch, 
> HDFS-4270-branch-0.23.patch, HDFS-4270.patch, HDFS-4270.patch, 
> HDFS-4270.patch, HDFS-4270.patch
>
>
> Blocks that have been identified as under-replicated are placed on one of 
> several priority queues.  The highest priority queue is essentially reserved 
> for situations in which only one replica of the block exists, meaning it 
> should be replicated ASAP.
> The ReplicationMonitor periodically computes replication work, and a call to 
> BlockManager#chooseUnderReplicatedBlocks selects a given number of 
> under-replicated blocks, choosing blocks from the highest-priority queue 
> first and working down to the lowest priority queue.
> In the subsequent call to BlockManager#computeReplicationWorkForBlocks, a 
> source for the replication is chosen from among datanodes that have an 
> available copy of the block needed.  This is done in 
> BlockManager#chooseSourceDatanode.
> chooseSourceDatanode's job is to choose the datanode for replication.  It 
> chooses a random datanode from the available datanodes that has not reached 
> its replication limit (preferring datanodes that are currently 
> decommissioning).
> However, the priority queue of the block does not inform the logic.  If a 
> datanode holds the last remaining replica of a block and has already reached 
> its replication limit, the node is dismissed outright and the replication is 
> not scheduled.
> In some situations, this could lead to data loss, as the last remaining 
> replica could disappear if an opportunity is not taken to schedule a 
> replication.  It would be better to waive the max replication limit in cases 
> of highest-priority block replication.
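
A minimal sketch of the waiver idea, with hypothetical names and limit values rather than the committed BlockManager code: only highest-priority (last-replica) work may select a source that has passed the soft limit, and nothing may pass the hard limit.

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class SourceChooserSketch {
  // Hypothetical stand-ins for the soft and hard replication-stream limits.
  static final int SOFT_LIMIT = 2;
  static final int HARD_LIMIT = 4;

  static class Node {
    final String name;
    final int activeReplicationStreams;
    Node(String name, int streams) { this.name = name; this.activeReplicationStreams = streams; }
  }

  /** Pick a random eligible source; only highest-priority work may exceed the soft limit. */
  static Node chooseSource(List<Node> holders, boolean highestPriority, Random rnd) {
    int limit = highestPriority ? HARD_LIMIT : SOFT_LIMIT;
    List<Node> eligible = new ArrayList<>();
    for (Node n : holders) {
      if (n.activeReplicationStreams < limit) {
        eligible.add(n);
      }
    }
    return eligible.isEmpty() ? null : eligible.get(rnd.nextInt(eligible.size()));
  }

  public static void main(String[] args) {
    // A node holding the last replica, already past the soft limit.
    List<Node> holders = List.of(new Node("dn1", 3));
    Node normal = chooseSource(holders, false, new Random(0));
    Node urgent = chooseSource(holders, true, new Random(0));
    System.out.println("normal priority source: " + (normal == null ? "none" : normal.name));
    System.out.println("highest priority source: " + (urgent == null ? "none" : urgent.name));
  }
}
{code}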

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4302) Precondition in EditLogFileInputStream's length() method is checked too early in NameNode startup, causing fatal exception

2013-01-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543770#comment-13543770
 ] 

Hudson commented on HDFS-4302:
--

Integrated in Hadoop-Yarn-trunk #86 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/86/])
HDFS-4302. Fix fatal exception when starting NameNode with DEBUG logs. 
Contributed by Eugene Koontz. (Revision 1428590)

 Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428590
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java


> Precondition in EditLogFileInputStream's length() method is checked too early 
> in NameNode startup, causing fatal exception
> --
>
> Key: HDFS-4302
> URL: https://issues.apache.org/jira/browse/HDFS-4302
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode
>Reporter: Eugene Koontz
>Assignee: Eugene Koontz
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: HDFS-4302.patch, HDFS-4302.patch
>
>
> When bringing up a namenode in standby mode, where DEBUG is enabled for 
> namenode, the namenode will hit the following code in 
> {{hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java}}:
> {code}
>  if (LOG.isDebugEnabled()) {
>    LOG.debug("edit log length: " + in.length() + ", start txid: "
>        + expectedStartingTxId + ", last txid: " + lastTxId);
>  }
> {code}
> However, if {{in}} has an {{EditLogFileInputStream}} as its {{streams[0]}}, 
> this code is hit before the {{EditLogFileInputStream}}'s {{advertisedSize}} 
> is initialized (before the HTTP client connects to the remote edit log server 
> (i.e. the journal node)). This causes the following precondition to fail in 
> {{EditLogFileInputStream#length()}}:
> {code}
>   Preconditions.checkState(advertisedSize != -1,
>   "must get input stream before length is available");
> {code}
> which shuts down the namenode with the following log messages and stack trace:
> {code}
> 2012-12-11 10:45:33,319 DEBUG ipc.ProtobufRpcEngine 
> (ProtobufRpcEngine.java:invoke(217)) - Call: getEditLogManifest took 88ms
> 2012-12-11 10:45:33,336 DEBUG client.QuorumJournalManager 
> (QuorumJournalManager.java:selectInputStreams(459)) - selectInputStream 
> manifests:
> 172.16.175.1:8485: [[1,3]]
> 2012-12-11 10:45:33,351 DEBUG namenode.FSImage 
> (FSImage.java:loadFSImage(605)) - Planning to load image :
> FSImageFile(file=/tmp/hadoop-data/dfs/name/current/fsimage_000,
>  cpktTxId=000)
> 2012-12-11 10:45:33,351 DEBUG namenode.FSImage 
> (FSImage.java:loadFSImage(607)) - Planning to load edit log stream: 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9
> 2012-12-11 10:45:33,355 INFO  namenode.FSImage (FSImageFormat.java:load(168)) 
> - Loading image file 
> /tmp/hadoop-data/dfs/name/current/fsimage_000 using no 
> compression
> 2012-12-11 10:45:33,355 INFO  namenode.FSImage (FSImageFormat.java:load(171)) 
> - Number of files = 1
> 2012-12-11 10:45:33,356 INFO  namenode.FSImage 
> (FSImageFormat.java:loadFilesUnderConstruction(383)) - Number of files under 
> construction = 0
> 2012-12-11 10:45:33,357 INFO  namenode.FSImage (FSImageFormat.java:load(193)) 
> - Image file of size 119 loaded in 0 seconds.
> 2012-12-11 10:45:33,357 INFO  namenode.FSImage 
> (FSImage.java:loadFSImage(753)) - Loaded image for txid 0 from 
> /tmp/hadoop-data/dfs/name/current/fsimage_000
> 2012-12-11 10:45:33,357 DEBUG namenode.FSImage (FSImage.java:loadEdits(686)) 
> - About to load edits:
>   org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9
> 2012-12-11 10:45:33,359 INFO  namenode.FSImage (FSImage.java:loadEdits(694)) 
> - Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9 
> expecting start txid #1
> 2012-12-11 10:45:33,361 DEBUG ipc.Client (Client.java:stop(1060)) - Stopping 
> client
> 2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:close(1016)) - IPC 
> Client (1462017562) connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 
> from hdfs/eugenes-macbook-pro.lo...@example.com: closed
> 2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:run(848)) - IPC Client 
> (1462017562) connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 from 
> hdfs/eugenes-macbook-pro.lo...@example.com: stopped, remaining connections 0
> 2012-12-11 10:45:33,464 FATAL namenode.NameNode (NameNode.java:main(1224)) - 
> Exception in namenode join
> java.lang.IllegalStat

[jira] [Commented] (HDFS-4346) Refactor INodeId and GenerationStamp

2013-01-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543765#comment-13543765
 ] 

Hudson commented on HDFS-4346:
--

Integrated in Hadoop-Yarn-trunk #86 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/86/])
Add file which was accidentally missed during commit of HDFS-4346. 
(Revision 1428560)

 Result = FAILURE
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428560
Files : 
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/SequentialNumber.java


> Refactor INodeId and GenerationStamp
> 
>
> Key: HDFS-4346
> URL: https://issues.apache.org/jira/browse/HDFS-4346
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: h4346_20121231.patch, h4346_20130101.patch, 
> h4346_20130102.patch
>
>
> The INodeId and GenerationStamp classes are very similar.  It is better to 
> refactor them for code sharing.
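
A hedged sketch of the shared-base idea (the actual org.apache.hadoop.util.SequentialNumber API may differ): both counters are monotonically increasing long values, so a single thread-safe base class can back them; the starting values below are placeholders, not the real HDFS constants.

{code}
import java.util.concurrent.atomic.AtomicLong;

abstract class SequentialNumberSketch {
  private final AtomicLong currentValue;

  protected SequentialNumberSketch(long initialValue) {
    this.currentValue = new AtomicLong(initialValue);
  }

  public long getCurrentValue() { return currentValue.get(); }

  public void setCurrentValue(long value) { currentValue.set(value); }

  public long nextValue() { return currentValue.incrementAndGet(); }
}

// Hypothetical counterparts of the two refactored counters.
class InodeIdSketch extends SequentialNumberSketch {
  InodeIdSketch() { super(1000L); }
}

class GenerationStampSketch extends SequentialNumberSketch {
  GenerationStampSketch() { super(1000L); }
}

public class SequentialNumberDemo {
  public static void main(String[] args) {
    InodeIdSketch ids = new InodeIdSketch();
    GenerationStampSketch stamps = new GenerationStampSketch();
    System.out.println("next inode id: " + ids.nextValue());             // 1001
    System.out.println("next generation stamp: " + stamps.nextValue());  // 1001
  }
}
{code}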

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4352) Encapsulate arguments to BlockReaderFactory in a class

2013-01-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543761#comment-13543761
 ] 

Hudson commented on HDFS-4352:
--

Integrated in Hadoop-Yarn-trunk #86 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/86/])
HDFS-4352. Encapsulate arguments to BlockReaderFactory in a class. 
Contributed by Colin Patrick McCabe. (Revision 1428729)

 Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428729
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/JspHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/BlockReaderTestUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockTokenWithDFS.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java


> Encapsulate arguments to BlockReaderFactory in a class
> --
>
> Key: HDFS-4352
> URL: https://issues.apache.org/jira/browse/HDFS-4352
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0
>
> Attachments: 01b.patch, 01.patch
>
>
> Encapsulate the arguments to BlockReaderFactory in a class to avoid having to 
> pass around 10+ arguments to a few different functions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4352) Encapsulate arguments to BlockReaderFactory in a class

2013-01-04 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543752#comment-13543752
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-4352:
--

I may be missing something.  This seems to make the code much more confusing: how 
does the caller determine which parameters to set before passing 
BlockReaderFactory.Params?  For example, which methods require ioStreamPair and 
which methods do not?
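
For context, a hedged sketch of the parameter-object shape under discussion, with hypothetical field names rather than the real BlockReaderFactory.Params: optional inputs such as an already-open stream pair default to null, and each factory method is expected to document which fields it reads and fail fast when a required one was left unset.

{code}
import java.io.DataInputStream;
import java.io.DataOutputStream;

public class BlockReaderParamsSketch {
  // Required by every call site.
  String fileName;
  long startOffset;
  long length;
  // Optional; call sites that do not have them simply leave them unset.
  boolean verifyChecksum = true;
  DataInputStream in;
  DataOutputStream out;

  BlockReaderParamsSketch setFile(String fileName) {
    this.fileName = fileName;
    return this;
  }

  BlockReaderParamsSketch setRange(long startOffset, long length) {
    this.startOffset = startOffset;
    this.length = length;
    return this;
  }

  BlockReaderParamsSketch setVerifyChecksum(boolean verifyChecksum) {
    this.verifyChecksum = verifyChecksum;
    return this;
  }

  BlockReaderParamsSketch setStreams(DataInputStream in, DataOutputStream out) {
    this.in = in;
    this.out = out;
    return this;
  }

  public static void main(String[] args) {
    // A call site with no pre-opened streams just omits setStreams(..).
    BlockReaderParamsSketch p = new BlockReaderParamsSketch()
        .setFile("/user/demo/part-00000")   // hypothetical path
        .setRange(0L, 4096L)
        .setVerifyChecksum(true);
    System.out.println("would read " + p.length + " bytes of " + p.fileName);
  }
}
{code}

Under that convention, a call site that never holds an open stream pair would leave setStreams(..) out entirely, which is one possible answer to the question above.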

> Encapsulate arguments to BlockReaderFactory in a class
> --
>
> Key: HDFS-4352
> URL: https://issues.apache.org/jira/browse/HDFS-4352
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0
>
> Attachments: 01b.patch, 01.patch
>
>
> Encapsulate the arguments to BlockReaderFactory in a class to avoid having to 
> pass around 10+ arguments to a few different functions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4357) After calling replaceSelf, further operations that should be applied on the new INode may be wrongly applied to the original INode

2013-01-04 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-4357:


Attachment: HDFS-4357.003.patch

Simplified code in the new patch.

> After calling replaceSelf, further operations that should be applied on the 
> new INode may be wrongly applied to the original INode
> --
>
> Key: HDFS-4357
> URL: https://issues.apache.org/jira/browse/HDFS-4357
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-4357.001.patch, HDFS-4357.002.patch, 
> HDFS-4357.003.patch
>
>
> An example is in INode#setModificationTime, if the INode is an instance of 
> INodeDirectory, after replacing itself with a new INodeDirectoryWithSnapshot, 
> the change of the modification time should happen in the new 
> INodeDirectoryWithSnapshot instead of the original INodeDirectory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4357) After calling replaceSelf, further operations that should be applied on the new INode may be wrongly applied to the original INode

2013-01-04 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-4357:


Attachment: HDFS-4357.002.patch

Integrated new test cases that update directory metadata into 
TestSnapshot#testSnapshot. Also fixed another similar bug in setOwner(..).

> After calling replaceSelf, further operations that should be applied on the 
> new INode may be wrongly applied to the original INode
> --
>
> Key: HDFS-4357
> URL: https://issues.apache.org/jira/browse/HDFS-4357
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-4357.001.patch, HDFS-4357.002.patch
>
>
> An example is in INode#setModificationTime, if the INode is an instance of 
> INodeDirectory, after replacing itself with a new INodeDirectoryWithSnapshot, 
> the change of the modification time should happen in the new 
> INodeDirectoryWithSnapshot instead of the original INodeDirectory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4270) Replications of the highest priority should be allowed to choose a source datanode that has reached its max replication limit

2013-01-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543695#comment-13543695
 ] 

Hudson commented on HDFS-4270:
--

Integrated in Hadoop-trunk-Commit #3174 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3174/])
HDFS-4270. Introduce soft and hard limits for max replication so that 
replications of the highest priority are allowed to choose a source datanode 
that has reached its soft limit but not the hard limit.  Contributed by Derek 
Dagit (Revision 1428739)

 Result = FAILURE
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428739
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java


> Replications of the highest priority should be allowed to choose a source 
> datanode that has reached its max replication limit
> -
>
> Key: HDFS-4270
> URL: https://issues.apache.org/jira/browse/HDFS-4270
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0, 0.23.5
>Reporter: Derek Dagit
>Assignee: Derek Dagit
>Priority: Minor
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: HDFS-4270-branch-0.23.patch, 
> HDFS-4270-branch-0.23.patch, HDFS-4270.patch, HDFS-4270.patch, 
> HDFS-4270.patch, HDFS-4270.patch
>
>
> Blocks that have been identified as under-replicated are placed on one of 
> several priority queues.  The highest priority queue is essentially reserved 
> for situations in which only one replica of the block exists, meaning it 
> should be replicated ASAP.
> The ReplicationMonitor periodically computes replication work, and a call to 
> BlockManager#chooseUnderReplicatedBlocks selects a given number of 
> under-replicated blocks, choosing blocks from the highest-priority queue 
> first and working down to the lowest priority queue.
> In the subsequent call to BlockManager#computeReplicationWorkForBlocks, a 
> source for the replication is chosen from among datanodes that have an 
> available copy of the block needed.  This is done in 
> BlockManager#chooseSourceDatanode.
> chooseSourceDatanode's job is to choose the datanode for replication.  It 
> chooses a random datanode from the available datanodes that has not reached 
> its replication limit (preferring datanodes that are currently 
> decommissioning).
> However, the priority queue of the block does not inform the logic.  If a 
> datanode holds the last remaining replica of a block and has already reached 
> its replication limit, the node is dismissed outright and the replication is 
> not scheduled.
> In some situations, this could lead to data loss, as the last remaining 
> replica could disappear if an opportunity is not taken to schedule a 
> replication.  It would be better to waive the max replication limit in cases 
> of highest-priority block replication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4270) Replications of the highest priority should be allowed to choose a source datanode that has reached its max replication limit

2013-01-04 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-4270:
-

   Resolution: Fixed
Fix Version/s: 2.0.3-alpha
   3.0.0
   Status: Resolved  (was: Patch Available)

I have committed this.  Thanks, Derek!

> Replications of the highest priority should be allowed to choose a source 
> datanode that has reached its max replication limit
> -
>
> Key: HDFS-4270
> URL: https://issues.apache.org/jira/browse/HDFS-4270
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0, 0.23.5
>Reporter: Derek Dagit
>Assignee: Derek Dagit
>Priority: Minor
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: HDFS-4270-branch-0.23.patch, 
> HDFS-4270-branch-0.23.patch, HDFS-4270.patch, HDFS-4270.patch, 
> HDFS-4270.patch, HDFS-4270.patch
>
>
> Blocks that have been identified as under-replicated are placed on one of 
> several priority queues.  The highest priority queue is essentially reserved 
> for situations in which only one replica of the block exists, meaning it 
> should be replicated ASAP.
> The ReplicationMonitor periodically computes replication work, and a call to 
> BlockManager#chooseUnderReplicatedBlocks selects a given number of 
> under-replicated blocks, choosing blocks from the highest-priority queue 
> first and working down to the lowest priority queue.
> In the subsequent call to BlockManager#computeReplicationWorkForBlocks, a 
> source for the replication is chosen from among datanodes that have an 
> available copy of the block needed.  This is done in 
> BlockManager#chooseSourceDatanode.
> chooseSourceDatanode's job is to choose the datanode for replication.  It 
> chooses a random datanode from the available datanodes that has not reached 
> its replication limit (preferring datanodes that are currently 
> decommissioning).
> However, the priority queue of the block does not inform the logic.  If a 
> datanode holds the last remaining replica of a block and has already reached 
> its replication limit, the node is dismissed outright and the replication is 
> not scheduled.
> In some situations, this could lead to data loss, as the last remaining 
> replica could disappear if an opportunity is not taken to schedule a 
> replication.  It would be better to waive the max replication limit in cases 
> of highest-priority block replication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4227) Document dfs.namenode.resource.*

2013-01-04 Thread Mark Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543687#comment-13543687
 ] 

Mark Yang commented on HDFS-4227:
-

Hi Aaron, thanks for your review.

About the description "If no redundant resources are available we ..", I 
have no idea what it means either, even after checking the source code, 
so I just removed that line.

> Document dfs.namenode.resource.*  
> --
>
> Key: HDFS-4227
> URL: https://issues.apache.org/jira/browse/HDFS-4227
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 1.0.0, 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Daisuke Kobayashi
>  Labels: newbie
> Attachments: hdfs-4227-1.patch, hdfs-4227-2.patch, HDFS-4227.patch
>
>
> Let's document {{dfs.namenode.resource.*}} in hdfs-default.xml and a section 
> in the HDFS docs that covers local directories.
> {{dfs.namenode.resource.check.interval}} - the interval in ms at which the 
> NameNode resource checker runs (default is 5000)
> {{dfs.namenode.resource.du.reserved}} - the amount of space to 
> reserve/require for a NN storage directory (default is 100mb)
> {{dfs.namenode.resource.checked.volumes}} - a list of local directories for 
> the NN resource checker to check in addition to the local edits directories 
> (default is empty).
> {{dfs.namenode.resource.checked.volumes.minimum}} - the minimum number of 
> redundant NN storage volumes required (default is 1). If no redundant 
> resources are available we don't enter SM if there are sufficient required 
> resources.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira