[jira] [Updated] (HDFS-4360) multiple BlockFixer should be supported in order to improve scalability and reduce too much work on single BlockFixer
[ https://issues.apache.org/jira/browse/HDFS-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Jin updated HDFS-4360:
--------------------------
    Status: Open  (was: Patch Available)

> multiple BlockFixer should be supported in order to improve scalability and reduce too much work on single BlockFixer
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-4360
>                 URL: https://issues.apache.org/jira/browse/HDFS-4360
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: contrib/raid
>    Affects Versions: 0.22.0
>            Reporter: Jun Jin
>              Labels: patch
>         Attachments: HDFS-4360.patch
>
> The current implementation can only run a single BlockFixer, since fsck (in RaidDFSUtil.getCorruptFiles) only checks the whole DFS file system. If multiple BlockFixers were launched, they would all do the same work and try to fix the same files.
> The change/fix will mainly be in BlockFixer.java and RaidDFSUtil.getCorruptFiles(), to enable fsck to check the different paths defined in separate Raid.xml files for each RaidNode/BlockFixer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
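The per-path scoping described above can be sketched as follows. This is a simplified, hypothetical model, not the patch itself: the real RaidDFSUtil.getCorruptFiles runs fsck against a DistributedFileSystem, and the path-prefix filter here merely stands in for pointing fsck at the sub-path each BlockFixer's own Raid.xml defines, so that two fixers never claim the same corrupt file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ScopedCorruptFiles {
    // Hypothetical stand-in for RaidDFSUtil.getCorruptFiles: instead of
    // always scanning the whole namespace ("/"), each BlockFixer passes
    // the root path it is responsible for, so the work sets are disjoint.
    static List<String> getCorruptFiles(List<String> allCorrupt, String scopeRoot) {
        List<String> scoped = new ArrayList<>();
        for (String path : allCorrupt) {
            if (path.startsWith(scopeRoot)) {
                scoped.add(path);
            }
        }
        return scoped;
    }

    public static void main(String[] args) {
        List<String> corrupt = Arrays.asList("/a/f1", "/a/f2", "/b/f3");
        // Fixer 1 owns /a, fixer 2 owns /b: no file is fixed twice.
        System.out.println(getCorruptFiles(corrupt, "/a")); // [/a/f1, /a/f2]
        System.out.println(getCorruptFiles(corrupt, "/b")); // [/b/f3]
    }
}
```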
[jira] [Commented] (HDFS-4360) multiple BlockFixer should be supported in order to improve scalability and reduce too much work on single BlockFixer
[ https://issues.apache.org/jira/browse/HDFS-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544615#comment-13544615 ]

Hadoop QA commented on HDFS-4360:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12563403/HDFS-4360.patch
against trunk revision .

    {color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3744//console

This message is automatically generated.
[jira] [Updated] (HDFS-4360) multiple BlockFixer should be supported in order to improve scalability and reduce too much work on single BlockFixer
[ https://issues.apache.org/jira/browse/HDFS-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Jin updated HDFS-4360:
--------------------------
    Description:
        current implementation can only run single BlockFixer since the fsck (in RaidDFSUtil.getCorruptFiles) only check the whole DFS file system. multiple BlockFixer will do the same thing and try to fix same file if multiple BlockFixer launched.
        the change/fix will be mainly in BlockFixer.java and RaidDFSUtil.getCorruptFile(), to enable fsck to check the different paths defined in separated Raid.xml for single RaidNode/BlockFixer

    was:
        current implementation can only run single BlockFixer since the fsck (in RaidDFSUtil.getCorruptFiles) only check the whole DFS file system. multiple BlockFixer will do the same thing and try to fix same file if multiple BlockFixer launched.
        the change/fix will be mainly in RaidNode.java and RaidDFSUtil.getCorruptFile(), to enable fsck to check the different paths defined in separated Raid.xml for single RaidNode/BlockFixer
[jira] [Updated] (HDFS-4360) multiple BlockFixer should be supported in order to improve scalability and reduce too much work on single BlockFixer
[ https://issues.apache.org/jira/browse/HDFS-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Jin updated HDFS-4360:
--------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HDFS-4360) multiple BlockFixer should be supported in order to improve scalability and reduce too much work on single BlockFixer
[ https://issues.apache.org/jira/browse/HDFS-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Jin updated HDFS-4360:
--------------------------
    Attachment: HDFS-4360.patch

first version of patch
[jira] [Updated] (HDFS-4360) multiple BlockFixer should be supported in order to improve scalability and reduce too much work on single BlockFixer
[ https://issues.apache.org/jira/browse/HDFS-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Jin updated HDFS-4360:
--------------------------
    Status: Open  (was: Patch Available)
[jira] [Updated] (HDFS-4360) multiple BlockFixer should be supported in order to improve scalability and reduce too much work on single BlockFixer
[ https://issues.apache.org/jira/browse/HDFS-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Jin updated HDFS-4360:
--------------------------
    Status: Patch Available  (was: Open)
[jira] [Commented] (HDFS-4306) PBHelper.convertLocatedBlock miss convert BlockToken
[ https://issues.apache.org/jira/browse/HDFS-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544577#comment-13544577 ]

Hadoop QA commented on HDFS-4306:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12563391/HDFS-4306.v4.patch
against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
    {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
        org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints
    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3741//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3741//console

This message is automatically generated.

> PBHelper.convertLocatedBlock miss convert BlockToken
> ----------------------------------------------------
>
>                 Key: HDFS-4306
>                 URL: https://issues.apache.org/jira/browse/HDFS-4306
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.2-alpha
>            Reporter: Binglin Chang
>            Assignee: Binglin Chang
>         Attachments: HDFS-4306.patch, HDFS-4306.v2.patch, HDFS-4306.v3.patch, HDFS-4306.v4.patch
>
> PBHelper.convertLocatedBlock (from protobuf array to primitive array) misses converting the BlockToken.
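The kind of bug HDFS-4306 describes can be reduced to a few lines. The classes below are hypothetical miniatures, not the real PBHelper, LocatedBlockProto, or LocatedBlock types; they only show the shape of the fix: an array-conversion helper that must remember to copy the access token along with the block.

```java
public class LocatedBlockConvert {
    // Hypothetical miniatures of the protobuf and native types -- just
    // enough structure to demonstrate a field lost during conversion.
    static class TokenProto { final String id; TokenProto(String id) { this.id = id; } }
    static class LocatedBlockProto {
        final String block; final TokenProto token;
        LocatedBlockProto(String block, TokenProto token) { this.block = block; this.token = token; }
    }
    static class LocatedBlock {
        final String block; String tokenId = "";
        LocatedBlock(String block) { this.block = block; }
        void setBlockToken(String id) { this.tokenId = id; }
    }

    static LocatedBlock[] convert(LocatedBlockProto[] protos) {
        LocatedBlock[] out = new LocatedBlock[protos.length];
        for (int i = 0; i < protos.length; i++) {
            out[i] = new LocatedBlock(protos[i].block);
            // The essence of the fix: without this line, the converted
            // block silently loses its security token.
            out[i].setBlockToken(protos[i].token.id);
        }
        return out;
    }

    public static void main(String[] args) {
        LocatedBlockProto[] in = { new LocatedBlockProto("blk_1", new TokenProto("t1")) };
        LocatedBlock[] out = convert(in);
        System.out.println(out[0].block + " " + out[0].tokenId); // blk_1 t1
    }
}
```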
[jira] [Commented] (HDFS-4353) Encapsulate connections to peers in Peer and PeerServer classes
[ https://issues.apache.org/jira/browse/HDFS-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544568#comment-13544568 ]

Hadoop QA commented on HDFS-4353:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12563386/02e.patch
against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 8 new or modified test files.
    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
    {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.
    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3740//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3740//console

This message is automatically generated.

> Encapsulate connections to peers in Peer and PeerServer classes
> ---------------------------------------------------------------
>
>                 Key: HDFS-4353
>                 URL: https://issues.apache.org/jira/browse/HDFS-4353
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, hdfs-client
>    Affects Versions: 2.0.3-alpha
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: 02b-cumulative.patch, 02c.patch, 02c.patch, 02-cumulative.patch, 02d.patch, 02e.patch
>
> Encapsulate connections to peers into the {{Peer}} and {{PeerServer}} classes. Since many Java classes may be involved with these connections, it makes sense to create a container for them. For example, a connection to a peer may have an input stream, output stream, ReadableByteChannel, encrypted output stream, and encrypted input stream associated with it.
> This makes us less dependent on the {{NetUtils}} methods, which use {{instanceof}} to manipulate socket and stream states based on the runtime type. It also paves the way to introduce UNIX domain sockets, which don't inherit from {{java.net.Socket}}.
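The encapsulation idea can be illustrated with a tiny sketch. This is not the actual HDFS-4353 interface (the real Peer carries channels, encrypted streams, timeouts, and more); it is a hypothetical reduction showing why bundling a connection's streams behind one interface removes the need for NetUtils-style instanceof checks and lets non-Socket transports (such as UNIX domain sockets) plug in.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class PeerSketch {
    // Hypothetical, minimal Peer: one connection, one container for its
    // streams. Callers program against this interface, not the socket type.
    interface Peer extends Closeable {
        InputStream getInputStream() throws IOException;
        OutputStream getOutputStream() throws IOException;
    }

    // One possible implementation backed by in-memory streams. A real
    // socket-based peer would wrap java.net.Socket, and a domain-socket
    // peer could implement the same interface without extending Socket.
    static class InMemoryPeer implements Peer {
        private final ByteArrayOutputStream out = new ByteArrayOutputStream();
        private final InputStream in;
        InMemoryPeer(byte[] received) { this.in = new ByteArrayInputStream(received); }
        public InputStream getInputStream() { return in; }
        public OutputStream getOutputStream() { return out; }
        public void close() {}
    }

    public static void main(String[] args) throws IOException {
        Peer peer = new InMemoryPeer("hello".getBytes());
        byte[] buf = new byte[5];
        peer.getInputStream().read(buf);
        System.out.println(new String(buf)); // hello
    }
}
```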
[jira] [Updated] (HDFS-4362) GetDelegationTokenResponseProto does not handle null token
[ https://issues.apache.org/jira/browse/HDFS-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated HDFS-4362:
----------------------------------
    Status: Patch Available  (was: Open)

> GetDelegationTokenResponseProto does not handle null token
> ----------------------------------------------------------
>
>                 Key: HDFS-4362
>                 URL: https://issues.apache.org/jira/browse/HDFS-4362
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.2-alpha
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>            Priority: Critical
>         Attachments: HDFS-4362.patch
>
> While working on HADOOP-9173, I noticed that GetDelegationTokenResponseProto declares the token field as required. However, a null token return is to be expected, both as defined in FileSystem#getDelegationToken() and based on the HDFS implementation. This jira intends to make the field optional.
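With the field declared optional, the server side can simply leave the token unset instead of failing when FileSystem#getDelegationToken returns null. The class below is a hypothetical mirror of a generated protobuf message, not the real GetDelegationTokenResponseProto; it only shows the has/not-set pattern the change enables.

```java
public class DelegationTokenResponse {
    // Hypothetical mirror of a generated protobuf message with an
    // optional field: a null token simply means "field not set".
    static class ResponseProto {
        private final String token;
        ResponseProto(String token) { this.token = token; }
        boolean hasToken() { return token != null; }
        String getToken() { return token; }
    }

    // Build a response that tolerates a null token, as the
    // FileSystem#getDelegationToken contract allows.
    static ResponseProto buildResponse(String tokenOrNull) {
        return new ResponseProto(tokenOrNull);
    }

    public static void main(String[] args) {
        System.out.println(buildResponse("abc").hasToken()); // true
        System.out.println(buildResponse(null).hasToken());  // false
    }
}
```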
[jira] [Updated] (HDFS-4362) GetDelegationTokenResponseProto does not handle null token
[ https://issues.apache.org/jira/browse/HDFS-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated HDFS-4362:
----------------------------------
    Attachment: HDFS-4362.patch
[jira] [Assigned] (HDFS-4362) GetDelegationTokenResponseProto does not handle null token
[ https://issues.apache.org/jira/browse/HDFS-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas reassigned HDFS-4362:
-------------------------------------
    Assignee: Suresh Srinivas
[jira] [Commented] (HDFS-4295) Using port 1023 should be valid when starting Secure DataNode
[ https://issues.apache.org/jira/browse/HDFS-4295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544564#comment-13544564 ]

liuyang commented on HDFS-4295:
-------------------------------

Thanks Aaron for this workaround.
1. call the SecureDataNodeStarter.init() method while running as root;
2. then call the SecureDataNodeStarter.start() method while running as hdfs.
How do I execute the script for starting the datanode?

> Using port 1023 should be valid when starting Secure DataNode
> -------------------------------------------------------------
>
>                 Key: HDFS-4295
>                 URL: https://issues.apache.org/jira/browse/HDFS-4295
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Stephen Chu
>            Assignee: Stephen Chu
>              Labels: trivial
>             Fix For: 3.0.0, 2.0.3-alpha
>
>         Attachments: HDFS-4295.patch
>
> In SecureDataNodeStarter:
> {code}
> if ((ss.getLocalPort() >= 1023 || listener.getPort() >= 1023) &&
>     UserGroupInformation.isSecurityEnabled()) {
>   throw new RuntimeException("Cannot start secure datanode with unprivileged ports");
> }
> {code}
> This prohibits using port 1023, but this should be okay because only root can listen to ports below 1024.
> We can change the >= to >.
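The off-by-one in the check above can be isolated from the surrounding DataNode code. This sketch reduces the condition to its arithmetic (the real check also consults UserGroupInformation.isSecurityEnabled()): ports 1 through 1023 are privileged, so 1023 itself should be accepted, which means the rejection test must be `> 1023`, not `>= 1023`.

```java
public class PrivilegedPortCheck {
    // After the HDFS-4295 fix: a port is "unprivileged" (and thus
    // rejected for a secure DataNode) only when it is strictly above
    // 1023, since only root can bind ports below 1024.
    static boolean isUnprivileged(int port) {
        return port > 1023;
    }

    public static void main(String[] args) {
        System.out.println(isUnprivileged(1023)); // false -> 1023 is accepted
        System.out.println(isUnprivileged(1024)); // true  -> would be rejected
    }
}
```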
[jira] [Updated] (HDFS-4362) GetDelegationTokenResponseProto does not handle null token
[ https://issues.apache.org/jira/browse/HDFS-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated HDFS-4362:
----------------------------------
    Affects Version/s: 2.0.2-alpha
[jira] [Updated] (HDFS-4362) GetDelegationTokenResponseProto does not handle null token
[ https://issues.apache.org/jira/browse/HDFS-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated HDFS-4362:
----------------------------------
    Hadoop Flags: Incompatible change
[jira] [Moved] (HDFS-4362) GetDelegationTokenResponseProto does not handle null token
[ https://issues.apache.org/jira/browse/HDFS-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas moved YARN-317 to HDFS-4362:
--------------------------------------------
    Issue Type: Bug  (was: Improvement)
           Key: HDFS-4362  (was: YARN-317)
       Project: Hadoop HDFS  (was: Hadoop YARN)
[jira] [Commented] (HDFS-4306) PBHelper.convertLocatedBlock miss convert BlockToken
[ https://issues.apache.org/jira/browse/HDFS-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544548#comment-13544548 ]

Aaron T. Myers commented on HDFS-4306:
--------------------------------------

The latest patch looks good to me. +1 pending Jenkins.
[jira] [Updated] (HDFS-4306) PBHelper.convertLocatedBlock miss convert BlockToken
[ https://issues.apache.org/jira/browse/HDFS-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Binglin Chang updated HDFS-4306:
--------------------------------
    Attachment: HDFS-4306.v4.patch

Thanks for noticing this. Corrected the bug mentioned in the last comment.
[jira] [Commented] (HDFS-4333) Using right default value for creating files in HDFS
[ https://issues.apache.org/jira/browse/HDFS-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544540#comment-13544540 ]

Binglin Chang commented on HDFS-4333:
-------------------------------------

bq. I'm re-classifying this as an improvement, since it seems to be purely a style improvement.

This makes sense. I will submit a patch after HADOOP-9155 is done.

> Using right default value for creating files in HDFS
> ----------------------------------------------------
>
>                 Key: HDFS-4333
>                 URL: https://issues.apache.org/jira/browse/HDFS-4333
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.0.2-alpha
>            Reporter: Binglin Chang
>            Assignee: Binglin Chang
>            Priority: Minor
>
> The default permission for creating a file should be 0666 rather than 0777. HADOOP-9155 adds a default permission for files and changes localfilesystem.create to use this default value; this jira makes the similar change for HDFS.
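The gist of the 0666-vs-0777 distinction can be shown with a few lines of octal arithmetic. This is a hypothetical sketch, not the Hadoop FsPermission API: 0777 (rwxrwxrwx) is a sensible default only for directories, because the execute bits on a plain file serve no purpose; files should start from 0666 (rw-rw-rw-) before the umask is applied.

```java
public class DefaultFilePermission {
    // Hypothetical defaults mirroring the proposed change: directories
    // keep 0777, files drop the meaningless execute bits and use 0666.
    static final int DIR_DEFAULT  = 0777;
    static final int FILE_DEFAULT = 0666;

    // Creation permission = default with the umask bits cleared,
    // the same rule POSIX filesystems apply.
    static int applyUMask(int defaultPerm, int umask) {
        return defaultPerm & ~umask;
    }

    public static void main(String[] args) {
        // With the common 022 umask, files become 0644 instead of 0755.
        System.out.printf("%o%n", applyUMask(FILE_DEFAULT, 0022)); // 644
        System.out.printf("%o%n", applyUMask(DIR_DEFAULT, 0022));  // 755
    }
}
```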
[jira] [Commented] (HDFS-4295) Using port 1023 should be valid when starting Secure DataNode
[ https://issues.apache.org/jira/browse/HDFS-4295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544539#comment-13544539 ]

Aaron T. Myers commented on HDFS-4295:
--------------------------------------

You need to be root in order to bind to low ports; jsvc doesn't have anything to do with that. The DN uses jsvc so that it can start as root, bind to the low port, and then switch users to hdfs for the rest of its run. So, you need to start the DN as root when enabling security and running with jsvc - no way around that.
[jira] [Commented] (HDFS-4295) Using port 1023 should be valid when starting Secure DataNode
[ https://issues.apache.org/jira/browse/HDFS-4295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544538#comment-13544538 ]

liuyang commented on HDFS-4295:
-------------------------------

The jsvc program is used to start the DataNode listening on low port numbers, but the DataNode cannot be started while running as a non-root user. The exception is as follows:

{noformat}
Initializing secure datanode resources
java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:164)
Caused by: java.net.SocketException: Permission denied
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.init(SecureDataNodeStarter.java:76)
        ... 5 more
Cannot load daemon
{noformat}

Anything I missed?
[jira] [Commented] (HDFS-4353) Encapsulate connections to peers in Peer and PeerServer classes
[ https://issues.apache.org/jira/browse/HDFS-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544511#comment-13544511 ] Hadoop QA commented on HDFS-4353: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12563377/02d.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 8 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3738//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3738//console This message is automatically generated. > Encapsulate connections to peers in Peer and PeerServer classes > --- > > Key: HDFS-4353 > URL: https://issues.apache.org/jira/browse/HDFS-4353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, hdfs-client >Affects Versions: 2.0.3-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: 02b-cumulative.patch, 02c.patch, 02c.patch, > 02-cumulative.patch, 02d.patch, 02e.patch > > > Encapsulate connections to peers into the {{Peer}} and {{PeerServer}} > classes. 
Since many Java classes may be involved with these connections, it > makes sense to create a container for them. For example, a connection to a > peer may have an input stream, output stream, readablebytechannel, encrypted > output stream, and encrypted input stream associated with it. > This makes us less dependent on the {{NetUtils}} methods which use > {{instanceof}} to manipulate socket and stream states based on the runtime > type. it also paves the way to introduce UNIX domain sockets which don't > inherit from {{java.net.Socket}}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
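The container idea described in the issue — one object owning all the streams of a connection, so callers stop doing instanceof checks on the runtime type — can be sketched as a minimal interface. The names below (`Peer`, `InMemoryPeer`, `firstByte`) are illustrative assumptions, not the API committed in the patch:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class PeerSketch {
    // A Peer bundles everything associated with one connection (input stream,
    // output stream, possibly encrypted wrappers), so callers need no
    // NetUtils-style instanceof special-casing of the concrete socket type.
    interface Peer {
        InputStream getInputStream();
        OutputStream getOutputStream();
        void close() throws IOException;
    }

    // In-memory stand-in for a TCP- or UNIX-domain-socket-backed Peer;
    // a real implementation would wrap the appropriate channel.
    static class InMemoryPeer implements Peer {
        private final ByteArrayInputStream in;
        private final ByteArrayOutputStream out = new ByteArrayOutputStream();

        InMemoryPeer(byte[] received) { in = new ByteArrayInputStream(received); }
        public InputStream getInputStream() { return in; }
        public OutputStream getOutputStream() { return out; }
        public void close() {}
    }

    // Reads one byte through the interface alone, showing that the caller
    // never needs to know the concrete connection type.
    static int firstByte(Peer p) {
        try { return p.getInputStream().read(); }
        catch (IOException e) { return -1; }
    }

    public static void main(String[] args) {
        Peer p = new InMemoryPeer(new byte[]{42});
        if (firstByte(p) != 42) throw new AssertionError();
    }
}
```

A UNIX-domain-socket implementation would simply be another `Peer` subclass, which is exactly why the abstraction removes the dependence on `java.net.Socket` inheritance.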
[jira] [Updated] (HDFS-4353) Encapsulate connections to peers in Peer and PeerServer classes
[ https://issues.apache.org/jira/browse/HDFS-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-4353: --- Attachment: 02e.patch remove some parameters that aren't necessary now that we aren't tracking Peers in PeerServer. > Encapsulate connections to peers in Peer and PeerServer classes > --- > > Key: HDFS-4353 > URL: https://issues.apache.org/jira/browse/HDFS-4353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, hdfs-client >Affects Versions: 2.0.3-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: 02b-cumulative.patch, 02c.patch, 02c.patch, > 02-cumulative.patch, 02d.patch, 02e.patch > > > Encapsulate connections to peers into the {{Peer}} and {{PeerServer}} > classes. Since many Java classes may be involved with these connections, it > makes sense to create a container for them. For example, a connection to a > peer may have an input stream, output stream, readablebytechannel, encrypted > output stream, and encrypted input stream associated with it. > This makes us less dependent on the {{NetUtils}} methods which use > {{instanceof}} to manipulate socket and stream states based on the runtime > type. it also paves the way to introduce UNIX domain sockets which don't > inherit from {{java.net.Socket}}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4304) Make FSEditLogOp.MAX_OP_SIZE configurable
[ https://issues.apache.org/jira/browse/HDFS-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544489#comment-13544489 ] Colin Patrick McCabe commented on HDFS-4304: Last comment should read "the JN that is actually going to be writing the bytes to disk." > Make FSEditLogOp.MAX_OP_SIZE configurable > - > > Key: HDFS-4304 > URL: https://issues.apache.org/jira/browse/HDFS-4304 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.0.0, 2.0.3-alpha >Reporter: Todd Lipcon >Assignee: Colin Patrick McCabe > Attachments: HDFS-4304.001.patch, HDFS-4304.002.patch, > HDFS-4304.003.patch, HDFS-4304.004.patch, HDFS-4304.005.patch > > > Today we ran into an issue where a NN had logged a very large op, greater > than the 1.5MB MAX_OP_SIZE constant. In order to successfully load the edits, > we had to patch with a larger constant. This constant should be configurable > so that we wouldn't have to recompile in these odd cases. Additionally, I > think the default should be bumped a bit higher, since it's only a safeguard > against OOME, and people tend to run NNs with multi-GB heaps. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-3371) EditLogFileInputStream: be more careful about closing streams when we're done with them.
[ https://issues.apache.org/jira/browse/HDFS-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe resolved HDFS-3371. Resolution: Won't Fix I guess there's no point in doing this one as the outer stream will close all the inner ones. > EditLogFileInputStream: be more careful about closing streams when we're done > with them. > > > Key: HDFS-3371 > URL: https://issues.apache.org/jira/browse/HDFS-3371 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe >Priority: Minor > Attachments: HDFS-3371.001.patch, HDFS-3371.002.patch > > > EditLogFileInputStream#EditLogFileInputStream should be more careful about > closing streams when there is an exception thrown. Also, > EditLogFileInputStream#close should close all of the streams we opened in the > constructor, not just one of them (although the file-backed one is probably > the most important). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4304) Make FSEditLogOp.MAX_OP_SIZE configurable
[ https://issues.apache.org/jira/browse/HDFS-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-4304: --- Attachment: HDFS-4304.005.patch This version of the patch makes MAX_OP_SIZE configurable in production and not just in recovery mode. I didn't implement the warning when writing an over-long opcode. It would be really tricky to do this "right"-- for example, if you're using QJM, the maximum op size on your local NameNode may not be the same as on the NN that is actually going to be writing the bytes to disk. I think that would get messy. This is just a minimal change to make something which wasn't configurable before, configurable. > Make FSEditLogOp.MAX_OP_SIZE configurable > - > > Key: HDFS-4304 > URL: https://issues.apache.org/jira/browse/HDFS-4304 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.0.0, 2.0.3-alpha >Reporter: Todd Lipcon >Assignee: Colin Patrick McCabe > Attachments: HDFS-4304.001.patch, HDFS-4304.002.patch, > HDFS-4304.003.patch, HDFS-4304.004.patch, HDFS-4304.005.patch > > > Today we ran into an issue where a NN had logged a very large op, greater > than the 1.5MB MAX_OP_SIZE constant. In order to successfully load the edits, > we had to patch with a larger constant. This constant should be configurable > so that we wouldn't have to recompile in these odd cases. Additionally, I > think the default should be bumped a bit higher, since it's only a safeguard > against OOME, and people tend to run NNs with multi-GB heaps. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
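The shape of the change — read a size limit from configuration with a fallback default, then use it as a guard against over-long ops — can be sketched as below. The key name and default are hypothetical stand-ins (the actual key and default chosen in HDFS-4304 may differ), and a plain map stands in for Hadoop's Configuration:

```java
import java.util.Map;

public class OpSizeConfigSketch {
    // Hypothetical key and default for illustration only.
    static final String MAX_OP_SIZE_KEY = "dfs.namenode.max.op.size";
    static final int MAX_OP_SIZE_DEFAULT = 50 * 1024 * 1024; // bytes

    // Stand-in for Configuration#getInt: fall back to the default when the
    // key is unset, so existing deployments keep their current behavior.
    static int getMaxOpSize(Map<String, Integer> conf) {
        return conf.getOrDefault(MAX_OP_SIZE_KEY, MAX_OP_SIZE_DEFAULT);
    }

    // Guard applied while reading an edit-log op: reject an over-long length
    // field so corruption cannot trigger a huge allocation (the OOME
    // safeguard the constant was originally for).
    static boolean opFits(int opLength, int maxOpSize) {
        return opLength >= 0 && opLength <= maxOpSize;
    }
}
```

Making the limit a config value rather than a compile-time constant is what lets an operator recover from an over-long op without rebuilding the NameNode.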
[jira] [Commented] (HDFS-4352) Encapsulate arguments to BlockReaderFactory in a class
[ https://issues.apache.org/jira/browse/HDFS-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544347#comment-13544347 ] Colin Patrick McCabe commented on HDFS-4352: oops... hit enter too early. To continue, I added these improvements in the patch for HDFS-4353. Let me know if this addresses your post-commit comment. > Encapsulate arguments to BlockReaderFactory in a class > -- > > Key: HDFS-4352 > URL: https://issues.apache.org/jira/browse/HDFS-4352 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Affects Versions: 2.0.3-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 3.0.0 > > Attachments: 01b.patch, 01.patch > > > Encapsulate the arguments to BlockReaderFactory in a class to avoid having to > pass around 10+ arguments to a few different functions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4352) Encapsulate arguments to BlockReaderFactory in a class
[ https://issues.apache.org/jira/browse/HDFS-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544346#comment-13544346 ] Colin Patrick McCabe commented on HDFS-4352: Hi Nicholas, I added asserts to let you know if you did not set a mandatory parameter, and also JavaDoc documentation > Encapsulate arguments to BlockReaderFactory in a class > -- > > Key: HDFS-4352 > URL: https://issues.apache.org/jira/browse/HDFS-4352 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Affects Versions: 2.0.3-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 3.0.0 > > Attachments: 01b.patch, 01.patch > > > Encapsulate the arguments to BlockReaderFactory in a class to avoid having to > pass around 10+ arguments to a few different functions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4353) Encapsulate connections to peers in Peer and PeerServer classes
[ https://issues.apache.org/jira/browse/HDFS-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-4353: --- Attachment: 02d.patch This version addresses Todd's comment about removing tracking of {{Peer}} objects from {{PeerServer}} (since we might as well continue doing it in {{DataXceiver}}). It also addresses some comments Nicholas made in HDFS-4352 about making it clear which parameters to {{BlockReaderFactory#newBlockReader}} are mandatory or not. The function now {{asserts}} if a mandatory parameter is not set. Also added a bunch of documentation about what the parameters do. > Encapsulate connections to peers in Peer and PeerServer classes > --- > > Key: HDFS-4353 > URL: https://issues.apache.org/jira/browse/HDFS-4353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, hdfs-client >Affects Versions: 2.0.3-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: 02b-cumulative.patch, 02c.patch, 02c.patch, > 02-cumulative.patch, 02d.patch > > > Encapsulate connections to peers into the {{Peer}} and {{PeerServer}} > classes. Since many Java classes may be involved with these connections, it > makes sense to create a container for them. For example, a connection to a > peer may have an input stream, output stream, readablebytechannel, encrypted > output stream, and encrypted input stream associated with it. > This makes us less dependent on the {{NetUtils}} methods which use > {{instanceof}} to manipulate socket and stream states based on the runtime > type. it also paves the way to introduce UNIX domain sockets which don't > inherit from {{java.net.Socket}}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4353) Encapsulate connections to peers in Peer and PeerServer classes
[ https://issues.apache.org/jira/browse/HDFS-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544292#comment-13544292 ] Hadoop QA commented on HDFS-4353: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12563368/02c.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 8 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3737//console This message is automatically generated. > Encapsulate connections to peers in Peer and PeerServer classes > --- > > Key: HDFS-4353 > URL: https://issues.apache.org/jira/browse/HDFS-4353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, hdfs-client >Affects Versions: 2.0.3-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: 02b-cumulative.patch, 02c.patch, 02c.patch, > 02-cumulative.patch > > > Encapsulate connections to peers into the {{Peer}} and {{PeerServer}} > classes. Since many Java classes may be involved with these connections, it > makes sense to create a container for them. For example, a connection to a > peer may have an input stream, output stream, readablebytechannel, encrypted > output stream, and encrypted input stream associated with it. > This makes us less dependent on the {{NetUtils}} methods which use > {{instanceof}} to manipulate socket and stream states based on the runtime > type. it also paves the way to introduce UNIX domain sockets which don't > inherit from {{java.net.Socket}}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4353) Encapsulate connections to peers in Peer and PeerServer classes
[ https://issues.apache.org/jira/browse/HDFS-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-4353: --- Attachment: 02c.patch resubmitting because the previous build failed for an unrelated reason. The error message was: {code} [exec] /usr/bin/ld: cannot find -lstdc++ {code} which is not a problem with this patch > Encapsulate connections to peers in Peer and PeerServer classes > --- > > Key: HDFS-4353 > URL: https://issues.apache.org/jira/browse/HDFS-4353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, hdfs-client >Affects Versions: 2.0.3-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: 02b-cumulative.patch, 02c.patch, 02c.patch, > 02-cumulative.patch > > > Encapsulate connections to peers into the {{Peer}} and {{PeerServer}} > classes. Since many Java classes may be involved with these connections, it > makes sense to create a container for them. For example, a connection to a > peer may have an input stream, output stream, readablebytechannel, encrypted > output stream, and encrypted input stream associated with it. > This makes us less dependent on the {{NetUtils}} methods which use > {{instanceof}} to manipulate socket and stream states based on the runtime > type. it also paves the way to introduce UNIX domain sockets which don't > inherit from {{java.net.Socket}}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3970) BlockPoolSliceStorage#doRollback(..) should use BlockPoolSliceStorage instead of DataStorage to read prev version file.
[ https://issues.apache.org/jira/browse/HDFS-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544267#comment-13544267 ] Todd Lipcon commented on HDFS-3970: --- thanks for the test, Andrew. +1 pending Jenkins > BlockPoolSliceStorage#doRollback(..) should use BlockPoolSliceStorage instead > of DataStorage to read prev version file. > --- > > Key: HDFS-3970 > URL: https://issues.apache.org/jira/browse/HDFS-3970 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 3.0.0, 2.0.2-alpha >Reporter: Vinay >Assignee: Vinay > Attachments: hdfs-3970-1.patch, HDFS-3970.patch > > > {code}// read attributes out of the VERSION file of previous directory > DataStorage prevInfo = new DataStorage(); > prevInfo.readPreviousVersionProperties(bpSd);{code} > In the above code snippet BlockPoolSliceStorage instance should be used. > other wise rollback results in 'storageType' property missing which will not > be there in initial VERSION file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4230) Listing all the current snapshottable directories
[ https://issues.apache.org/jira/browse/HDFS-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-4230: Attachment: HDFS-4230.004.patch New patch uploaded which puts HdfsFileStatus as a field of SnapshottableDirectoryStatus to address Nicholas's comments. > Listing all the current snapshottable directories > - > > Key: HDFS-4230 > URL: https://issues.apache.org/jira/browse/HDFS-4230 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode >Reporter: Jing Zhao >Assignee: Jing Zhao > Attachments: HDFS-4230.001.patch, HDFS-4230.001.patch, > HDFS-4230.002.patch, HDFS-4230.003.patch, HDFS-4230.004.patch > > > Provide functionality to provide user with metadata about all the > snapshottable directories. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4353) Encapsulate connections to peers in Peer and PeerServer classes
[ https://issues.apache.org/jira/browse/HDFS-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544189#comment-13544189 ] Hadoop QA commented on HDFS-4353: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12563247/02c.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 8 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3734//console This message is automatically generated. > Encapsulate connections to peers in Peer and PeerServer classes > --- > > Key: HDFS-4353 > URL: https://issues.apache.org/jira/browse/HDFS-4353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, hdfs-client >Affects Versions: 2.0.3-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: 02b-cumulative.patch, 02c.patch, 02-cumulative.patch > > > Encapsulate connections to peers into the {{Peer}} and {{PeerServer}} > classes. Since many Java classes may be involved with these connections, it > makes sense to create a container for them. For example, a connection to a > peer may have an input stream, output stream, readablebytechannel, encrypted > output stream, and encrypted input stream associated with it. > This makes us less dependent on the {{NetUtils}} methods which use > {{instanceof}} to manipulate socket and stream states based on the runtime > type. it also paves the way to introduce UNIX domain sockets which don't > inherit from {{java.net.Socket}}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4253) block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance
[ https://issues.apache.org/jira/browse/HDFS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544178#comment-13544178 ] Hadoop QA commented on HDFS-4253: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12563303/hdfs4253-5.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3733//console This message is automatically generated. > block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance > - > > Key: HDFS-4253 > URL: https://issues.apache.org/jira/browse/HDFS-4253 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0, 2.0.2-alpha >Reporter: Andy Isaacson >Assignee: Andy Isaacson > Attachments: hdfs4253-1.txt, hdfs4253-2.txt, hdfs4253-3.txt, > hdfs4253-4.txt, hdfs4253-5.txt, hdfs4253.txt > > > When many nodes (10) read from the same block simultaneously, we get > asymmetric distribution of read load. This can result in slow block reads > when one replica is serving most of the readers and the other replicas are > idle. The busy DN bottlenecks on its network link. > This is especially visible with large block sizes and high replica counts (I > reproduced the problem with {{-Ddfs.block.size=4294967296}} and replication > 5), but the same behavior happens on a small scale with normal-sized blocks > and replication=3. > The root of the problem is in {{NetworkTopology#pseudoSortByDistance}} which > explicitly does not try to spread traffic among replicas in a given rack -- > it only randomizes usage for off-rack replicas. -- This message is automatically generated by JIRA. 
[jira] [Updated] (HDFS-4350) Make enabling of stale marking on read and write paths independent
[ https://issues.apache.org/jira/browse/HDFS-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-4350: -- Attachment: hdfs-4350-1.patch > Make enabling of stale marking on read and write paths independent > -- > > Key: HDFS-4350 > URL: https://issues.apache.org/jira/browse/HDFS-4350 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Andrew Wang >Assignee: Andrew Wang > Attachments: hdfs-4350-1.patch > > > Marking of datanodes as stale for the read and write path was introduced in > HDFS-3703 and HDFS-3912 respectively. This is enabled using two new keys, > {{DFS_NAMENODE_CHECK_STALE_DATANODE_KEY}} and > {{DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_WRITE_KEY}}. However, there currently > exists a dependency, since you cannot enable write marking without also > enabling read marking, since the first key enables both checking of staleness > and read marking. > I propose renaming the first key to > {{DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_READ_KEY}}, and make checking enabled > if either of the keys are set. This will allow read and write marking to be > enabled independently. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
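The proposed decoupling — staleness checking enabled whenever either avoidance key is set — reduces to a one-line predicate. The sketch below simplifies the two configuration keys to booleans and is not the actual NameNode code:

```java
public class StaleConfigSketch {
    // Proposed semantics: checking of datanode staleness is on if either the
    // read-path or the write-path avoidance setting is enabled, so the two
    // paths can be toggled independently.
    static boolean checkStaleness(boolean avoidStaleForRead,
                                  boolean avoidStaleForWrite) {
        return avoidStaleForRead || avoidStaleForWrite;
    }
}
```

Under the old keys, write-path marking implied read-path marking; with this predicate, enabling only `DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_WRITE_KEY` still turns checking on without forcing read marking.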
[jira] [Updated] (HDFS-4350) Make enabling of stale marking on read and write paths independent
[ https://issues.apache.org/jira/browse/HDFS-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-4350: -- Attachment: (was: hdfs-4350-1.patch) > Make enabling of stale marking on read and write paths independent > -- > > Key: HDFS-4350 > URL: https://issues.apache.org/jira/browse/HDFS-4350 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Andrew Wang >Assignee: Andrew Wang > Attachments: hdfs-4350-1.patch > > > Marking of datanodes as stale for the read and write path was introduced in > HDFS-3703 and HDFS-3912 respectively. This is enabled using two new keys, > {{DFS_NAMENODE_CHECK_STALE_DATANODE_KEY}} and > {{DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_WRITE_KEY}}. However, there currently > exists a dependency, since you cannot enable write marking without also > enabling read marking, since the first key enables both checking of staleness > and read marking. > I propose renaming the first key to > {{DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_READ_KEY}}, and make checking enabled > if either of the keys are set. This will allow read and write marking to be > enabled independently. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4352) Encapsulate arguments to BlockReaderFactory in a class
[ https://issues.apache.org/jira/browse/HDFS-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544162#comment-13544162 ] Colin Patrick McCabe commented on HDFS-4352: bq. I may miss something. This seems making the code much more confusing: how does the caller determine which parameters to set before passing BlockReaderFactory.Params? For example, which methods require ioStreamPair and which methods do not? {{ioStreamPair}} is never required, but if it is present it will be used. I agree that this is confusing, but it was like that before-- you just couldn't see it because the parameter list was so long! Most callers passed null for this-- by not setting it, they get that automatically now. By the way, HDFS-4353 gets rid of some {{ioStreamPair}} and combines together the {{Socket}} and the {{ioStreamPair}} in a class called {{Peer}}. So hopefully this will make things less confusing by reducing the number of parameters that have to be passed. The general idea behind {{Params}} is that if you don't set a parameter, it just becomes some reasonable default. The only mandatory members are {{Socket}}, {{Block}}, and {{Conf}}. I suppose we could add those 3 parameters to the {{Params}} constructor if that seemed clearer. We also should document parameters more fully. > Encapsulate arguments to BlockReaderFactory in a class > -- > > Key: HDFS-4352 > URL: https://issues.apache.org/jira/browse/HDFS-4352 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Affects Versions: 2.0.3-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 3.0.0 > > Attachments: 01b.patch, 01.patch > > > Encapsulate the arguments to BlockReaderFactory in a class to avoid having to > pass around 10+ arguments to a few different functions. -- This message is automatically generated by JIRA. 
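The pattern discussed in this thread — a parameter object where optional fields get reasonable defaults and mandatory fields fail fast if never set — can be sketched as follows. The field names are hypothetical (the real `Params` carries a socket, block, configuration, and more), and a runtime check stands in for the `assert`s the patch adds:

```java
public class ParamsSketch {
    // Illustrative parameter object in the spirit of BlockReaderFactory.Params.
    static class Params {
        String block;                   // mandatory: no sensible default
        String conf;                    // mandatory: no sensible default
        int bufferSize = 4096;          // optional: defaulted if unset
        boolean verifyChecksum = true;  // optional: defaulted if unset

        Params setBlock(String b) { block = b; return this; }
        Params setConf(String c) { conf = c; return this; }
        Params setBufferSize(int n) { bufferSize = n; return this; }

        // Fails fast when a mandatory parameter was never set, rather than
        // letting a null propagate deep into the read path.
        Params validate() {
            if (block == null || conf == null) {
                throw new IllegalStateException("mandatory parameter not set");
            }
            return this;
        }
    }

    static boolean isValid(Params p) {
        try { p.validate(); return true; }
        catch (IllegalStateException e) { return false; }
    }
}
```

Callers that previously passed null for rarely-used arguments now simply skip the setter and inherit the default, which is the "reasonable default" behavior described above.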
[jira] [Commented] (HDFS-4244) Support deleting snapshots
[ https://issues.apache.org/jira/browse/HDFS-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544158#comment-13544158 ] Aaron T. Myers commented on HDFS-4244: -- bq. The new patch addressed your comments 1, 2, and 4. Thanks, Jing. The updated patch looks good in this regard. bq. I will file a separate jira to handle all the DFSAdmin command related issues. Cool. Please let me know when you do. The latest patch looks good from my perspective. > Support deleting snapshots > -- > > Key: HDFS-4244 > URL: https://issues.apache.org/jira/browse/HDFS-4244 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode >Reporter: Jing Zhao >Assignee: Jing Zhao > Attachments: HDFS-4244.001.patch, HDFS-4244.002.patch, > HDFS-4244.003.patch, HDFS-4244.004.patch > > > Provide functionality to delete a snapshot, given the name of the snapshot > and the path to the directory where the snapshot was taken. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4351) Fix BlockPlacementPolicyDefault#chooseTarget when avoiding stale nodes
[ https://issues.apache.org/jira/browse/HDFS-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544147#comment-13544147 ] Jing Zhao commented on HDFS-4351: - I also run test-patch for Andrew, and the result looks good: {noformat} -1 overall. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 225 new Findbugs (version 2.0.1) warnings. {noformat} > Fix BlockPlacementPolicyDefault#chooseTarget when avoiding stale nodes > -- > > Key: HDFS-4351 > URL: https://issues.apache.org/jira/browse/HDFS-4351 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 1.2.0, 3.0.0 >Reporter: Andrew Wang >Assignee: Andrew Wang > Attachments: hdfs-4351-2.patch, hdfs-4351-3.patch, hdfs-4351-4.patch, > hdfs-4351-branch-1-1.patch, hdfs-4351.patch > > > There's a bug in {{BlockPlacementPolicyDefault#chooseTarget}} with stale node > avoidance enabled (HDFS-3912). If a NotEnoughReplicasException is thrown in > the call to {{chooseRandom()}}, {{numOfReplicas}} is not updated together > with the partial result in {{result}} since it is pass by value. The retry > call to {{chooseTarget}} then uses this incorrect value. > This can be seen if you enable stale node detection for > {{TestReplicationPolicy#testChooseTargetWithMoreThanAvaiableNodes()}}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
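The root cause described in this issue is Java's pass-by-value semantics for primitives: decrementing an `int` parameter inside a method is invisible to the caller. A minimal illustration (not the actual `chooseTarget` code) of the buggy shape and the fix of returning the updated count:

```java
import java.util.ArrayList;
import java.util.List;

public class PassByValueSketch {
    // Buggy shape: numOfReplicas is a copy; the caller's count is unchanged
    // when this returns, so a retry reuses the stale, too-large value.
    static void chooseBuggy(int numOfReplicas, List<String> results) {
        while (numOfReplicas > 0 && results.size() < 2) { // pretend only 2 nodes available
            results.add("node" + results.size());
            numOfReplicas--; // this decrement is lost on return
        }
    }

    // Fixed shape: return the remaining count so the caller can carry the
    // partial progress into the retry.
    static int chooseFixed(int numOfReplicas, List<String> results) {
        while (numOfReplicas > 0 && results.size() < 2) {
            results.add("node" + results.size());
            numOfReplicas--;
        }
        return numOfReplicas; // replicas still needed after the partial result
    }

    public static void main(String[] args) {
        List<String> results = new ArrayList<>();
        int remaining = chooseFixed(3, results);
        if (remaining != 1 || results.size() != 2) throw new AssertionError();
    }
}
```

With the buggy version, a caller that retries with its original `numOfReplicas` asks for replicas it already has; returning the remainder (or passing a mutable holder) keeps caller and callee in sync.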
[jira] [Commented] (HDFS-4306) PBHelper.convertLocatedBlock miss convert BlockToken
[ https://issues.apache.org/jira/browse/HDFS-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544137#comment-13544137 ] Aaron T. Myers commented on HDFS-4306: -- Thanks for refactoring it, Binglin, but I think there's a little bug in your new compare method. Seems like this loop should be comparing the values from the two different arrays, {{ei}} and {{ai}}: {code} +for (int i = 0; i < ei.length ; i++) { + compare(ai[i], ai[i]); +} {code} Otherwise the patch looks good to me. > PBHelper.convertLocatedBlock miss convert BlockToken > > > Key: HDFS-4306 > URL: https://issues.apache.org/jira/browse/HDFS-4306 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.2-alpha >Reporter: Binglin Chang >Assignee: Binglin Chang > Attachments: HDFS-4306.patch, HDFS-4306.v2.patch, HDFS-4306.v3.patch > > > PBHelper.convertLocatedBlock(from protobuf array to primitive array) miss > convert BlockToken. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
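The defect pointed out in the review — both loop indices reading from the same array, so the check compares an element against itself and always passes — and its fix can be sketched like this (simplified to `int` arrays; the real method compares converted protobuf objects):

```java
public class CompareLoopSketch {
    // Buggy shape from the review: ai[i] is compared to ai[i], which is
    // trivially equal, so the "comparison" never catches a mismatch.
    static boolean compareBuggy(int[] ei, int[] ai) {
        for (int i = 0; i < ei.length; i++) {
            if (ai[i] != ai[i]) return false; // always false: self-comparison
        }
        return true;
    }

    // Fixed shape: compare expected against actual, element-wise.
    static boolean compareFixed(int[] ei, int[] ai) {
        if (ei.length != ai.length) return false;
        for (int i = 0; i < ei.length; i++) {
            if (ei[i] != ai[i]) return false;
        }
        return true;
    }
}
```

The buggy loop makes the test vacuous — it passes even when the two arrays disagree everywhere — which is why the reviewer flags it despite the patch otherwise looking good.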
[jira] [Commented] (HDFS-4256) Support for concatenation of files into a single file in branch-1
[ https://issues.apache.org/jira/browse/HDFS-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544131#comment-13544131 ] Suresh Srinivas commented on HDFS-4256: --- +1 for the patch. > Support for concatenation of files into a single file in branch-1 > - > > Key: HDFS-4256 > URL: https://issues.apache.org/jira/browse/HDFS-4256 > Project: Hadoop HDFS > Issue Type: Test > Components: namenode >Affects Versions: 1.2.0 >Reporter: Suresh Srinivas >Assignee: Sanjay Radia > Attachments: HDFS-4256-2.patch, HDFS-4256.patch > > > HDFS-222 added support concatenation of multiple files in a directory into a > single file. This helps several use cases where writes can be parallelized > and several folks have expressed in this functionality. > This jira intends to make changes equivalent from HDFS-222 into branch-1 to > be made available release 1.2.0. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4230) Listing all the current snapshottable directories
[ https://issues.apache.org/jira/browse/HDFS-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544092#comment-13544092 ] Jing Zhao commented on HDFS-4230: - bq. Sure, here or separate JIRA doesn't matter to me. Would you like to file it or shall I? I filed HDFS-4361 for this. > Listing all the current snapshottable directories > - > > Key: HDFS-4230 > URL: https://issues.apache.org/jira/browse/HDFS-4230 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode >Reporter: Jing Zhao >Assignee: Jing Zhao > Attachments: HDFS-4230.001.patch, HDFS-4230.001.patch, > HDFS-4230.002.patch, HDFS-4230.003.patch > > > Provide functionality to provide user with metadata about all the > snapshottable directories. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-4361) When listing snapshottable directories, only return those where the user has permission to take snapshots
Jing Zhao created HDFS-4361: --- Summary: When listing snapshottable directories, only return those where the user has permission to take snapshots Key: HDFS-4361 URL: https://issues.apache.org/jira/browse/HDFS-4361 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Jing Zhao -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4351) Fix BlockPlacementPolicyDefault#chooseTarget when avoiding stale nodes
[ https://issues.apache.org/jira/browse/HDFS-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544090#comment-13544090 ] Andrew Wang commented on HDFS-4351: --- Just wanted to note that I fixed my branch-1 env and the test passes. > Fix BlockPlacementPolicyDefault#chooseTarget when avoiding stale nodes > -- > > Key: HDFS-4351 > URL: https://issues.apache.org/jira/browse/HDFS-4351 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 1.2.0, 3.0.0 >Reporter: Andrew Wang >Assignee: Andrew Wang > Attachments: hdfs-4351-2.patch, hdfs-4351-3.patch, hdfs-4351-4.patch, > hdfs-4351-branch-1-1.patch, hdfs-4351.patch > > > There's a bug in {{BlockPlacementPolicyDefault#chooseTarget}} with stale node > avoidance enabled (HDFS-3912). If a NotEnoughReplicasException is thrown in > the call to {{chooseRandom()}}, {{numOfReplicas}} is not updated together > with the partial result in {{result}} since it is pass by value. The retry > call to {{chooseTarget}} then uses this incorrect value. > This can be seen if you enable stale node detection for > {{TestReplicationPolicy#testChooseTargetWithMoreThanAvaiableNodes()}}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
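The root cause described in HDFS-4351 — `numOfReplicas` not surviving a partial `chooseRandom()` because Java passes `int` by value — can be illustrated with a simplified sketch. This is not the actual BlockPlacementPolicyDefault code; the list stands in for `results`, and the fixed variant simply returns the remaining count so the retry path sees the right number:

```java
import java.util.List;

// Simplified illustration of the pass-by-value pitfall: a callee's updates
// to an int parameter are invisible to the caller.
public class PassByValueDemo {
    // Buggy shape: the caller's numOfReplicas is unchanged after this
    // returns, even though some replicas were already placed in results.
    static void choosePartial(int numOfReplicas, List<String> results) {
        while (numOfReplicas > 0 && results.size() < 2) { // pretend only 2 nodes fit
            results.add("node" + results.size());
            numOfReplicas--;                              // lost on return
        }
    }

    // Fixed shape: report how many replicas are still needed, so a retry
    // (the second chooseTarget call in the bug report) asks for the right
    // number instead of the stale original.
    static int choosePartialFixed(int numOfReplicas, List<String> results) {
        while (numOfReplicas > 0 && results.size() < 2) {
            results.add("node" + results.size());
            numOfReplicas--;
        }
        return numOfReplicas;
    }
}
```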
[jira] [Commented] (HDFS-4230) Listing all the current snapshottable directories
[ https://issues.apache.org/jira/browse/HDFS-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544083#comment-13544083 ] Aaron T. Myers commented on HDFS-4230: -- bq. It may be even better to list only the snapshottable directories that the user has permission to take snapshots. That sounds fine to me as well. bq. Let's think about it carefully and work on it separately. Sure, here or separate JIRA doesn't matter to me. Would you like to file it or shall I? > Listing all the current snapshottable directories > - > > Key: HDFS-4230 > URL: https://issues.apache.org/jira/browse/HDFS-4230 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode >Reporter: Jing Zhao >Assignee: Jing Zhao > Attachments: HDFS-4230.001.patch, HDFS-4230.001.patch, > HDFS-4230.002.patch, HDFS-4230.003.patch > > > Provide functionality to provide user with metadata about all the > snapshottable directories. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3970) BlockPoolSliceStorage#doRollback(..) should use BlockPoolSliceStorage instead of DataStorage to read prev version file.
[ https://issues.apache.org/jira/browse/HDFS-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-3970: -- Attachment: hdfs-3970-1.patch I added a test to Vinay's patch which attempts to rollback a block pool. Tested by running the test before and after the fix to confirm. > BlockPoolSliceStorage#doRollback(..) should use BlockPoolSliceStorage instead > of DataStorage to read prev version file. > --- > > Key: HDFS-3970 > URL: https://issues.apache.org/jira/browse/HDFS-3970 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 3.0.0, 2.0.2-alpha >Reporter: Vinay >Assignee: Vinay > Attachments: hdfs-3970-1.patch, HDFS-3970.patch > > > {code}// read attributes out of the VERSION file of previous directory > DataStorage prevInfo = new DataStorage(); > prevInfo.readPreviousVersionProperties(bpSd);{code} > In the above code snippet BlockPoolSliceStorage instance should be used. > other wise rollback results in 'storageType' property missing which will not > be there in initial VERSION file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
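The class-substitution bug in the quoted snippet can be modeled in miniature: the base reader (DataStorage in the report) demands a property that a block-pool VERSION file does not carry, so rollback must go through the subclass reader (BlockPoolSliceStorage). The class and property names below are illustrative stand-ins, not Hadoop's actual storage code:

```java
import java.util.Properties;

// Toy model of HDFS-3970: reading a block-pool VERSION file with the base
// storage class fails because it requires a property the pool file lacks.
public class VersionReaderDemo {
    static class BaseStorage {
        void readProperties(Properties p) {
            require(p, "layoutVersion");
            require(p, "storageType"); // absent in a block-pool VERSION file
        }
        static void require(Properties p, String key) {
            if (p.getProperty(key) == null) {
                throw new IllegalStateException("missing " + key);
            }
        }
    }

    static class BlockPoolStorage extends BaseStorage {
        @Override
        void readProperties(Properties p) {
            require(p, "layoutVersion");
            require(p, "blockPoolID"); // pool files carry this instead
        }
    }

    // The rollback path should be handed the subclass reader, mirroring the
    // fix of constructing BlockPoolSliceStorage instead of DataStorage.
    static boolean rollbackReads(BaseStorage reader, Properties prevVersion) {
        try {
            reader.readProperties(prevVersion);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```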
[jira] [Commented] (HDFS-4278) The DFS_BLOCK_ACCESS_TOKEN_ENABLE config should be automatically turned on when security is enabled.
[ https://issues.apache.org/jira/browse/HDFS-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544062#comment-13544062 ] Eli Collins commented on HDFS-4278: --- Or we should ERROR loudly if security is enabled and this config is not set if that's easier. > The DFS_BLOCK_ACCESS_TOKEN_ENABLE config should be automatically turned on > when security is enabled. > > > Key: HDFS-4278 > URL: https://issues.apache.org/jira/browse/HDFS-4278 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode >Affects Versions: 2.0.0-alpha >Reporter: Harsh J > Labels: newbie > > When enabling security, one has to manually enable the config > DFS_BLOCK_ACCESS_TOKEN_ENABLE (dfs.block.access.token.enable). Since these > two are coupled, we could make it turn itself on automatically if we find > security to be enabled. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
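Both options on the table — deriving the token setting from the security setting (Harsh's proposal) and failing fast when the two disagree (Eli's alternative) — can be sketched as below. The key names mirror the Hadoop ones quoted in the issue; the `Map` is a stand-in for Hadoop's `Configuration`, and the default-derivation logic is an assumption about how such a fix might look, not the committed behavior:

```java
import java.util.Map;

// Sketch of coupling dfs.block.access.token.enable to the security setting.
public class TokenConfigDemo {
    static final String SECURITY = "hadoop.security.authentication";
    static final String TOKENS = "dfs.block.access.token.enable";

    // Option 1: when the operator sets nothing explicitly, default the
    // block-token setting to follow the security setting.
    static boolean blockTokensEnabled(Map<String, String> conf) {
        boolean secure = "kerberos".equals(conf.get(SECURITY));
        String explicit = conf.get(TOKENS);
        if (explicit != null) {
            return Boolean.parseBoolean(explicit); // honor an explicit choice
        }
        return secure; // default follows the security setting
    }

    // Option 2: refuse to start on the unsafe combination instead.
    static void validate(Map<String, String> conf) {
        boolean secure = "kerberos".equals(conf.get(SECURITY));
        if (secure && !blockTokensEnabled(conf)) {
            throw new IllegalStateException(
                TOKENS + " must be true when security is enabled");
        }
    }
}
```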
[jira] [Updated] (HDFS-4278) The DFS_BLOCK_ACCESS_TOKEN_ENABLE config should be automatically turned on when security is enabled.
[ https://issues.apache.org/jira/browse/HDFS-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-4278: -- Labels: newbie (was: ) > The DFS_BLOCK_ACCESS_TOKEN_ENABLE config should be automatically turned on > when security is enabled. > > > Key: HDFS-4278 > URL: https://issues.apache.org/jira/browse/HDFS-4278 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode >Affects Versions: 2.0.0-alpha >Reporter: Harsh J > Labels: newbie > > When enabling security, one has to manually enable the config > DFS_BLOCK_ACCESS_TOKEN_ENABLE (dfs.block.access.token.enable). Since these > two are coupled, we could make it turn itself on automatically if we find > security to be enabled. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4253) block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance
[ https://issues.apache.org/jira/browse/HDFS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Isaacson updated HDFS-4253: Attachment: hdfs4253-5.txt New patch implementing cmccabe's suggestion to presume stable search and shuffle first. > block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance > - > > Key: HDFS-4253 > URL: https://issues.apache.org/jira/browse/HDFS-4253 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0, 2.0.2-alpha >Reporter: Andy Isaacson >Assignee: Andy Isaacson > Attachments: hdfs4253-1.txt, hdfs4253-2.txt, hdfs4253-3.txt, > hdfs4253-4.txt, hdfs4253-5.txt, hdfs4253.txt > > > When many nodes (10) read from the same block simultaneously, we get > asymmetric distribution of read load. This can result in slow block reads > when one replica is serving most of the readers and the other replicas are > idle. The busy DN bottlenecks on its network link. > This is especially visible with large block sizes and high replica counts (I > reproduced the problem with {{-Ddfs.block.size=4294967296}} and replication > 5), but the same behavior happens on a small scale with normal-sized blocks > and replication=3. > The root of the problem is in {{NetworkTopology#pseudoSortByDistance}} which > explicitly does not try to spread traffic among replicas in a given rack -- > it only randomizes usage for off-rack replicas. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4253) block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance
[ https://issues.apache.org/jira/browse/HDFS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543950#comment-13543950 ] Andy Isaacson commented on HDFS-4253: - bq. Just shuffle the array and then call Arrays.sort with your custom comparator. Since Arrays.sort is a stable sort (doesn't re-order equal elements) the randomization will still be there. OK, I think I see your point after squinting at it for a while. I would find that code quite impenetrable and surprising without a comment explaining why, though. And I've taken a shot at writing a good comment and not succeeded. Could you suggest wording to explain the subtle approach you've suggested, so that others can understand this code in the future? > block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance > - > > Key: HDFS-4253 > URL: https://issues.apache.org/jira/browse/HDFS-4253 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0, 2.0.2-alpha >Reporter: Andy Isaacson >Assignee: Andy Isaacson > Attachments: hdfs4253-1.txt, hdfs4253-2.txt, hdfs4253-3.txt, > hdfs4253-4.txt, hdfs4253.txt > > > When many nodes (10) read from the same block simultaneously, we get > asymmetric distribution of read load. This can result in slow block reads > when one replica is serving most of the readers and the other replicas are > idle. The busy DN bottlenecks on its network link. > This is especially visible with large block sizes and high replica counts (I > reproduced the problem with {{-Ddfs.block.size=4294967296}} and replication > 5), but the same behavior happens on a small scale with normal-sized blocks > and replication=3. > The root of the problem is in {{NetworkTopology#pseudoSortByDistance}} which > explicitly does not try to spread traffic among replicas in a given rack -- > it only randomizes usage for off-rack replicas. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
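The shuffle-then-stable-sort trick cmccabe suggested can be sketched as follows. Shuffling first randomizes the order; a stable sort keyed only on network distance then restores the distance ordering while leaving the shuffled order intact among replicas at equal distance — which is exactly the tie-breaking randomization the hot-spot fix needs. The `Replica` type and method name here are hypothetical (Hadoop's actual code operates on `DatanodeInfo` arrays):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Shuffle first, then stable-sort by distance: ties keep their shuffled
// (random) relative order because Collections.sort/List.sort are stable.
public class ShuffleSortDemo {
    static class Replica {
        final String name;
        final int distance; // e.g. 0 = local, 1 = same rack, 2 = off rack
        Replica(String name, int distance) {
            this.name = name;
            this.distance = distance;
        }
    }

    static List<Replica> orderForRead(List<Replica> replicas) {
        List<Replica> copy = new ArrayList<>(replicas);
        Collections.shuffle(copy); // randomize, spreading load across ties
        copy.sort(Comparator.comparingInt(r -> r.distance)); // stable sort
        return copy;
    }
}
```

This is the subtlety Andy asks to have documented: the correctness of the randomization depends on the sort being stable, which `List.sort` guarantees.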
[jira] [Updated] (HDFS-4270) Replications of the highest priority should be allowed to choose a source datanode that has reached its max replication limit
[ https://issues.apache.org/jira/browse/HDFS-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated HDFS-4270: Fix Version/s: 0.23.6 I pulled this into branch-0.23 > Replications of the highest priority should be allowed to choose a source > datanode that has reached its max replication limit > - > > Key: HDFS-4270 > URL: https://issues.apache.org/jira/browse/HDFS-4270 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.0.0, 0.23.5 >Reporter: Derek Dagit >Assignee: Derek Dagit >Priority: Minor > Fix For: 3.0.0, 2.0.3-alpha, 0.23.6 > > Attachments: HDFS-4270-branch-0.23.patch, > HDFS-4270-branch-0.23.patch, HDFS-4270.patch, HDFS-4270.patch, > HDFS-4270.patch, HDFS-4270.patch > > > Blocks that have been identified as under-replicated are placed on one of > several priority queues. The highest priority queue is essentially reserved > for situations in which only one replica of the block exists, meaning it > should be replicated ASAP. > The ReplicationMonitor periodically computes replication work, and a call to > BlockManager#chooseUnderReplicatedBlocks selects a given number of > under-replicated blocks, choosing blocks from the highest-priority queue > first and working down to the lowest priority queue. > In the subsequent call to BlockManager#computeReplicationWorkForBlocks, a > source for the replication is chosen from among datanodes that have an > available copy of the block needed. This is done in > BlockManager#chooseSourceDatanode. > chooseSourceDatanode's job is to choose the datanode for replication. It > chooses a random datanode from the available datanodes that has not reached > its replication limit (preferring datanodes that are currently > decommissioning). > However, the priority queue of the block does not inform the logic. 
If a > datanode holds the last remaining replica of a block and has already reached > its replication limit, the node is dismissed outright and the replication is > not scheduled. > In some situations, this could lead to data loss, as the last remaining > replica could disappear if an opportunity is not taken to schedule a > replication. It would be better to waive the max replication limit in cases > of highest-priority block replication. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
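The committed fix (per the commit message quoted in the Hudson comments) introduced soft and hard limits: ordinary replication work respects the soft limit, while highest-priority blocks — a last remaining replica — may use a source node that is past the soft limit but under the hard limit. A minimal sketch of that decision, with illustrative constants rather than Hadoop's actual defaults:

```java
// Soft/hard replication-limit idea from HDFS-4270: never dismiss the only
// source of a block just because it is busy with routine replication work.
public class ReplicationLimitDemo {
    static final int SOFT_LIMIT = 2; // illustrative values, not HDFS defaults
    static final int HARD_LIMIT = 4;

    // activeStreams: replications this datanode is already serving.
    static boolean isGoodSource(int activeStreams, boolean highestPriority) {
        int limit = highestPriority ? HARD_LIMIT : SOFT_LIMIT;
        return activeStreams < limit;
    }
}
```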
[jira] [Commented] (HDFS-4270) Replications of the highest priority should be allowed to choose a source datanode that has reached its max replication limit
[ https://issues.apache.org/jira/browse/HDFS-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543905#comment-13543905 ] Hudson commented on HDFS-4270: -- Integrated in Hadoop-Mapreduce-trunk #1305 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1305/]) HDFS-4270. Introduce soft and hard limits for max replication so that replications of the highest priority are allowed to choose a source datanode that has reached its soft limit but not the hard limit. Contributed by Derek Dagit (Revision 1428739) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428739 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java > Replications of the highest priority should be allowed to choose a source > datanode that has reached its max replication limit > - > > Key: HDFS-4270 > URL: https://issues.apache.org/jira/browse/HDFS-4270 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.0.0, 0.23.5 >Reporter: Derek Dagit >Assignee: Derek Dagit >Priority: Minor > Fix For: 3.0.0, 2.0.3-alpha > > Attachments: HDFS-4270-branch-0.23.patch, > HDFS-4270-branch-0.23.patch, HDFS-4270.patch, HDFS-4270.patch, > HDFS-4270.patch, HDFS-4270.patch > > > Blocks that have been identified as under-replicated are placed on one of > several priority queues. The highest priority queue is essentially reserved > for situations in which only one replica of the block exists, meaning it > should be replicated ASAP. 
> The ReplicationMonitor periodically computes replication work, and a call to > BlockManager#chooseUnderReplicatedBlocks selects a given number of > under-replicated blocks, choosing blocks from the highest-priority queue > first and working down to the lowest priority queue. > In the subsequent call to BlockManager#computeReplicationWorkForBlocks, a > source for the replication is chosen from among datanodes that have an > available copy of the block needed. This is done in > BlockManager#chooseSourceDatanode. > chooseSourceDatanode's job is to choose the datanode for replication. It > chooses a random datanode from the available datanodes that has not reached > its replication limit (preferring datanodes that are currently > decommissioning). > However, the priority queue of the block does not inform the logic. If a > datanode holds the last remaining replica of a block and has already reached > its replication limit, the node is dismissed outright and the replication is > not scheduled. > In some situations, this could lead to data loss, as the last remaining > replica could disappear if an opportunity is not taken to schedule a > replication. It would be better to waive the max replication limit in cases > of highest-priority block replication. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4302) Precondition in EditLogFileInputStream's length() method is checked too early in NameNode startup, causing fatal exception
[ https://issues.apache.org/jira/browse/HDFS-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543904#comment-13543904 ] Hudson commented on HDFS-4302: -- Integrated in Hadoop-Mapreduce-trunk #1305 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1305/]) HDFS-4302. Fix fatal exception when starting NameNode with DEBUG logs. Contributed by Eugene Koontz. (Revision 1428590) Result = SUCCESS todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428590 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java > Precondition in EditLogFileInputStream's length() method is checked too early > in NameNode startup, causing fatal exception > -- > > Key: HDFS-4302 > URL: https://issues.apache.org/jira/browse/HDFS-4302 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, namenode >Reporter: Eugene Koontz >Assignee: Eugene Koontz > Fix For: 3.0.0, 2.0.3-alpha > > Attachments: HDFS-4302.patch, HDFS-4302.patch > > > When bringing up a namenode in standby mode, where DEBUG is enabled for > namenode, the namenode will hit the following code in > {{hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java}}: > {code} > if (LOG.isDebugEnabled()) { > LOG.debug("edit log length: " + in.length() + ", start txid: " > + expectedStartingTxId + ", last txid: " + lastTxId); > } > {code}. > However, if {{in}} has an {{EditLogFileInputStream}} as its {{streams[0]}}, > this code is hit before the {{EditLogFileInputStream}}'s {{advertizedSize}} > is initialized (before the HTTP client connects to the remote edit log server > (i.e. the journal node)). 
This causes the following precondition to fail in > {{EditLogFileInputStream:length()}}: > {code} > Preconditions.checkState(advertisedSize != -1, > "must get input stream before length is available"); > {code} > which shuts down the namenode with the following log messages and stack trace: > {code} > 2012-12-11 10:45:33,319 DEBUG ipc.ProtobufRpcEngine > (ProtobufRpcEngine.java:invoke(217)) - Call: getEditLogManifest took 88ms > 2012-12-11 10:45:33,336 DEBUG client.QuorumJournalManager > (QuorumJournalManager.java:selectInputStreams(459)) - selectInputStream > manifests: > 172.16.175.1:8485: [[1,3]] > 2012-12-11 10:45:33,351 DEBUG namenode.FSImage > (FSImage.java:loadFSImage(605)) - Planning to load image : > FSImageFile(file=/tmp/hadoop-data/dfs/name/current/fsimage_000, > cpktTxId=000) > 2012-12-11 10:45:33,351 DEBUG namenode.FSImage > (FSImage.java:loadFSImage(607)) - Planning to load edit log stream: > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9 > 2012-12-11 10:45:33,355 INFO namenode.FSImage (FSImageFormat.java:load(168)) > - Loading image file > /tmp/hadoop-data/dfs/name/current/fsimage_000 using no > compression > 2012-12-11 10:45:33,355 INFO namenode.FSImage (FSImageFormat.java:load(171)) > - Number of files = 1 > 2012-12-11 10:45:33,356 INFO namenode.FSImage > (FSImageFormat.java:loadFilesUnderConstruction(383)) - Number of files under > construction = 0 > 2012-12-11 10:45:33,357 INFO namenode.FSImage (FSImageFormat.java:load(193)) > - Image file of size 119 loaded in 0 seconds. 
> 2012-12-11 10:45:33,357 INFO namenode.FSImage > (FSImage.java:loadFSImage(753)) - Loaded image for txid 0 from > /tmp/hadoop-data/dfs/name/current/fsimage_000 > 2012-12-11 10:45:33,357 DEBUG namenode.FSImage (FSImage.java:loadEdits(686)) > - About to load edits: > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9 > 2012-12-11 10:45:33,359 INFO namenode.FSImage (FSImage.java:loadEdits(694)) > - Reading > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9 > expecting start txid #1 > 2012-12-11 10:45:33,361 DEBUG ipc.Client (Client.java:stop(1060)) - Stopping > client > 2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:close(1016)) - IPC > Client (1462017562) connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 > from hdfs/eugenes-macbook-pro.lo...@example.com: closed > 2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:run(848)) - IPC Client > (1462017562) connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 from > hdfs/eugenes-macbook-pro.lo...@example.com: stopped, remaining connections 0 > 2012-12-11 10:45:33,464 FATAL namenode.NameNode (NameNode.java:main(1224)) - > Exception in namenode join > java.la
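The failure mode above can be modeled in a few lines: `length()` is only valid after the stream has been opened and `advertisedSize` set, so eager debug logging must not call it. The repair sketched here (an open-aware description method) is an illustration of the principle, not the exact committed change to FSEditLogLoader:

```java
// Minimal model of HDFS-4302: calling length() before the stream is opened
// trips the precondition and, in the real code, took down the NameNode.
public class EditLogLengthDemo {
    private long advertisedSize = -1;

    void open(long sizeFromServer) {
        advertisedSize = sizeFromServer; // set once the HTTP client connects
    }

    long length() {
        if (advertisedSize == -1) {
            throw new IllegalStateException(
                "must get input stream before length is available");
        }
        return advertisedSize;
    }

    // Safe for DEBUG logging: never calls length() on an unopened stream.
    String describe() {
        return advertisedSize == -1
            ? "edit log length: (not yet open)"
            : "edit log length: " + advertisedSize;
    }
}
```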
[jira] [Commented] (HDFS-4346) Refactor INodeId and GenerationStamp
[ https://issues.apache.org/jira/browse/HDFS-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543900#comment-13543900 ] Hudson commented on HDFS-4346: -- Integrated in Hadoop-Mapreduce-trunk #1305 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1305/]) Add file which was accidentally missed during commit of HDFS-4346. (Revision 1428560) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428560 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/SequentialNumber.java > Refactor INodeId and GenerationStamp > > > Key: HDFS-4346 > URL: https://issues.apache.org/jira/browse/HDFS-4346 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE >Priority: Minor > Fix For: 3.0.0 > > Attachments: h4346_20121231.patch, h4346_20130101.patch, > h4346_20130102.patch > > > The INodeId and GenerationStamp classes are very similar. It is better to > refactor them for code sharing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4352) Encapsulate arguments to BlockReaderFactory in a class
[ https://issues.apache.org/jira/browse/HDFS-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543896#comment-13543896 ] Hudson commented on HDFS-4352: -- Integrated in Hadoop-Mapreduce-trunk #1305 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1305/]) HDFS-4352. Encapsulate arguments to BlockReaderFactory in a class. Contributed by Colin Patrick McCabe. (Revision 1428729) Result = SUCCESS todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428729 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/JspHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/BlockReaderTestUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockTokenWithDFS.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java > Encapsulate arguments to BlockReaderFactory in a class > -- > > Key: HDFS-4352 > URL: https://issues.apache.org/jira/browse/HDFS-4352 > Project: Hadoop HDFS > Issue Type: Sub-task > 
Components: hdfs-client >Affects Versions: 2.0.3-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 3.0.0 > > Attachments: 01b.patch, 01.patch > > > Encapsulate the arguments to BlockReaderFactory in a class to avoid having to > pass around 10+ arguments to a few different functions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
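The parameter-object refactoring described in HDFS-4352 — bundling 10+ positional arguments into one settable class — can be sketched as below. The field and method names are illustrative, not the actual Hadoop API:

```java
// Parameter-object sketch: call sites build one Params object with chained
// setters instead of threading a long positional argument list around.
public class BlockReaderParamsDemo {
    static class Params {
        String fileName;
        long startOffset;
        long length;
        boolean verifyChecksum = true; // sensible default, overridable

        Params setFileName(String f) { fileName = f; return this; }
        Params setStartOffset(long o) { startOffset = o; return this; }
        Params setLength(long l) { length = l; return this; }
        Params setVerifyChecksum(boolean v) { verifyChecksum = v; return this; }
    }

    // Stand-in for the factory method: one argument, however many fields.
    static String newBlockReader(Params p) {
        return p.fileName + "@" + p.startOffset + "+" + p.length;
    }
}
```

Adding a new option later only touches the parameter class, not every caller — the main reason this refactor pays for itself across the nine call-site files listed in the commit.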
[jira] [Commented] (HDFS-4357) After calling replaceSelf, further operations that should be applied on the new INode may be wrongly applied to the original INode
[ https://issues.apache.org/jira/browse/HDFS-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543871#comment-13543871 ] Hudson commented on HDFS-4357: -- Integrated in Hadoop-Hdfs-Snapshots-Branch-build #60 (See [https://builds.apache.org/job/Hadoop-Hdfs-Snapshots-Branch-build/60/]) HDFS-4357. Fix a bug that if an inode is replaced, further INode operations should apply to the new inode. Contributed by Jing Zhao (Revision 1428780) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428780 Files : * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-2802.txt * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectoryWithSnapshot.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotTestHelper.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshot.java > After calling replaceSelf, further operations that should be applied on the > new INode may be wrongly applied to the original INode > -- > > Key: HDFS-4357 > URL: https://issues.apache.org/jira/browse/HDFS-4357 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Jing Zhao >Assignee: Jing Zhao > Fix For: Snapshot (HDFS-2802) > > Attachments: HDFS-4357.001.patch, HDFS-4357.002.patch, > HDFS-4357.003.patch > > > An 
example is in INode#setModificationTime, if the INode is an instance of > INodeDirectory, after replacing itself with a new INodeDirectoryWithSnapshot, > the change of the modification time should happen in the new > INodeDirectoryWithSnapshot instead of the original INodeDirectory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
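The stale-reference pitfall in HDFS-4357 can be reduced to a toy model: `replaceSelf` swaps the inode held by the parent, so any later mutation must go through the returned replacement, not the old local reference. The map-backed "parent" and the names below are illustrative, not the actual INodeDirectory code:

```java
import java.util.Map;

// Toy model of HDFS-4357: mutate the inode returned by replaceSelf, because
// the parent no longer points at the original.
public class ReplaceSelfDemo {
    static class INode {
        long modificationTime;

        // Replaces this inode under the parent and returns the replacement;
        // callers must continue with the returned reference.
        INode replaceSelf(Map<String, INode> parent, String name) {
            INode replacement = new INode();
            replacement.modificationTime = this.modificationTime;
            parent.put(name, replacement); // parent now holds the new inode
            return replacement;
        }
    }

    static void setModificationTime(Map<String, INode> parent,
                                    String name, long mtime) {
        INode node = parent.get(name);
        // Correct: rebind to the replacement before mutating. Mutating the
        // original here would be the bug the issue describes -- the change
        // would land on an inode the parent no longer references.
        node = node.replaceSelf(parent, name);
        node.modificationTime = mtime;
    }
}
```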
[jira] [Commented] (HDFS-4270) Replications of the highest priority should be allowed to choose a source datanode that has reached its max replication limit
[ https://issues.apache.org/jira/browse/HDFS-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543862#comment-13543862 ] Hudson commented on HDFS-4270: -- Integrated in Hadoop-Hdfs-trunk #1275 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1275/]) HDFS-4270. Introduce soft and hard limits for max replication so that replications of the highest priority are allowed to choose a source datanode that has reached its soft limit but not the hard limit. Contributed by Derek Dagit (Revision 1428739) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428739 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java > Replications of the highest priority should be allowed to choose a source > datanode that has reached its max replication limit > - > > Key: HDFS-4270 > URL: https://issues.apache.org/jira/browse/HDFS-4270 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.0.0, 0.23.5 >Reporter: Derek Dagit >Assignee: Derek Dagit >Priority: Minor > Fix For: 3.0.0, 2.0.3-alpha > > Attachments: HDFS-4270-branch-0.23.patch, > HDFS-4270-branch-0.23.patch, HDFS-4270.patch, HDFS-4270.patch, > HDFS-4270.patch, HDFS-4270.patch > > > Blocks that have been identified as under-replicated are placed on one of > several priority queues. The highest priority queue is essentially reserved > for situations in which only one replica of the block exists, meaning it > should be replicated ASAP. 
> The ReplicationMonitor periodically computes replication work, and a call to > BlockManager#chooseUnderReplicatedBlocks selects a given number of > under-replicated blocks, choosing blocks from the highest-priority queue > first and working down to the lowest priority queue. > In the subsequent call to BlockManager#computeReplicationWorkForBlocks, a > source for the replication is chosen from among datanodes that have an > available copy of the block needed. This is done in > BlockManager#chooseSourceDatanode. > chooseSourceDatanode's job is to choose the datanode for replication. It > chooses a random datanode from the available datanodes that has not reached > its replication limit (preferring datanodes that are currently > decommissioning). > However, the priority queue of the block does not inform the logic. If a > datanode holds the last remaining replica of a block and has already reached > its replication limit, the node is dismissed outright and the replication is > not scheduled. > In some situations, this could lead to data loss, as the last remaining > replica could disappear if an opportunity is not taken to schedule a > replication. It would be better to waive the max replication limit in cases > of highest-priority block replication. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4302) Precondition in EditLogFileInputStream's length() method is checked too early in NameNode startup, causing fatal exception
[ https://issues.apache.org/jira/browse/HDFS-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543861#comment-13543861 ] Hudson commented on HDFS-4302: -- Integrated in Hadoop-Hdfs-trunk #1275 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1275/]) HDFS-4302. Fix fatal exception when starting NameNode with DEBUG logs. Contributed by Eugene Koontz. (Revision 1428590) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428590 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java > Precondition in EditLogFileInputStream's length() method is checked too early > in NameNode startup, causing fatal exception > -- > > Key: HDFS-4302 > URL: https://issues.apache.org/jira/browse/HDFS-4302 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, namenode >Reporter: Eugene Koontz >Assignee: Eugene Koontz > Fix For: 3.0.0, 2.0.3-alpha > > Attachments: HDFS-4302.patch, HDFS-4302.patch > > > When bringing up a namenode in standby mode, with DEBUG logging enabled for > the namenode, the namenode will hit the following code in > {{hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java}}: > {code} > if (LOG.isDebugEnabled()) { > LOG.debug("edit log length: " + in.length() + ", start txid: " > + expectedStartingTxId + ", last txid: " + lastTxId); > } > {code} > However, if {{in}} has an {{EditLogFileInputStream}} as its {{streams[0]}}, > this code is hit before the {{EditLogFileInputStream}}'s {{advertisedSize}} > is initialized (before the HTTP client connects to the remote edit log server > (i.e. the journal node)). 
This causes the following precondition to fail in > {{EditLogFileInputStream:length()}}: > {code} > Preconditions.checkState(advertisedSize != -1, > "must get input stream before length is available"); > {code} > which shuts down the namenode with the following log messages and stack trace: > {code} > 2012-12-11 10:45:33,319 DEBUG ipc.ProtobufRpcEngine > (ProtobufRpcEngine.java:invoke(217)) - Call: getEditLogManifest took 88ms > 2012-12-11 10:45:33,336 DEBUG client.QuorumJournalManager > (QuorumJournalManager.java:selectInputStreams(459)) - selectInputStream > manifests: > 172.16.175.1:8485: [[1,3]] > 2012-12-11 10:45:33,351 DEBUG namenode.FSImage > (FSImage.java:loadFSImage(605)) - Planning to load image : > FSImageFile(file=/tmp/hadoop-data/dfs/name/current/fsimage_000, > cpktTxId=000) > 2012-12-11 10:45:33,351 DEBUG namenode.FSImage > (FSImage.java:loadFSImage(607)) - Planning to load edit log stream: > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9 > 2012-12-11 10:45:33,355 INFO namenode.FSImage (FSImageFormat.java:load(168)) > - Loading image file > /tmp/hadoop-data/dfs/name/current/fsimage_000 using no > compression > 2012-12-11 10:45:33,355 INFO namenode.FSImage (FSImageFormat.java:load(171)) > - Number of files = 1 > 2012-12-11 10:45:33,356 INFO namenode.FSImage > (FSImageFormat.java:loadFilesUnderConstruction(383)) - Number of files under > construction = 0 > 2012-12-11 10:45:33,357 INFO namenode.FSImage (FSImageFormat.java:load(193)) > - Image file of size 119 loaded in 0 seconds. 
> 2012-12-11 10:45:33,357 INFO namenode.FSImage > (FSImage.java:loadFSImage(753)) - Loaded image for txid 0 from > /tmp/hadoop-data/dfs/name/current/fsimage_000 > 2012-12-11 10:45:33,357 DEBUG namenode.FSImage (FSImage.java:loadEdits(686)) > - About to load edits: > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9 > 2012-12-11 10:45:33,359 INFO namenode.FSImage (FSImage.java:loadEdits(694)) > - Reading > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9 > expecting start txid #1 > 2012-12-11 10:45:33,361 DEBUG ipc.Client (Client.java:stop(1060)) - Stopping > client > 2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:close(1016)) - IPC > Client (1462017562) connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 > from hdfs/eugenes-macbook-pro.lo...@example.com: closed > 2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:run(848)) - IPC Client > (1462017562) connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 from > hdfs/eugenes-macbook-pro.lo...@example.com: stopped, remaining connections 0 > 2012-12-11 10:45:33,464 FATAL namenode.NameNode (NameNode.java:main(1224)) - > Exception in namenode join > java.lang.Illegal
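The failure mode above reduces to a state-guarded getter being called from logging code before initialization. A minimal standalone sketch (illustrative names, not Hadoop's actual classes) of both the bug and a safe logging-friendly alternative:

```java
// Sketch of the HDFS-4302 pattern: length() enforces a precondition that
// the stream has been opened, so a DEBUG log statement must not call it
// before open() has run. Field and method names here are illustrative.
class EditLogStreamSketch {
    private long advertisedSize = -1;   // -1 until the stream is opened

    void open(long sizeFromServer) {
        this.advertisedSize = sizeFromServer;
    }

    long length() {
        if (advertisedSize == -1) {
            // mirrors Preconditions.checkState(advertisedSize != -1, ...)
            throw new IllegalStateException(
                "must get input stream before length is available");
        }
        return advertisedSize;
    }

    // Safe variant for debug logging: reports "unknown" instead of
    // tripping the precondition when the stream is not yet open.
    String lengthForLogging() {
        return advertisedSize == -1 ? "unknown" : Long.toString(advertisedSize);
    }
}
```

The actual fix restructures the debug logging in FSEditLogLoader so the length is not queried prematurely; the sketch only shows why calling `length()` early is fatal.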
[jira] [Commented] (HDFS-4346) Refactor INodeId and GenerationStamp
[ https://issues.apache.org/jira/browse/HDFS-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543856#comment-13543856 ] Hudson commented on HDFS-4346: -- Integrated in Hadoop-Hdfs-trunk #1275 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1275/]) Add file which was accidentally missed during commit of HDFS-4346. (Revision 1428560) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428560 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/SequentialNumber.java > Refactor INodeId and GenerationStamp > > > Key: HDFS-4346 > URL: https://issues.apache.org/jira/browse/HDFS-4346 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE >Priority: Minor > Fix For: 3.0.0 > > Attachments: h4346_20121231.patch, h4346_20130101.patch, > h4346_20130102.patch > > > The INodeId and GenerationStamp classes are very similar. It is better to > refactor them for code sharing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4352) Encapsulate arguments to BlockReaderFactory in a class
[ https://issues.apache.org/jira/browse/HDFS-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543852#comment-13543852 ] Hudson commented on HDFS-4352: -- Integrated in Hadoop-Hdfs-trunk #1275 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1275/]) HDFS-4352. Encapsulate arguments to BlockReaderFactory in a class. Contributed by Colin Patrick McCabe. (Revision 1428729) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428729 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/JspHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/BlockReaderTestUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockTokenWithDFS.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java > Encapsulate arguments to BlockReaderFactory in a class > -- > > Key: HDFS-4352 > URL: https://issues.apache.org/jira/browse/HDFS-4352 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: 
hdfs-client >Affects Versions: 2.0.3-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 3.0.0 > > Attachments: 01b.patch, 01.patch > > > Encapsulate the arguments to BlockReaderFactory in a class to avoid having to > pass around 10+ arguments to a few different functions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
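The parameter-object idea in this issue can be sketched as a small class with fluent setters, so callers name each value instead of passing 10+ positional arguments. This is an illustrative sketch with hypothetical field names, not the actual BlockReaderFactory.Params from the patch:

```java
// Sketch: callers build a params object and pass it to the factory,
// replacing a long positional argument list.
class BlockReaderParamsSketch {
    String fileName;
    long startOffset;
    long length;
    boolean verifyChecksum = true;   // sensible default

    BlockReaderParamsSketch setFileName(String f) { this.fileName = f; return this; }
    BlockReaderParamsSketch setStartOffset(long o) { this.startOffset = o; return this; }
    BlockReaderParamsSketch setLength(long l) { this.length = l; return this; }
    BlockReaderParamsSketch setVerifyChecksum(boolean v) { this.verifyChecksum = v; return this; }
}
```

One answer to the "which parameters are required?" concern raised in review is to validate the object at the single point of use (or in a `build()` step), so a missing required field fails fast with a clear message rather than being one of a dozen positional slots.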
[jira] [Commented] (HDFS-4227) Document dfs.namenode.resource.*
[ https://issues.apache.org/jira/browse/HDFS-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543814#comment-13543814 ] Hadoop QA commented on HDFS-4227: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12563251/hdfs-4227-2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestLeaseRecovery2 org.apache.hadoop.hdfs.TestFileAppend2 org.apache.hadoop.hdfs.TestFileAppend3 org.apache.hadoop.hdfs.TestDFSRemove org.apache.hadoop.hdfs.security.TestDelegationToken org.apache.hadoop.hdfs.qjournal.server.TestJournalNode {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3730//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3730//console This message is automatically generated. 
> Document dfs.namenode.resource.* > -- > > Key: HDFS-4227 > URL: https://issues.apache.org/jira/browse/HDFS-4227 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Eli Collins >Assignee: Daisuke Kobayashi > Labels: newbie > Attachments: hdfs-4227-1.patch, hdfs-4227-2.patch, HDFS-4227.patch > > > Let's document {{dfs.namenode.resource.*}} in hdfs-default.xml and in a section > of the HDFS docs that covers local directories. > {{dfs.namenode.resource.check.interval}} - the interval in ms at which the > NameNode resource checker runs (default is 5000) > {{dfs.namenode.resource.du.reserved}} - the amount of space to > reserve/require for a NN storage directory (default is 100mb) > {{dfs.namenode.resource.checked.volumes}} - a list of local directories for > the NN resource checker to check in addition to the local edits directories > (default is empty). > {{dfs.namenode.resource.checked.volumes.minimum}} - the minimum number of > redundant NN storage volumes required (default is 1). If no redundant > resources are available, the NN does not enter safe mode as long as sufficient > required resources are available. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
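The four properties listed above might be documented in hdfs-default.xml along these lines. This is a hedged example of the documentation format, not the committed patch; the byte value shown for the reserved space assumes the stated 100 MB default.

```xml
<property>
  <name>dfs.namenode.resource.check.interval</name>
  <value>5000</value>
  <description>Interval in milliseconds at which the NameNode resource
  checker runs.</description>
</property>
<property>
  <name>dfs.namenode.resource.du.reserved</name>
  <value>104857600</value>
  <description>Space, in bytes, to reserve/require for a NameNode storage
  directory (100 MB by default).</description>
</property>
<property>
  <name>dfs.namenode.resource.checked.volumes</name>
  <value></value>
  <description>Additional local directories for the NameNode resource
  checker to check, beyond the local edits directories. Empty by
  default.</description>
</property>
<property>
  <name>dfs.namenode.resource.checked.volumes.minimum</name>
  <value>1</value>
  <description>Minimum number of redundant NameNode storage volumes
  required.</description>
</property>
```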
[jira] [Resolved] (HDFS-4357) After calling replaceSelf, further operations that should be applied on the new INode may be wrongly applied to the original INode
[ https://issues.apache.org/jira/browse/HDFS-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE resolved HDFS-4357. -- Resolution: Fixed Fix Version/s: Snapshot (HDFS-2802) I have committed this. Thanks, Jing! > After calling replaceSelf, further operations that should be applied on the > new INode may be wrongly applied to the original INode > -- > > Key: HDFS-4357 > URL: https://issues.apache.org/jira/browse/HDFS-4357 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Jing Zhao >Assignee: Jing Zhao > Fix For: Snapshot (HDFS-2802) > > Attachments: HDFS-4357.001.patch, HDFS-4357.002.patch, > HDFS-4357.003.patch > > > An example is in INode#setModificationTime, if the INode is an instance of > INodeDirectory, after replacing itself with a new INodeDirectoryWithSnapshot, > the change of the modification time should happen in the new > INodeDirectoryWithSnapshot instead of the original INodeDirectory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4357) After calling replaceSelf, further operations that should be applied on the new INode may be wrongly applied to the original INode
[ https://issues.apache.org/jira/browse/HDFS-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-4357: - Component/s: (was: datanode) Hadoop Flags: Reviewed +1 patch looks good. > After calling replaceSelf, further operations that should be applied on the > new INode may be wrongly applied to the original INode > -- > > Key: HDFS-4357 > URL: https://issues.apache.org/jira/browse/HDFS-4357 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Jing Zhao >Assignee: Jing Zhao > Attachments: HDFS-4357.001.patch, HDFS-4357.002.patch, > HDFS-4357.003.patch > > > An example is in INode#setModificationTime, if the INode is an instance of > INodeDirectory, after replacing itself with a new INodeDirectoryWithSnapshot, > the change of the modification time should happen in the new > INodeDirectoryWithSnapshot instead of the original INodeDirectory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4270) Replications of the highest priority should be allowed to choose a source datanode that has reached its max replication limit
[ https://issues.apache.org/jira/browse/HDFS-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543771#comment-13543771 ] Hudson commented on HDFS-4270: -- Integrated in Hadoop-Yarn-trunk #86 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/86/]) HDFS-4270. Introduce soft and hard limits for max replication so that replications of the highest priority are allowed to choose a source datanode that has reached its soft limit but not the hard limit. Contributed by Derek Dagit (Revision 1428739) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428739 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java > Replications of the highest priority should be allowed to choose a source > datanode that has reached its max replication limit > - > > Key: HDFS-4270 > URL: https://issues.apache.org/jira/browse/HDFS-4270 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.0.0, 0.23.5 >Reporter: Derek Dagit >Assignee: Derek Dagit >Priority: Minor > Fix For: 3.0.0, 2.0.3-alpha > > Attachments: HDFS-4270-branch-0.23.patch, > HDFS-4270-branch-0.23.patch, HDFS-4270.patch, HDFS-4270.patch, > HDFS-4270.patch, HDFS-4270.patch > > > Blocks that have been identified as under-replicated are placed on one of > several priority queues. The highest priority queue is essentially reserved > for situations in which only one replica of the block exists, meaning it > should be replicated ASAP. 
> The ReplicationMonitor periodically computes replication work, and a call to > BlockManager#chooseUnderReplicatedBlocks selects a given number of > under-replicated blocks, choosing blocks from the highest-priority queue > first and working down to the lowest priority queue. > In the subsequent call to BlockManager#computeReplicationWorkForBlocks, a > source for the replication is chosen from among datanodes that have an > available copy of the block needed. This is done in > BlockManager#chooseSourceDatanode. > chooseSourceDatanode's job is to choose the datanode for replication. It > chooses a random datanode from the available datanodes that has not reached > its replication limit (preferring datanodes that are currently > decommissioning). > However, the priority queue of the block does not inform the logic. If a > datanode holds the last remaining replica of a block and has already reached > its replication limit, the node is dismissed outright and the replication is > not scheduled. > In some situations, this could lead to data loss, as the last remaining > replica could disappear if an opportunity is not taken to schedule a > replication. It would be better to waive the max replication limit in cases > of highest-priority block replication. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4302) Precondition in EditLogFileInputStream's length() method is checked too early in NameNode startup, causing fatal exception
[ https://issues.apache.org/jira/browse/HDFS-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543770#comment-13543770 ] Hudson commented on HDFS-4302: -- Integrated in Hadoop-Yarn-trunk #86 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/86/]) HDFS-4302. Fix fatal exception when starting NameNode with DEBUG logs. Contributed by Eugene Koontz. (Revision 1428590) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428590 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java > Precondition in EditLogFileInputStream's length() method is checked too early > in NameNode startup, causing fatal exception > -- > > Key: HDFS-4302 > URL: https://issues.apache.org/jira/browse/HDFS-4302 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, namenode >Reporter: Eugene Koontz >Assignee: Eugene Koontz > Fix For: 3.0.0, 2.0.3-alpha > > Attachments: HDFS-4302.patch, HDFS-4302.patch > > > When bringing up a namenode in standby mode, with DEBUG logging enabled for > the namenode, the namenode will hit the following code in > {{hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java}}: > {code} > if (LOG.isDebugEnabled()) { > LOG.debug("edit log length: " + in.length() + ", start txid: " > + expectedStartingTxId + ", last txid: " + lastTxId); > } > {code} > However, if {{in}} has an {{EditLogFileInputStream}} as its {{streams[0]}}, > this code is hit before the {{EditLogFileInputStream}}'s {{advertisedSize}} > is initialized (before the HTTP client connects to the remote edit log server > (i.e. the journal node)). 
This causes the following precondition to fail in > {{EditLogFileInputStream:length()}}: > {code} > Preconditions.checkState(advertisedSize != -1, > "must get input stream before length is available"); > {code} > which shuts down the namenode with the following log messages and stack trace: > {code} > 2012-12-11 10:45:33,319 DEBUG ipc.ProtobufRpcEngine > (ProtobufRpcEngine.java:invoke(217)) - Call: getEditLogManifest took 88ms > 2012-12-11 10:45:33,336 DEBUG client.QuorumJournalManager > (QuorumJournalManager.java:selectInputStreams(459)) - selectInputStream > manifests: > 172.16.175.1:8485: [[1,3]] > 2012-12-11 10:45:33,351 DEBUG namenode.FSImage > (FSImage.java:loadFSImage(605)) - Planning to load image : > FSImageFile(file=/tmp/hadoop-data/dfs/name/current/fsimage_000, > cpktTxId=000) > 2012-12-11 10:45:33,351 DEBUG namenode.FSImage > (FSImage.java:loadFSImage(607)) - Planning to load edit log stream: > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9 > 2012-12-11 10:45:33,355 INFO namenode.FSImage (FSImageFormat.java:load(168)) > - Loading image file > /tmp/hadoop-data/dfs/name/current/fsimage_000 using no > compression > 2012-12-11 10:45:33,355 INFO namenode.FSImage (FSImageFormat.java:load(171)) > - Number of files = 1 > 2012-12-11 10:45:33,356 INFO namenode.FSImage > (FSImageFormat.java:loadFilesUnderConstruction(383)) - Number of files under > construction = 0 > 2012-12-11 10:45:33,357 INFO namenode.FSImage (FSImageFormat.java:load(193)) > - Image file of size 119 loaded in 0 seconds. 
> 2012-12-11 10:45:33,357 INFO namenode.FSImage > (FSImage.java:loadFSImage(753)) - Loaded image for txid 0 from > /tmp/hadoop-data/dfs/name/current/fsimage_000 > 2012-12-11 10:45:33,357 DEBUG namenode.FSImage (FSImage.java:loadEdits(686)) > - About to load edits: > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9 > 2012-12-11 10:45:33,359 INFO namenode.FSImage (FSImage.java:loadEdits(694)) > - Reading > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9 > expecting start txid #1 > 2012-12-11 10:45:33,361 DEBUG ipc.Client (Client.java:stop(1060)) - Stopping > client > 2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:close(1016)) - IPC > Client (1462017562) connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 > from hdfs/eugenes-macbook-pro.lo...@example.com: closed > 2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:run(848)) - IPC Client > (1462017562) connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 from > hdfs/eugenes-macbook-pro.lo...@example.com: stopped, remaining connections 0 > 2012-12-11 10:45:33,464 FATAL namenode.NameNode (NameNode.java:main(1224)) - > Exception in namenode join > java.lang.IllegalStat
[jira] [Commented] (HDFS-4346) Refactor INodeId and GenerationStamp
[ https://issues.apache.org/jira/browse/HDFS-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543765#comment-13543765 ] Hudson commented on HDFS-4346: -- Integrated in Hadoop-Yarn-trunk #86 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/86/]) Add file which was accidentally missed during commit of HDFS-4346. (Revision 1428560) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428560 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/SequentialNumber.java > Refactor INodeId and GenerationStamp > > > Key: HDFS-4346 > URL: https://issues.apache.org/jira/browse/HDFS-4346 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE >Priority: Minor > Fix For: 3.0.0 > > Attachments: h4346_20121231.patch, h4346_20130101.patch, > h4346_20130102.patch > > > The INodeId and GenerationStamp classes are very similar. It is better to > refactor them for code sharing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4352) Encapsulate arguments to BlockReaderFactory in a class
[ https://issues.apache.org/jira/browse/HDFS-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543761#comment-13543761 ] Hudson commented on HDFS-4352: -- Integrated in Hadoop-Yarn-trunk #86 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/86/]) HDFS-4352. Encapsulate arguments to BlockReaderFactory in a class. Contributed by Colin Patrick McCabe. (Revision 1428729) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428729 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/JspHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/BlockReaderTestUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockTokenWithDFS.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java > Encapsulate arguments to BlockReaderFactory in a class > -- > > Key: HDFS-4352 > URL: https://issues.apache.org/jira/browse/HDFS-4352 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: 
hdfs-client >Affects Versions: 2.0.3-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 3.0.0 > > Attachments: 01b.patch, 01.patch > > > Encapsulate the arguments to BlockReaderFactory in a class to avoid having to > pass around 10+ arguments to a few different functions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4352) Encapsulate arguments to BlockReaderFactory in a class
[ https://issues.apache.org/jira/browse/HDFS-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543752#comment-13543752 ] Tsz Wo (Nicholas), SZE commented on HDFS-4352: -- I may be missing something. This seems to make the code much more confusing: how does the caller determine which parameters to set before passing BlockReaderFactory.Params? For example, which methods require ioStreamPair and which methods do not? > Encapsulate arguments to BlockReaderFactory in a class > -- > > Key: HDFS-4352 > URL: https://issues.apache.org/jira/browse/HDFS-4352 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Affects Versions: 2.0.3-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 3.0.0 > > Attachments: 01b.patch, 01.patch > > > Encapsulate the arguments to BlockReaderFactory in a class to avoid having to > pass around 10+ arguments to a few different functions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4357) After calling replaceSelf, further operations that should be applied on the new INode may be wrongly applied to the original INode
[ https://issues.apache.org/jira/browse/HDFS-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-4357: Attachment: HDFS-4357.003.patch Simplified code in the new patch. > After calling replaceSelf, further operations that should be applied on the > new INode may be wrongly applied to the original INode > -- > > Key: HDFS-4357 > URL: https://issues.apache.org/jira/browse/HDFS-4357 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode >Reporter: Jing Zhao >Assignee: Jing Zhao > Attachments: HDFS-4357.001.patch, HDFS-4357.002.patch, > HDFS-4357.003.patch > > > An example is in INode#setModificationTime, if the INode is an instance of > INodeDirectory, after replacing itself with a new INodeDirectoryWithSnapshot, > the change of the modification time should happen in the new > INodeDirectoryWithSnapshot instead of the original INodeDirectory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4357) After calling replaceSelf, further operations that should be applied on the new INode may be wrongly applied to the original INode
[ https://issues.apache.org/jira/browse/HDFS-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-4357: Attachment: HDFS-4357.002.patch Integrate new testcases that update metadata of directories into TestSnapshot#testSnapshot. Also fix another similar bug for setOwner(..). > After calling replaceSelf, further operations that should be applied on the > new INode may be wrongly applied to the original INode > -- > > Key: HDFS-4357 > URL: https://issues.apache.org/jira/browse/HDFS-4357 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode >Reporter: Jing Zhao >Assignee: Jing Zhao > Attachments: HDFS-4357.001.patch, HDFS-4357.002.patch > > > An example is in INode#setModificationTime, if the INode is an instance of > INodeDirectory, after replacing itself with a new INodeDirectoryWithSnapshot, > the change of the modification time should happen in the new > INodeDirectoryWithSnapshot instead of the original INodeDirectory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4270) Replications of the highest priority should be allowed to choose a source datanode that has reached its max replication limit
[ https://issues.apache.org/jira/browse/HDFS-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543695#comment-13543695 ] Hudson commented on HDFS-4270: -- Integrated in Hadoop-trunk-Commit #3174 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3174/]) HDFS-4270. Introduce soft and hard limits for max replication so that replications of the highest priority are allowed to choose a source datanode that has reached its soft limit but not the hard limit. Contributed by Derek Dagit (Revision 1428739) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1428739 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java > Replications of the highest priority should be allowed to choose a source > datanode that has reached its max replication limit > - > > Key: HDFS-4270 > URL: https://issues.apache.org/jira/browse/HDFS-4270 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.0.0, 0.23.5 >Reporter: Derek Dagit >Assignee: Derek Dagit >Priority: Minor > Fix For: 3.0.0, 2.0.3-alpha > > Attachments: HDFS-4270-branch-0.23.patch, > HDFS-4270-branch-0.23.patch, HDFS-4270.patch, HDFS-4270.patch, > HDFS-4270.patch, HDFS-4270.patch > > > Blocks that have been identified as under-replicated are placed on one of > several priority queues. The highest priority queue is essentially reserved > for situations in which only one replica of the block exists, meaning it > should be replicated ASAP. 
> The ReplicationMonitor periodically computes replication work, and a call to > BlockManager#chooseUnderReplicatedBlocks selects a given number of > under-replicated blocks, choosing blocks from the highest-priority queue > first and working down to the lowest priority queue. > In the subsequent call to BlockManager#computeReplicationWorkForBlocks, a > source for the replication is chosen from among datanodes that have an > available copy of the block needed. This is done in > BlockManager#chooseSourceDatanode. > chooseSourceDatanode's job is to choose the datanode for replication. It > chooses a random datanode from the available datanodes that has not reached > its replication limit (preferring datanodes that are currently > decommissioning). > However, the priority queue of the block does not inform the logic. If a > datanode holds the last remaining replica of a block and has already reached > its replication limit, the node is dismissed outright and the replication is > not scheduled. > In some situations, this could lead to data loss, as the last remaining > replica could disappear if an opportunity is not taken to schedule a > replication. It would be better to waive the max replication limit in cases > of highest-priority block replication. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
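The committed change (per the commit message above) introduces soft and hard limits so that only highest-priority replications may use a source past the soft limit. A minimal self-contained sketch of that selection rule follows; it is illustrative only, not the actual BlockManager code, and the names (SourceChooser, activeStreams) and limit values are hypothetical stand-ins for the real configuration keys:

```java
import java.util.List;

/**
 * Illustrative sketch of HDFS-4270's idea: a source datanode past its
 * soft replication-stream limit may still be chosen, but only for a
 * highest-priority (last-remaining-replica) block, and never past the
 * hard limit. Class and field names are hypothetical.
 */
class SourceChooser {
    static final int QUEUE_HIGHEST_PRIORITY = 0; // block has only one replica left
    static final int SOFT_LIMIT = 2;             // assumed soft max-streams value
    static final int HARD_LIMIT = 4;             // assumed hard cap, never waived

    static class Node {
        final String name;
        final int activeStreams; // replications this node is already serving
        Node(String name, int activeStreams) {
            this.name = name;
            this.activeStreams = activeStreams;
        }
    }

    /** Return the first holder eligible under the priority-aware limit, or null. */
    static Node chooseSource(List<Node> holders, int priority) {
        // Highest-priority blocks are allowed up to the hard limit;
        // everything else must respect the soft limit.
        int limit = (priority == QUEUE_HIGHEST_PRIORITY) ? HARD_LIMIT : SOFT_LIMIT;
        for (Node n : holders) {
            if (n.activeStreams < limit) {
                return n;
            }
        }
        return null; // no eligible source; replication not scheduled this round
    }
}
```

With these assumed limits, a node serving 3 streams is rejected for an ordinary under-replicated block but accepted when the block sits on the highest-priority queue, which is exactly the data-loss window the patch closes.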
[jira] [Updated] (HDFS-4270) Replications of the highest priority should be allowed to choose a source datanode that has reached its max replication limit
[ https://issues.apache.org/jira/browse/HDFS-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-4270: -- Resolution: Fixed Fix Version/s: 2.0.3-alpha, 3.0.0 Status: Resolved (was: Patch Available) I have committed this. Thanks, Derek!
[jira] [Commented] (HDFS-4227) Document dfs.namenode.resource.*
[ https://issues.apache.org/jira/browse/HDFS-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543687#comment-13543687 ] Mark Yang commented on HDFS-4227: - Hi Aaron, thanks for your review. About the description "If no redundant resources are available we ..", I have no idea what it means either after checking the source code. I just removed this line. > Document dfs.namenode.resource.* > -- > > Key: HDFS-4227 > URL: https://issues.apache.org/jira/browse/HDFS-4227 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Eli Collins >Assignee: Daisuke Kobayashi > Labels: newbie > Attachments: hdfs-4227-1.patch, hdfs-4227-2.patch, HDFS-4227.patch > > > Let's document {{dfs.namenode.resource.*}} in hdfs-default.xml and a section in the HDFS docs that covers local directories. > {{dfs.namenode.resource.check.interval}} - the interval in ms at which the > NameNode resource checker runs (default is 5000) > {{dfs.namenode.resource.du.reserved}} - the amount of space to > reserve/require for a NN storage directory (default is 100 MB) > {{dfs.namenode.resource.checked.volumes}} - a list of local directories for > the NN resource checker to check in addition to the local edits directories > (default is empty). > {{dfs.namenode.resource.checked.volumes.minimum}} - the minimum number of > redundant NN storage volumes required (default is 1). If no redundant > resources are available we don't enter safe mode (SM) if there are sufficient required > resources.
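The four properties listed above translate directly into hdfs-default.xml entries. A sketch of what the documented entries could look like; the description wording is paraphrased from this issue, and 104857600 is the assumed byte value of the stated 100 MB default:

```xml
<property>
  <name>dfs.namenode.resource.check.interval</name>
  <value>5000</value>
  <description>Interval in milliseconds at which the NameNode resource
  checker runs.</description>
</property>
<property>
  <name>dfs.namenode.resource.du.reserved</name>
  <value>104857600</value>
  <description>Space, in bytes, to reserve/require for a NameNode storage
  directory (100 MB).</description>
</property>
<property>
  <name>dfs.namenode.resource.checked.volumes</name>
  <value></value>
  <description>List of local directories for the NameNode resource checker
  to check, in addition to the local edits directories.</description>
</property>
<property>
  <name>dfs.namenode.resource.checked.volumes.minimum</name>
  <value>1</value>
  <description>Minimum number of redundant NameNode storage volumes
  required.</description>
</property>
```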