[jira] [Commented] (HDFS-7068) Support multiple block placement policies
[ https://issues.apache.org/jira/browse/HDFS-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14348676#comment-14348676 ] Zesheng Wu commented on HDFS-7068: -- Just go ahead, thanks for your work.
[jira] [Commented] (HDFS-7068) Support multiple block placement policies
[ https://issues.apache.org/jira/browse/HDFS-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136853#comment-14136853 ] Zesheng Wu commented on HDFS-7068: -- bq. Indeed, right now the storage policies in HDFS-6584 only select replica locations based on storage types. I think it should be possible to extend the mechanism to cover placement policies in general. Basically, the BlockStoragePolicy class can include more hints/requirements for the chooseTargets method. I still think storage types and replica locations are two orthogonal things, so extending the block placement policy would be more suitable. Anyway, your suggestion is very valuable; I'm not so familiar with HDFS-6584, so I will spend some time to check it. bq. The use case you gave is very interesting (erasure coding for a subtree). Do you have more examples of what customized placement policies you need? The erasure coding example is what we have encountered in our environment. There's no other obvious case for us so far; maybe other folks can give more examples.
[jira] [Commented] (HDFS-7068) Support multiple block placement policies
[ https://issues.apache.org/jira/browse/HDFS-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136662#comment-14136662 ] Zesheng Wu commented on HDFS-7068: -- [~zhz], Thanks for the reply. To my understanding, storage policies and block placement policies are different things: the former determines which storage type is to be used, while the latter determines where a replica is to be placed.
[jira] [Commented] (HDFS-7044) Support retention policy based on access time and modify time, use XAttr to store policy
[ https://issues.apache.org/jira/browse/HDFS-7044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14135548#comment-14135548 ] Zesheng Wu commented on HDFS-7044: -- bq. The same arguments that said that HDFS-6382 should be outside the NN apply here as well. Yes, couldn't agree more. [~aw] > Support retention policy based on access time and modify time, use XAttr to > store policy > > > Key: HDFS-7044 > URL: https://issues.apache.org/jira/browse/HDFS-7044 > Project: Hadoop HDFS > Issue Type: New Feature > Components: namenode >Reporter: zhaoyunjiong >Assignee: zhaoyunjiong > Attachments: Retention policy design.pdf > > > The basic idea is to set a retention policy on a directory based on access time and > modify time, and to use an XAttr to store the policy. > Files under a directory which has a retention policy will be deleted if they meet the > retention rule. > There are three rules: > # access time > #* If (accessTime + retentionTimeForAccess < now), the file will be deleted > # modify time > #* If (modifyTime + retentionTimeForModify < now), the file will be deleted > # access time and modify time > #* If (accessTime + retentionTimeForAccess < now && modifyTime + > retentionTimeForModify < now), the file will be deleted
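The three retention rules quoted above boil down to simple timestamp arithmetic. A minimal sketch of how they could be evaluated is below; the method and parameter names are hypothetical, not taken from the attached design document.
{code}
// Hypothetical sketch: evaluate the retention rules for one file.
// retentionTimeForAccess / retentionTimeForModify would come from the
// directory's XAttr; a negative value here means "rule not configured".
boolean shouldDelete(long accessTime, long modifyTime,
                     long retentionTimeForAccess, long retentionTimeForModify,
                     long now) {
  boolean accessExpired = retentionTimeForAccess >= 0
      && accessTime + retentionTimeForAccess < now;
  boolean modifyExpired = retentionTimeForModify >= 0
      && modifyTime + retentionTimeForModify < now;
  if (retentionTimeForAccess >= 0 && retentionTimeForModify >= 0) {
    return accessExpired && modifyExpired; // rule 3: both must have expired
  }
  return accessExpired || modifyExpired;   // rule 1 or rule 2
}
{code}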
[jira] [Commented] (HDFS-7044) Support retention policy based on access time and modify time, use XAttr to store policy
[ https://issues.apache.org/jira/browse/HDFS-7044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14135425#comment-14135425 ] Zesheng Wu commented on HDFS-7044: -- bq. HDFS-6382 is a standalone daemon outside the NameNode, HDFS-7044 will be inside the NameNode, I believe HDFS-7044 will be simpler and more efficient. Yes, HDFS-6382 implements a standalone daemon, but it's not hard to start the TtlManager inside the NameNode, much like the trash emptier. What's more important about HDFS-6382 is that it opens up the possibility of implementing a general mechanism which can support various kinds of policies over the namespace.
[jira] [Created] (HDFS-7068) Support multiple block placement policies
Zesheng Wu created HDFS-7068: Summary: Support multiple block placement policies Key: HDFS-7068 URL: https://issues.apache.org/jira/browse/HDFS-7068 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.5.1 Reporter: Zesheng Wu Assignee: Zesheng Wu According to the code, the current implementation of HDFS only supports one specific type of block placement policy, which is BlockPlacementPolicyDefault by default. The default policy is enough for most circumstances, but under some special circumstances it does not work so well. For example, on a shared cluster we want to erasure-encode all the files under some specified directories, so the files under these directories need to use a new placement policy, while at the same time other files still use the default placement policy. Here we need HDFS to support multiple placement policies. One plain thought is that the default placement policy is still configured as the default, and HDFS lets users specify a customized placement policy through extended attributes (xattrs). When HDFS chooses the replica targets, it first checks for a customized placement policy; if none is specified, it falls back to the default one. Any thoughts?
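A minimal sketch of the xattr-based lookup proposed above; the xattr name, map fields, and helper method are illustrative assumptions, not existing HDFS APIs.
{code}
// Hypothetical sketch: resolve the placement policy for a file from an
// xattr set on one of its ancestor directories, falling back to the default.
// "user.block.placement.policy" and getXAttrOnAncestor() are made-up names.
BlockPlacementPolicy choosePolicyFor(INodeFile file) {
  byte[] xattr = getXAttrOnAncestor(file, "user.block.placement.policy");
  if (xattr == null) {
    return defaultPolicy;                 // no customized policy: use default
  }
  String policyClassName = new String(xattr, java.nio.charset.StandardCharsets.UTF_8);
  BlockPlacementPolicy custom = policiesByClassName.get(policyClassName);
  return custom != null ? custom : defaultPolicy;
}
{code}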
[jira] [Commented] (HDFS-6898) DN must reserve space for a full block when an RBW block is created
[ https://issues.apache.org/jira/browse/HDFS-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114847#comment-14114847 ] Zesheng Wu commented on HDFS-6898: -- Oh, sorry, I missed that detail in the patch. One more question, in the append() method: {code} ReplicaBeingWritten newReplicaInfo = new ReplicaBeingWritten( replicaInfo.getBlockId(), replicaInfo.getNumBytes(), newGS, v, newBlkFile.getParentFile(), Thread.currentThread(), estimateBlockLen); {code} This reserves {{estimateBlockLen}} bytes. {code} v.reserveSpaceForRbw(estimateBlockLen - replicaInfo.getNumBytes()); {code} This reserves {{estimateBlockLen - replicaInfo.getNumBytes()}} bytes. What's the difference between these two? > DN must reserve space for a full block when an RBW block is created > --- > > Key: HDFS-6898 > URL: https://issues.apache.org/jira/browse/HDFS-6898 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.5.0 >Reporter: Gopal V >Assignee: Arpit Agarwal > Attachments: HDFS-6898.01.patch, HDFS-6898.03.patch, > HDFS-6898.04.patch, HDFS-6898.05.patch > > > DN will successfully create two RBW blocks on the same volume even if the > free space is sufficient for just one full block. > One or both block writers may subsequently get a DiskOutOfSpace exception. > This can be avoided by allocating space up front.
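One plausible reading of the two numbers, stated as an assumption rather than something confirmed from the patch: the replica object records the full estimated block length it may grow to, while the volume only needs to reserve the bytes that are not yet on disk, since the existing data already occupies space. In arithmetic form:
{code}
// Illustrative arithmetic only, not code from the patch.
long estimateBlockLen = 128L * 1024 * 1024;            // e.g. a full 128 MB block
long bytesAlreadyOnDisk = replicaInfo.getNumBytes();   // data written before the append
long newlyReserved = estimateBlockLen - bytesAlreadyOnDisk;
// The replica is constructed with estimateBlockLen as its eventual size,
// but only newlyReserved additional bytes of free space need to be set
// aside on the volume, because bytesAlreadyOnDisk is already occupied.
{code}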
[jira] [Commented] (HDFS-6898) DN must reserve space for a full block when an RBW block is created
[ https://issues.apache.org/jira/browse/HDFS-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114824#comment-14114824 ] Zesheng Wu commented on HDFS-6898: -- [~arpitagarwal], what about appending data to existing blocks, do we also need to reserve space in that case?
[jira] [Updated] (HDFS-6827) Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the target's status changing sometimes
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6827: - Resolution: Duplicate Status: Resolved (was: Patch Available) Duplicate of HADOOP-10251. > Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the > target's status changing sometimes > -- > > Key: HDFS-6827 > URL: https://issues.apache.org/jira/browse/HDFS-6827 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.4.1 >Reporter: Zesheng Wu >Assignee: Zesheng Wu >Priority: Critical > Attachments: HDFS-6827.1.patch > > > In our production cluster, we encountered a scenario like this: the ANN crashed due > to a write journal timeout and was restarted by the watchdog automatically, > but after restarting both of the NNs were standby. > Following are the logs of the scenario: > # NN1 went down due to a write journal timeout: > {color:red}2014-08-03,23:02:02,219{color} INFO > org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG > # ZKFC1 detected "connection reset by peer": > {color:red}2014-08-03,23:02:02,560{color} ERROR > org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException > as:xx@xx.HADOOP (auth:KERBEROS) cause:java.io.IOException: > {color:red}Connection reset by peer{color} > # NN1 was restarted successfully by the watchdog: > 2014-08-03,23:02:07,884 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > Web-server up at: xx:13201 > 2014-08-03,23:02:07,884 INFO org.apache.hadoop.ipc.Server: IPC Server > Responder: starting > {color:red}2014-08-03,23:02:07,884{color} INFO org.apache.hadoop.ipc.Server: > IPC Server listener on 13200: starting > 2014-08-03,23:02:08,742 INFO org.apache.hadoop.ipc.Server: RPC server clean > thread started! > 2014-08-03,23:02:08,743 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > Registered DFSClientInformation MBean > 2014-08-03,23:02:08,744 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > NameNode up at: xx/xx:13200 > 2014-08-03,23:02:08,744 INFO > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services > required for standby state > # ZKFC1 retried the connection and considered NN1 healthy: > {color:red}2014-08-03,23:02:08,292{color} INFO org.apache.hadoop.ipc.Client: > Retrying connect to server: xx/xx:13200. Already tried 0 time(s); retry > policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1 > SECONDS) > # ZKFC1 still considered NN1 a healthy Active NN and didn't trigger a > failover; as a result, both NNs were standby. > The root cause of this bug is that the NN is restarted too quickly and the ZKFC > health monitor doesn't realize it.
[jira] [Commented] (HDFS-6827) Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the target's status changing sometimes
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110645#comment-14110645 ] Zesheng Wu commented on HDFS-6827: -- [~vinayrpet], I verified your patch from HADOOP-10251 on my cluster; it works as expected. Thanks. I will resolve this issue as a duplicate.
[jira] [Commented] (HDFS-6827) Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the target's status changing sometimes
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106364#comment-14106364 ] Zesheng Wu commented on HDFS-6827: -- Thanks [~vinayrpet], I will try the scenario with the latest trunk code soon.
[jira] [Commented] (HDFS-6827) Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the target's status changing sometimes
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105261#comment-14105261 ] Zesheng Wu commented on HDFS-6827: -- Ping [~vinayrpet]..
[jira] [Commented] (HDFS-6827) Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the target's status changing sometimes
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103557#comment-14103557 ] Zesheng Wu commented on HDFS-6827: -- I went through your patch for HADOOP-10251. {code} private synchronized void setLastServiceStatus(HAServiceStatus status) { this.lastServiceState = status; for (ServiceStateCallback cb : serviceStateCallbacks) { cb.reportServiceStatus(lastServiceState); } } {code} The above method is only called by the {{HealthMonitor}}, so the state check will only be performed during the health check. How does your patch handle the following two scenarios (NN1 is Active, NN2 is Standby)? # NN1 is restarted, but ZKFC1 isn't aware of it; ZKFC1 thinks NN1 is healthy, the last state is Active, and the current state is Standby # A graceful failover is performed from the command line tool; ZKFC1 thinks NN1 is healthy, the last state is Active, and the current state is Standby
[jira] [Commented] (HDFS-6827) Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the target's status changing sometimes
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103505#comment-14103505 ] Zesheng Wu commented on HDFS-6827: -- [~vinayrpet], can you verify it using the reproduction method I described [here | https://issues.apache.org/jira/browse/HDFS-6827?focusedCommentId=14101725&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14101725]? Thanks.
[jira] [Commented] (HDFS-6827) Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the target's status changing sometimes
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103500#comment-14103500 ] Zesheng Wu commented on HDFS-6827: -- I've looked into HADOOP-10251; it's quite different from this issue. The root cause of this issue is that the ANN's ZKFC isn't aware that the ANN has been restarted and doesn't trigger a failover.
[jira] [Commented] (HDFS-6827) Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the target's status changing sometimes
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103278#comment-14103278 ] Zesheng Wu commented on HDFS-6827: -- bq. As I described in the issue description, our NN came up inside about 6 seconds. 1 second for Client#handleConnectionFailure() sleep, the other 5 seconds for some unknown reasons, maybe GC or network problems, we haven't found direct evidences. Sorry, that description is not very accurate. Our NN came up within about 6 seconds, and the ZKFC retried the connection just after the NN had started successfully. There were about 6 seconds between the ZKFC detecting 'Connection reset by peer' and reconnecting to the NN successfully: 1 second is definitely the {{Client#handleConnectionFailure()}} sleep, while the other 5 seconds are due to some unknown reasons, maybe GC or network problems; we haven't found direct evidence.
[jira] [Updated] (HDFS-6827) Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the target's status changing sometimes
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6827: - Summary: Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the target's status changing sometimes (was: Both NameNodes could be in STANDBY State due to HealthMonitor not aware of the target's status changing sometimes)
[jira] [Commented] (HDFS-6827) Both NameNodes could be in STANDBY State due to HealthMonitor not aware of the target's status changing sometimes
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103261#comment-14103261 ] Zesheng Wu commented on HDFS-6827: -- Thanks [~stack] :) bq. Yes. Is there anything more definitive than a check for a particular state transition? (Sorry, don't know this area well). If we want to fix the bug inside ZKFC, there's no other definitive indicator according to my current knowledge of ZKFC. bq. This seems less prone to misinterpretation. Yes, this is more straightforward and less prone to misinterpretation. But changing the {{MonitorHealthResponseProto}} proto may introduce an incompatible change; if folks think this is acceptable, perhaps we can use this method. bq. Your NN came up inside a second? As I described in the issue description, our NN came up within about 6 seconds: 1 second for the {{Client#handleConnectionFailure()}} sleep, and the other 5 seconds for some unknown reasons, maybe GC or network problems; we haven't found direct evidence. bq. A hacky workaround in meantime would have the NN start sleep first for a second? Yes, we can let the NN sleep for some time before startup. Indeed, we used this method to quick-fix the bug in our production cluster temporarily, but for a long-term and general solution we should fix this on the ZKFC side. One more thing: ZKFC is a general automatic HA failover framework; it is used by HDFS, but not only for HDFS, and it may be used by any other system that needs automatic HA failover. From this perspective, we should fix this inside ZKFC.
[jira] [Commented] (HDFS-6827) Both NameNodes could be in STANDBY State due to HealthMonitor not aware of the target's status changing sometimes
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101781#comment-14101781 ] Zesheng Wu commented on HDFS-6827: -- bq. In the below test, is it possible the state gets moved to standby before the service healthy check runs? Would the test get stuck in this case? It's not possible: the state is only changed during health checking or graceful failover, and there's no graceful failover here, so it's only changed by the health monitor. If the health monitor doesn't run, the state won't be moved.
[jira] [Commented] (HDFS-6827) Both NameNodes could be in STANDBY State due to HealthMonitor not aware of the target's status changing sometimes
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101775#comment-14101775 ] Zesheng Wu commented on HDFS-6827: -- bq. BTW: The healthCheckLock is used to distinguish the graceful failover and the above scenario. bq. I don't know this code well. How is the above done? bq. The code in checkServiceStatus seems 'fragile' looking for an explicit transition. Is there a more explicit check that can be done to learn if 'service is restarted'? The solution for this issue is to let the ZKFC learn that the service has been restarted. One straightforward way is to add a field to the {{MonitorHealthResponseProto}} that identifies the service instance, so a restart can be detected; for example, the pid of the NN process or a generated UUID would satisfy our requirement. Another way is to let the ZKFC detect the restart by comparing the service's current state with its last state. We chose the latter; this way we can fix the problem inside ZKFC without influencing other services. As we know, ZKFC supports graceful failover from the command line tool, and during a graceful failover the ZKFC may encounter a scenario like this: the last state of the service is Active, the current state is Standby, and the service is healthy. This scenario looks exactly the same as the buggy scenario described above, so we must distinguish the two. That's why we added the {{healthCheckLock}}: it holds off the health-state check while a graceful failover is in progress. Hope I expressed myself clearly :)
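A minimal sketch of the state-comparison approach described in this comment; the field names and the exact hook point are illustrative assumptions, not taken from the attached HDFS-6827.1.patch.
{code}
// Hypothetical sketch inside ZKFC: if a target we believed to be ACTIVE is
// reported healthy but now claims STANDBY, and no graceful failover is in
// progress, treat it as a restart and re-run the election.
// lastServiceState and gracefulFailoverInProgress are made-up fields.
void checkServiceStatus(HAServiceStatus current) {
  synchronized (healthCheckLock) {
    if (!gracefulFailoverInProgress
        && lastServiceState == HAServiceState.ACTIVE
        && current.getState() == HAServiceState.STANDBY) {
      // The target dropped from ACTIVE to STANDBY on its own: assume it was
      // restarted, so give up the active lock and let a failover happen.
      recheckElectability();
    }
    lastServiceState = current.getState();
  }
}
{code}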
[jira] [Commented] (HDFS-6827) Both NameNodes could be in STANDBY State due to HealthMonitor not aware of the target's status changing sometimes
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101725#comment-14101725 ] Zesheng Wu commented on HDFS-6827: -- Thanks [~stack]. bq. This issue looks a little ugly. NameNodes stuck in standby mode? This production? What did it look like? Yes, both NameNodes were stuck in standby mode, and the HBase cluster on top of it couldn't read or write any more. We can reproduce the issue in the following way: 1. Make the sleep time in {{Client#handleConnectionFailure()}} longer {code} try { Thread.sleep(action.delayMillis); // default is 1s, can change to 10s or longer } catch (InterruptedException e) { throw (IOException)new InterruptedIOException("Interrupted: action=" + action + ", retry policy=" + connectionRetryPolicy).initCause(e); } {code} 2. Restart the active NameNode quickly, and ensure that the NameNode starts successfully before the ZKFC retries the connection.
[jira] [Commented] (HDFS-6827) NameNode double standby
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090541#comment-14090541 ] Zesheng Wu commented on HDFS-6827: -- Hi [~arpitagarwal], can you help review this patch?
[jira] [Updated] (HDFS-6827) NameNode double standby
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6827: - Priority: Critical (was: Major) > NameNode double standby > --- > > Key: HDFS-6827 > URL: https://issues.apache.org/jira/browse/HDFS-6827 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.4.1 >Reporter: Zesheng Wu >Assignee: Zesheng Wu >Priority: Critical > Attachments: HDFS-6827.1.patch > > > In our production cluster, we encounter a scenario like this: ANN crashed due > to write journal timeout, and was restarted by the watchdog automatically, > but after restarting both of the NNs are standby. > Following is the logs of the scenario: > # NN1 is down due to write journal timeout: > {color:red}2014-08-03,23:02:02,219{color} INFO > org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG > # ZKFC1 detected "connection reset by peer" > {color:red}2014-08-03,23:02:02,560{color} ERROR > org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException > as:xx@xx.HADOOP (auth:KERBEROS) cause:java.io.IOException: > {color:red}Connection reset by peer{color} > # NN1 wat restarted successfully by the watchdog: > 2014-08-03,23:02:07,884 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > Web-server up at: xx:13201 > 2014-08-03,23:02:07,884 INFO org.apache.hadoop.ipc.Server: IPC Server > Responder: starting > {color:red}2014-08-03,23:02:07,884{color} INFO org.apache.hadoop.ipc.Server: > IPC Server listener on 13200: starting > 2014-08-03,23:02:08,742 INFO org.apache.hadoop.ipc.Server: RPC server clean > thread started! > 2014-08-03,23:02:08,743 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > Registered DFSClientInformation MBean > 2014-08-03,23:02:08,744 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > NameNode up at: xx/xx:13200 > 2014-08-03,23:02:08,744 INFO > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services > required for standby state > # ZKFC1 retried the connection and considered NN1 was healthy > {color:red}2014-08-03,23:02:08,292{color} INFO org.apache.hadoop.ipc.Client: > Retrying connect to server: xx/xx:13200. Already tried 0 time(s); retry > policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1 > SECONDS) > # ZKFC1 still considered NN1 as a healthy Active NN, and didn't trigger the > failover, as a result, both NNs were standby. > The root cause of this bug is that NN is restarted too quickly and ZKFC > health monitor doesn't realize that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6827) NameNode double standby
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14087395#comment-14087395 ] Zesheng Wu commented on HDFS-6827: -- Submit a patch to fix this bug. BTW: The {{healthCheckLock}} is used to distinguish the graceful failover and the above scenario. > NameNode double standby > --- > > Key: HDFS-6827 > URL: https://issues.apache.org/jira/browse/HDFS-6827 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.4.1 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6827.1.patch > > > In our production cluster, we encounter a scenario like this: ANN crashed due > to write journal timeout, and was restarted by the watchdog automatically, > but after restarting both of the NNs are standby. > Following is the logs of the scenario: > # NN1 is down due to write journal timeout: > {color:red}2014-08-03,23:02:02,219{color} INFO > org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG > # ZKFC1 detected "connection reset by peer" > {color:red}2014-08-03,23:02:02,560{color} ERROR > org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException > as:xx@xx.HADOOP (auth:KERBEROS) cause:java.io.IOException: > {color:red}Connection reset by peer{color} > # NN1 wat restarted successfully by the watchdog: > 2014-08-03,23:02:07,884 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > Web-server up at: xx:13201 > 2014-08-03,23:02:07,884 INFO org.apache.hadoop.ipc.Server: IPC Server > Responder: starting > {color:red}2014-08-03,23:02:07,884{color} INFO org.apache.hadoop.ipc.Server: > IPC Server listener on 13200: starting > 2014-08-03,23:02:08,742 INFO org.apache.hadoop.ipc.Server: RPC server clean > thread started! > 2014-08-03,23:02:08,743 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > Registered DFSClientInformation MBean > 2014-08-03,23:02:08,744 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > NameNode up at: xx/xx:13200 > 2014-08-03,23:02:08,744 INFO > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services > required for standby state > # ZKFC1 retried the connection and considered NN1 was healthy > {color:red}2014-08-03,23:02:08,292{color} INFO org.apache.hadoop.ipc.Client: > Retrying connect to server: xx/xx:13200. Already tried 0 time(s); retry > policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1 > SECONDS) > # ZKFC1 still considered NN1 as a healthy Active NN, and didn't trigger the > failover, as a result, both NNs were standby. > The root cause of this bug is that NN is restarted too quickly and ZKFC > health monitor doesn't realize that. -- This message was sent by Atlassian JIRA (v6.2#6252)
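As a rough illustration of the {{healthCheckLock}} idea (an assumption about the intent, not the attached patch), the transition triggered by the health monitor and the graceful-failover path can be serialized on a single lock so they cannot interleave:
{code}
// Hypothetical sketch, not HDFS-6827.1.patch: serialize health-monitor transitions
// and operator-initiated graceful failover on one lock.
class FailoverCoordinator {
  private final Object healthCheckLock = new Object();

  void onHealthCheckResult(boolean healthy) {
    synchronized (healthCheckLock) {
      if (!healthy) {
        // health-monitor path: quit/rejoin the election, trigger failover, etc.
      }
    }
  }

  void gracefulFailover() {
    synchronized (healthCheckLock) {
      // manual failover runs under the same lock, so it cannot race with
      // onHealthCheckResult()
    }
  }
}
{code}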
[jira] [Updated] (HDFS-6827) NameNode double standby
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6827: - Status: Patch Available (was: Open) > NameNode double standby > --- > > Key: HDFS-6827 > URL: https://issues.apache.org/jira/browse/HDFS-6827 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.4.1 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6827.1.patch > > > In our production cluster, we encounter a scenario like this: ANN crashed due > to write journal timeout, and was restarted by the watchdog automatically, > but after restarting both of the NNs are standby. > Following is the logs of the scenario: > # NN1 is down due to write journal timeout: > {color:red}2014-08-03,23:02:02,219{color} INFO > org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG > # ZKFC1 detected "connection reset by peer" > {color:red}2014-08-03,23:02:02,560{color} ERROR > org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException > as:xx@xx.HADOOP (auth:KERBEROS) cause:java.io.IOException: > {color:red}Connection reset by peer{color} > # NN1 wat restarted successfully by the watchdog: > 2014-08-03,23:02:07,884 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > Web-server up at: xx:13201 > 2014-08-03,23:02:07,884 INFO org.apache.hadoop.ipc.Server: IPC Server > Responder: starting > {color:red}2014-08-03,23:02:07,884{color} INFO org.apache.hadoop.ipc.Server: > IPC Server listener on 13200: starting > 2014-08-03,23:02:08,742 INFO org.apache.hadoop.ipc.Server: RPC server clean > thread started! > 2014-08-03,23:02:08,743 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > Registered DFSClientInformation MBean > 2014-08-03,23:02:08,744 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > NameNode up at: xx/xx:13200 > 2014-08-03,23:02:08,744 INFO > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services > required for standby state > # ZKFC1 retried the connection and considered NN1 was healthy > {color:red}2014-08-03,23:02:08,292{color} INFO org.apache.hadoop.ipc.Client: > Retrying connect to server: xx/xx:13200. Already tried 0 time(s); retry > policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1 > SECONDS) > # ZKFC1 still considered NN1 as a healthy Active NN, and didn't trigger the > failover, as a result, both NNs were standby. > The root cause of this bug is that NN is restarted too quickly and ZKFC > health monitor doesn't realize that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6827) NameNode double standby
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6827: - Attachment: HDFS-6827.1.patch > NameNode double standby > --- > > Key: HDFS-6827 > URL: https://issues.apache.org/jira/browse/HDFS-6827 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.4.1 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6827.1.patch > > > In our production cluster, we encounter a scenario like this: ANN crashed due > to write journal timeout, and was restarted by the watchdog automatically, > but after restarting both of the NNs are standby. > Following is the logs of the scenario: > # NN1 is down due to write journal timeout: > {color:red}2014-08-03,23:02:02,219{color} INFO > org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG > # ZKFC1 detected "connection reset by peer" > {color:red}2014-08-03,23:02:02,560{color} ERROR > org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException > as:xx@xx.HADOOP (auth:KERBEROS) cause:java.io.IOException: > {color:red}Connection reset by peer{color} > # NN1 wat restarted successfully by the watchdog: > 2014-08-03,23:02:07,884 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > Web-server up at: xx:13201 > 2014-08-03,23:02:07,884 INFO org.apache.hadoop.ipc.Server: IPC Server > Responder: starting > {color:red}2014-08-03,23:02:07,884{color} INFO org.apache.hadoop.ipc.Server: > IPC Server listener on 13200: starting > 2014-08-03,23:02:08,742 INFO org.apache.hadoop.ipc.Server: RPC server clean > thread started! > 2014-08-03,23:02:08,743 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > Registered DFSClientInformation MBean > 2014-08-03,23:02:08,744 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > NameNode up at: xx/xx:13200 > 2014-08-03,23:02:08,744 INFO > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services > required for standby state > # ZKFC1 retried the connection and considered NN1 was healthy > {color:red}2014-08-03,23:02:08,292{color} INFO org.apache.hadoop.ipc.Client: > Retrying connect to server: xx/xx:13200. Already tried 0 time(s); retry > policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1 > SECONDS) > # ZKFC1 still considered NN1 as a healthy Active NN, and didn't trigger the > failover, as a result, both NNs were standby. > The root cause of this bug is that NN is restarted too quickly and ZKFC > health monitor doesn't realize that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6827) NameNode double standby
Zesheng Wu created HDFS-6827: Summary: NameNode double standby Key: HDFS-6827 URL: https://issues.apache.org/jira/browse/HDFS-6827 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.1 Reporter: Zesheng Wu Assignee: Zesheng Wu In our production cluster, we encounter a scenario like this: ANN crashed due to write journal timeout, and was restarted by the watchdog automatically, but after restarting both of the NNs are standby. Following is the logs of the scenario: # NN1 is down due to write journal timeout: {color:red}2014-08-03,23:02:02,219{color} INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG # ZKFC1 detected "connection reset by peer" {color:red}2014-08-03,23:02:02,560{color} ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:xx@xx.HADOOP (auth:KERBEROS) cause:java.io.IOException: {color:red}Connection reset by peer{color} # NN1 wat restarted successfully by the watchdog: 2014-08-03,23:02:07,884 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at: xx:13201 2014-08-03,23:02:07,884 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting {color:red}2014-08-03,23:02:07,884{color} INFO org.apache.hadoop.ipc.Server: IPC Server listener on 13200: starting 2014-08-03,23:02:08,742 INFO org.apache.hadoop.ipc.Server: RPC server clean thread started! 2014-08-03,23:02:08,743 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Registered DFSClientInformation MBean 2014-08-03,23:02:08,744 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: NameNode up at: xx/xx:13200 2014-08-03,23:02:08,744 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for standby state # ZKFC1 retried the connection and considered NN1 was healthy {color:red}2014-08-03,23:02:08,292{color} INFO org.apache.hadoop.ipc.Client: Retrying connect to server: xx/xx:13200. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1 SECONDS) # ZKFC1 still considered NN1 as a healthy Active NN, and didn't trigger the failover, as a result, both NNs were standby. The root cause of this bug is that NN is restarted too quickly and ZKFC health monitor doesn't realize that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6526) Implement HDFS TtlManager
[ https://issues.apache.org/jira/browse/HDFS-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075238#comment-14075238 ] Zesheng Wu commented on HDFS-6526: -- Thanks for the feedback, [~daryn]. # About scalability and performance issues: currently the default implementation simply traverses the whole directory tree; we can improve this so that each instance traverses a sub directory tree, and multiple instances can be deployed on demand. # About bugs: can you point them out in detail? I would like to figure them out as soon as possible. # bq. Would you please elaborate on whether you plan to simply have the trash emptier and ttl manager run as distinct services in the same adjunct process? Or do you plan on the emptier actually leveraging/relying on ttls? As I mentioned in the design document in HDFS-6382, we want to supply a general mechanism that can run various kinds of policies on the namespace; for example, TTL is one of the policies, intended to clean up expired files and directories. In the same way, we can extend this to implement a trash policy that works the same way as the current trash emptier. If folks agree that moving the trash emptier is reasonable, we will change 'TtlManager' to a more general name. # bq. As a general comment, a feature like this is attractive but may be highly dangerous. You many want to consider means to safeguard against a severely skewed system clock, else the ttl manager might go on mass murder spree in the filesystem Yes, I very much agree with you on this. The current patch is just an initial implementation; we will add this safeguard in a future iteration. Hope I expressed myself clearly :) > Implement HDFS TtlManager > - > > Key: HDFS-6526 > URL: https://issues.apache.org/jira/browse/HDFS-6526 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, namenode >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6526.1.patch > > > This issue is used to track development of HDFS TtlManager, for details see > HDFS-6382. -- This message was sent by Atlassian JIRA (v6.2#6252)
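Point 1 above refers to a single instance that walks the whole namespace. A minimal sketch of such a scan, assuming the per-path TTL is stored in a user xattr (the {{user.ttl}} key and the seconds-as-string encoding are assumptions for illustration, not the patch's actual format):
{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SimpleTtlScanner {
  private static final String TTL_XATTR = "user.ttl";   // assumed key, illustration only

  public static void scan(FileSystem fs, Path dir) throws IOException {
    for (FileStatus st : fs.listStatus(dir)) {
      byte[] raw = null;
      try {
        raw = fs.getXAttr(st.getPath(), TTL_XATTR);
      } catch (IOException ignored) {
        // no TTL xattr set on this entry
      }
      if (raw != null) {
        long ttlMs = Long.parseLong(new String(raw, StandardCharsets.UTF_8)) * 1000L;
        if (System.currentTimeMillis() - st.getModificationTime() > ttlMs) {
          fs.delete(st.getPath(), true);   // expired: delete (or move to trash)
          continue;
        }
      }
      if (st.isDirectory()) {
        scan(fs, st.getPath());            // full-tree traversal, as in the current patch
      }
    }
  }

  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    scan(fs, new Path(args[0]));
  }
}
{code}
A real implementation would also honor the trash and clock-skew safeguards discussed above and could partition the tree across instances for scalability.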
[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067254#comment-14067254 ] Zesheng Wu commented on HDFS-6382: -- Thanks [~cutting] I will move the trash emptier into TTL daemon after HDFS-6525 and HDFS-6526 are resolved. > HDFS File/Directory TTL > --- > > Key: HDFS-6382 > URL: https://issues.apache.org/jira/browse/HDFS-6382 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client, namenode >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-TTL-Design -2.pdf, HDFS-TTL-Design-3.pdf, > HDFS-TTL-Design.pdf > > > In production environment, we always have scenario like this, we want to > backup files on hdfs for some time and then hope to delete these files > automatically. For example, we keep only 1 day's logs on local disk due to > limited disk space, but we need to keep about 1 month's logs in order to > debug program bugs, so we keep all the logs on hdfs and delete logs which are > older than 1 month. This is a typical scenario of HDFS TTL. So here we > propose that hdfs can support TTL. > Following are some details of this proposal: > 1. HDFS can support TTL on a specified file or directory > 2. If a TTL is set on a file, the file will be deleted automatically after > the TTL is expired > 3. If a TTL is set on a directory, the child files and directories will be > deleted automatically after the TTL is expired > 4. The child file/directory's TTL configuration should override its parent > directory's > 5. A global configuration is needed to configure that whether the deleted > files/directories should go to the trash or not > 6. A global configuration is needed to configure that whether a directory > with TTL should be deleted when it is emptied by TTL mechanism or not. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6596) Improve InputStream when read spans two blocks
[ https://issues.apache.org/jira/browse/HDFS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6596: - Attachment: HDFS-6596.3.patch > Improve InputStream when read spans two blocks > -- > > Key: HDFS-6596 > URL: https://issues.apache.org/jira/browse/HDFS-6596 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6596.1.patch, HDFS-6596.2.patch, HDFS-6596.2.patch, > HDFS-6596.2.patch, HDFS-6596.3.patch, HDFS-6596.3.patch > > > In the current implementation of DFSInputStream, read(buffer, offset, length) > is implemented as following: > {code} > int realLen = (int) Math.min(len, (blockEnd - pos + 1L)); > if (locatedBlocks.isLastBlockComplete()) { > realLen = (int) Math.min(realLen, locatedBlocks.getFileLength()); > } > int result = readBuffer(strategy, off, realLen, corruptedBlockMap); > {code} > From the above code, we can conclude that the read will return at most > (blockEnd - pos + 1) bytes. As a result, when read spans two blocks, the > caller must call read() second time to complete the request, and must wait > second time to acquire the DFSInputStream lock(read() is synchronized for > DFSInputStream). For latency sensitive applications, such as hbase, this will > result in latency pain point when they under massive race conditions. So here > we propose that we should loop internally in read() to do best effort read. > In the current implementation of pread(read(position, buffer, offset, > lenght)), it does loop internally to do best effort read. So we can refactor > to support this on normal read. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6596) Improve InputStream when read spans two blocks
[ https://issues.apache.org/jira/browse/HDFS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6596: - Attachment: HDFS-6596.3.patch Fixed javadoc warnings. > Improve InputStream when read spans two blocks > -- > > Key: HDFS-6596 > URL: https://issues.apache.org/jira/browse/HDFS-6596 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6596.1.patch, HDFS-6596.2.patch, HDFS-6596.2.patch, > HDFS-6596.2.patch, HDFS-6596.3.patch > > > In the current implementation of DFSInputStream, read(buffer, offset, length) > is implemented as following: > {code} > int realLen = (int) Math.min(len, (blockEnd - pos + 1L)); > if (locatedBlocks.isLastBlockComplete()) { > realLen = (int) Math.min(realLen, locatedBlocks.getFileLength()); > } > int result = readBuffer(strategy, off, realLen, corruptedBlockMap); > {code} > From the above code, we can conclude that the read will return at most > (blockEnd - pos + 1) bytes. As a result, when read spans two blocks, the > caller must call read() second time to complete the request, and must wait > second time to acquire the DFSInputStream lock(read() is synchronized for > DFSInputStream). For latency sensitive applications, such as hbase, this will > result in latency pain point when they under massive race conditions. So here > we propose that we should loop internally in read() to do best effort read. > In the current implementation of pread(read(position, buffer, offset, > lenght)), it does loop internally to do best effort read. So we can refactor > to support this on normal read. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6596) Improve InputStream when read spans two blocks
[ https://issues.apache.org/jira/browse/HDFS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6596: - Attachment: HDFS-6596.2.patch Mmm, Unrelated test failure, just resubmit... > Improve InputStream when read spans two blocks > -- > > Key: HDFS-6596 > URL: https://issues.apache.org/jira/browse/HDFS-6596 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6596.1.patch, HDFS-6596.2.patch, HDFS-6596.2.patch, > HDFS-6596.2.patch > > > In the current implementation of DFSInputStream, read(buffer, offset, length) > is implemented as following: > {code} > int realLen = (int) Math.min(len, (blockEnd - pos + 1L)); > if (locatedBlocks.isLastBlockComplete()) { > realLen = (int) Math.min(realLen, locatedBlocks.getFileLength()); > } > int result = readBuffer(strategy, off, realLen, corruptedBlockMap); > {code} > From the above code, we can conclude that the read will return at most > (blockEnd - pos + 1) bytes. As a result, when read spans two blocks, the > caller must call read() second time to complete the request, and must wait > second time to acquire the DFSInputStream lock(read() is synchronized for > DFSInputStream). For latency sensitive applications, such as hbase, this will > result in latency pain point when they under massive race conditions. So here > we propose that we should loop internally in read() to do best effort read. > In the current implementation of pread(read(position, buffer, offset, > lenght)), it does loop internally to do best effort read. So we can refactor > to support this on normal read. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6596) Improve InputStream when read spans two blocks
[ https://issues.apache.org/jira/browse/HDFS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6596: - Attachment: HDFS-6596.2.patch Failed test is unrelated, just resubmit the patch to trigger Jenkins. > Improve InputStream when read spans two blocks > -- > > Key: HDFS-6596 > URL: https://issues.apache.org/jira/browse/HDFS-6596 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6596.1.patch, HDFS-6596.2.patch, HDFS-6596.2.patch > > > In the current implementation of DFSInputStream, read(buffer, offset, length) > is implemented as following: > {code} > int realLen = (int) Math.min(len, (blockEnd - pos + 1L)); > if (locatedBlocks.isLastBlockComplete()) { > realLen = (int) Math.min(realLen, locatedBlocks.getFileLength()); > } > int result = readBuffer(strategy, off, realLen, corruptedBlockMap); > {code} > From the above code, we can conclude that the read will return at most > (blockEnd - pos + 1) bytes. As a result, when read spans two blocks, the > caller must call read() second time to complete the request, and must wait > second time to acquire the DFSInputStream lock(read() is synchronized for > DFSInputStream). For latency sensitive applications, such as hbase, this will > result in latency pain point when they under massive race conditions. So here > we propose that we should loop internally in read() to do best effort read. > In the current implementation of pread(read(position, buffer, offset, > lenght)), it does loop internally to do best effort read. So we can refactor > to support this on normal read. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6596) Improve InputStream when read spans two blocks
[ https://issues.apache.org/jira/browse/HDFS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6596: - Attachment: HDFS-6596.2.patch Uploaded a polished version. > Improve InputStream when read spans two blocks > -- > > Key: HDFS-6596 > URL: https://issues.apache.org/jira/browse/HDFS-6596 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6596.1.patch, HDFS-6596.2.patch > > > In the current implementation of DFSInputStream, read(buffer, offset, length) > is implemented as following: > {code} > int realLen = (int) Math.min(len, (blockEnd - pos + 1L)); > if (locatedBlocks.isLastBlockComplete()) { > realLen = (int) Math.min(realLen, locatedBlocks.getFileLength()); > } > int result = readBuffer(strategy, off, realLen, corruptedBlockMap); > {code} > From the above code, we can conclude that the read will return at most > (blockEnd - pos + 1) bytes. As a result, when read spans two blocks, the > caller must call read() second time to complete the request, and must wait > second time to acquire the DFSInputStream lock(read() is synchronized for > DFSInputStream). For latency sensitive applications, such as hbase, this will > result in latency pain point when they under massive race conditions. So here > we propose that we should loop internally in read() to do best effort read. > In the current implementation of pread(read(position, buffer, offset, > lenght)), it does loop internally to do best effort read. So we can refactor > to support this on normal read. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6596) Improve InputStream when read spans two blocks
[ https://issues.apache.org/jira/browse/HDFS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6596: - Status: Patch Available (was: Open) > Improve InputStream when read spans two blocks > -- > > Key: HDFS-6596 > URL: https://issues.apache.org/jira/browse/HDFS-6596 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6596.1.patch > > > In the current implementation of DFSInputStream, read(buffer, offset, length) > is implemented as following: > {code} > int realLen = (int) Math.min(len, (blockEnd - pos + 1L)); > if (locatedBlocks.isLastBlockComplete()) { > realLen = (int) Math.min(realLen, locatedBlocks.getFileLength()); > } > int result = readBuffer(strategy, off, realLen, corruptedBlockMap); > {code} > From the above code, we can conclude that the read will return at most > (blockEnd - pos + 1) bytes. As a result, when read spans two blocks, the > caller must call read() second time to complete the request, and must wait > second time to acquire the DFSInputStream lock(read() is synchronized for > DFSInputStream). For latency sensitive applications, such as hbase, this will > result in latency pain point when they under massive race conditions. So here > we propose that we should loop internally in read() to do best effort read. > In the current implementation of pread(read(position, buffer, offset, > lenght)), it does loop internally to do best effort read. So we can refactor > to support this on normal read. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6596) Improve InputStream when read spans two blocks
[ https://issues.apache.org/jira/browse/HDFS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6596: - Attachment: HDFS-6596.1.patch An initial patch. Since {{DataInputStream#readFully(byte[], int, int)}} is final and {{FSDataInputStream}} can't override it, we implement a readFully with ByteBuffer. > Improve InputStream when read spans two blocks > -- > > Key: HDFS-6596 > URL: https://issues.apache.org/jira/browse/HDFS-6596 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6596.1.patch > > > In the current implementation of DFSInputStream, read(buffer, offset, length) > is implemented as following: > {code} > int realLen = (int) Math.min(len, (blockEnd - pos + 1L)); > if (locatedBlocks.isLastBlockComplete()) { > realLen = (int) Math.min(realLen, locatedBlocks.getFileLength()); > } > int result = readBuffer(strategy, off, realLen, corruptedBlockMap); > {code} > From the above code, we can conclude that the read will return at most > (blockEnd - pos + 1) bytes. As a result, when read spans two blocks, the > caller must call read() second time to complete the request, and must wait > second time to acquire the DFSInputStream lock(read() is synchronized for > DFSInputStream). For latency sensitive applications, such as hbase, this will > result in latency pain point when they under massive race conditions. So here > we propose that we should loop internally in read() to do best effort read. > In the current implementation of pread(read(position, buffer, offset, > lenght)), it does loop internally to do best effort read. So we can refactor > to support this on normal read. -- This message was sent by Atlassian JIRA (v6.2#6252)
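For reference, the shape of such a ByteBuffer-based readFully is roughly the following (an illustrative sketch, not the attached patch):
{code}
import java.io.IOException;
import java.nio.ByteBuffer;
import org.apache.hadoop.fs.FSDataInputStream;

// Illustrative sketch: loop over read(ByteBuffer) until the buffer is full or EOF,
// so a single call can span a block boundary.
public final class ReadFullyUtil {
  public static int readFully(FSDataInputStream in, ByteBuffer buf) throws IOException {
    int total = 0;
    while (buf.hasRemaining()) {
      // FSDataInputStream exposes read(ByteBuffer) via ByteBufferReadable;
      // DFSInputStream supports it.
      int n = in.read(buf);
      if (n < 0) {
        break;                     // hit EOF before the buffer was filled
      }
      total += n;
    }
    return total;
  }

  private ReadFullyUtil() {}
}
{code}
Looping until the buffer is drained is what lets a single caller-visible call cross a block boundary.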
[jira] [Commented] (HDFS-6596) Improve InputStream when read spans two blocks
[ https://issues.apache.org/jira/browse/HDFS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14042986#comment-14042986 ] Zesheng Wu commented on HDFS-6596: -- Thanks Colin. bq. What you are proposing is basically making every {{read}} into a {{readFully}}. I don't think we want to increase the number of differences between how DFSInputStream works and how "normal" Java input streams work. The "normal" java behavior also has a good reason behind it... clients who can deal with partial reads will get a faster response time if the stream just returns what it can rather than waiting for everything. In the case of HDFS, waiting for everything might mean connecting to a remote DataNode. This could be quite a lot of latency. I agree with you that we shouldn't make every {{read}} into a {{readFully}}, and the current implementation of {{read}} has its advantage as you described. About the solution, I think doing it in Hadoop will be better, because all users will benefit. The current {{readFully}} for DFSInputStream is implemented as pread and inherits from FSInputStream, so I will add a new {{readFully(buffer, offset, length)}} to figure this out. Any thoughts? > Improve InputStream when read spans two blocks > -- > > Key: HDFS-6596 > URL: https://issues.apache.org/jira/browse/HDFS-6596 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > > In the current implementation of DFSInputStream, read(buffer, offset, length) > is implemented as following: > {code} > int realLen = (int) Math.min(len, (blockEnd - pos + 1L)); > if (locatedBlocks.isLastBlockComplete()) { > realLen = (int) Math.min(realLen, locatedBlocks.getFileLength()); > } > int result = readBuffer(strategy, off, realLen, corruptedBlockMap); > {code} > From the above code, we can conclude that the read will return at most > (blockEnd - pos + 1) bytes. As a result, when read spans two blocks, the > caller must call read() second time to complete the request, and must wait > second time to acquire the DFSInputStream lock(read() is synchronized for > DFSInputStream). For latency sensitive applications, such as hbase, this will > result in latency pain point when they under massive race conditions. So here > we propose that we should loop internally in read() to do best effort read. > In the current implementation of pread(read(position, buffer, offset, > lenght)), it does loop internally to do best effort read. So we can refactor > to support this on normal read. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6596) Improve InputStream when read spans two blocks
Zesheng Wu created HDFS-6596: Summary: Improve InputStream when read spans two blocks Key: HDFS-6596 URL: https://issues.apache.org/jira/browse/HDFS-6596 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu In the current implementation of DFSInputStream, read(buffer, offset, length) is implemented as following: {code} int realLen = (int) Math.min(len, (blockEnd - pos + 1L)); if (locatedBlocks.isLastBlockComplete()) { realLen = (int) Math.min(realLen, locatedBlocks.getFileLength()); } int result = readBuffer(strategy, off, realLen, corruptedBlockMap); {code} >From the above code, we can conclude that the read will return at most >(blockEnd - pos + 1) bytes. As a result, when read spans two blocks, the >caller must call read() second time to complete the request, and must wait >second time to acquire the DFSInputStream lock(read() is synchronized for >DFSInputStream). For latency sensitive applications, such as hbase, this will >result in latency pain point when they under massive race conditions. So here >we propose that we should loop internally in read() to do best effort read. In the current implementation of pread(read(position, buffer, offset, lenght)), it does loop internally to do best effort read. So we can refactor to support this on normal read. -- This message was sent by Atlassian JIRA (v6.2#6252)
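The original proposal above is to make the normal read() loop internally the way pread already does. A hedged sketch of that loop shape (the {{SingleBlockReader}} interface is a hypothetical stand-in for the existing single-block read path, not real HDFS code):
{code}
import java.io.IOException;

// Sketch of the proposed behavior, not an actual patch. SingleBlockReader stands in
// for the existing read path, which returns at most (blockEnd - pos + 1) bytes per call.
public final class BestEffortRead {
  public interface SingleBlockReader {
    int read(byte[] buf, int off, int len) throws IOException;
  }

  public static int read(SingleBlockReader reader, byte[] buf, int off, int len)
      throws IOException {
    if (len == 0) {
      return 0;
    }
    int total = 0;
    while (total < len) {
      int n = reader.read(buf, off + total, len - total);
      if (n <= 0) {
        break;                     // EOF or no progress
      }
      total += n;                  // keep looping so one call can cross a block boundary
    }
    return total == 0 ? -1 : total;
  }

  private BestEffortRead() {}
}
{code}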
[jira] [Updated] (HDFS-6525) FsShell supports HDFS TTL
[ https://issues.apache.org/jira/browse/HDFS-6525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6525: - Attachment: HDFS-6525.2.patch Updated to address [~daryn]'s comments. BTW: Daryn, the path is only printed when the ttl is inherited, so I will keep the format 'ttl : [path]". > FsShell supports HDFS TTL > - > > Key: HDFS-6525 > URL: https://issues.apache.org/jira/browse/HDFS-6525 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, tools >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6525.1.patch, HDFS-6525.2.patch > > > This issue is used to track development of supporting HDFS TTL for FsShell, > for details see HDFS-6382. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6525) FsShell supports HDFS TTL
[ https://issues.apache.org/jira/browse/HDFS-6525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041572#comment-14041572 ] Zesheng Wu commented on HDFS-6525: -- Thanks [~daryn], I will update the patch to address your comments immediately. > FsShell supports HDFS TTL > - > > Key: HDFS-6525 > URL: https://issues.apache.org/jira/browse/HDFS-6525 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, tools >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6525.1.patch > > > This issue is used to track development of supporting HDFS TTL for FsShell, > for details see HDFS-6382. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040683#comment-14040683 ] Zesheng Wu commented on HDFS-6382: -- Hi guys, I've uploaded initial implementations on HDFS-6525 and HDFS-6526 separately; I hope you can take a look, and any comments will be appreciated. Thanks in advance. > HDFS File/Directory TTL > --- > > Key: HDFS-6382 > URL: https://issues.apache.org/jira/browse/HDFS-6382 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client, namenode >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-TTL-Design -2.pdf, HDFS-TTL-Design-3.pdf, > HDFS-TTL-Design.pdf > > > In production environment, we always have scenario like this, we want to > backup files on hdfs for some time and then hope to delete these files > automatically. For example, we keep only 1 day's logs on local disk due to > limited disk space, but we need to keep about 1 month's logs in order to > debug program bugs, so we keep all the logs on hdfs and delete logs which are > older than 1 month. This is a typical scenario of HDFS TTL. So here we > propose that hdfs can support TTL. > Following are some details of this proposal: > 1. HDFS can support TTL on a specified file or directory > 2. If a TTL is set on a file, the file will be deleted automatically after > the TTL is expired > 3. If a TTL is set on a directory, the child files and directories will be > deleted automatically after the TTL is expired > 4. The child file/directory's TTL configuration should override its parent > directory's > 5. A global configuration is needed to configure that whether the deleted > files/directories should go to the trash or not > 6. A global configuration is needed to configure that whether a directory > with TTL should be deleted when it is emptied by TTL mechanism or not. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6526) Implement HDFS TtlManager
[ https://issues.apache.org/jira/browse/HDFS-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6526: - Target Version/s: 2.5.0 Status: Patch Available (was: Open) > Implement HDFS TtlManager > - > > Key: HDFS-6526 > URL: https://issues.apache.org/jira/browse/HDFS-6526 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, namenode >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6526.1.patch > > > This issue is used to track development of HDFS TtlManager, for details see > HDFS-6382. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6526) Implement HDFS TtlManager
[ https://issues.apache.org/jira/browse/HDFS-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6526: - Attachment: HDFS-6526.1.patch Initial implementation. The unit test depends on HDFS-6525, so we should commit HDFS-6525 before committing this. > Implement HDFS TtlManager > - > > Key: HDFS-6526 > URL: https://issues.apache.org/jira/browse/HDFS-6526 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, namenode >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6526.1.patch > > > This issue is used to track development of HDFS TtlManager, for details see > HDFS-6382. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6525) FsShell supports HDFS TTL
[ https://issues.apache.org/jira/browse/HDFS-6525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6525: - Target Version/s: 2.5.0 Status: Patch Available (was: Open) > FsShell supports HDFS TTL > - > > Key: HDFS-6525 > URL: https://issues.apache.org/jira/browse/HDFS-6525 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, tools >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6525.1.patch > > > This issue is used to track development of supporting HDFS TTL for FsShell, > for details see HDFS-6382. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6525) FsShell supports HDFS TTL
[ https://issues.apache.org/jira/browse/HDFS-6525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6525: - Attachment: HDFS-6525.1.patch Initial implementation. > FsShell supports HDFS TTL > - > > Key: HDFS-6525 > URL: https://issues.apache.org/jira/browse/HDFS-6525 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, tools >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6525.1.patch > > > This issue is used to track development of supporting HDFS TTL for FsShell, > for details see HDFS-6382. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6382: - Attachment: HDFS-TTL-Design-3.pdf Update the document according to the implementation. > HDFS File/Directory TTL > --- > > Key: HDFS-6382 > URL: https://issues.apache.org/jira/browse/HDFS-6382 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client, namenode >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-TTL-Design -2.pdf, HDFS-TTL-Design-3.pdf, > HDFS-TTL-Design.pdf > > > In production environment, we always have scenario like this, we want to > backup files on hdfs for some time and then hope to delete these files > automatically. For example, we keep only 1 day's logs on local disk due to > limited disk space, but we need to keep about 1 month's logs in order to > debug program bugs, so we keep all the logs on hdfs and delete logs which are > older than 1 month. This is a typical scenario of HDFS TTL. So here we > propose that hdfs can support TTL. > Following are some details of this proposal: > 1. HDFS can support TTL on a specified file or directory > 2. If a TTL is set on a file, the file will be deleted automatically after > the TTL is expired > 3. If a TTL is set on a directory, the child files and directories will be > deleted automatically after the TTL is expired > 4. The child file/directory's TTL configuration should override its parent > directory's > 5. A global configuration is needed to configure that whether the deleted > files/directories should go to the trash or not > 6. A global configuration is needed to configure that whether a directory > with TTL should be deleted when it is emptied by TTL mechanism or not. -- This message was sent by Atlassian JIRA (v6.2#6252)
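The proposal above leaves the per-path TTL storage to the implementation; one plausible client-side illustration is tagging a path with an xattr (the {{user.ttl}} key and the seconds-as-string value are assumptions matching the scanner sketch earlier, not the design's actual encoding):
{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SetTtlExample {
  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    // Tag /logs with a 30-day TTL; the xattr key and encoding are illustrative assumptions.
    long thirtyDaysSeconds = 30L * 24 * 60 * 60;
    fs.setXAttr(new Path("/logs"), "user.ttl",
        Long.toString(thirtyDaysSeconds).getBytes(StandardCharsets.UTF_8));
  }
}
{code}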
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040435#comment-14040435 ] Zesheng Wu commented on HDFS-6507: -- Thanks [~vinayrpet] for reviewing the patch. Thanks all for feedback. > Improve DFSAdmin to support HA cluster better > - > > Key: HDFS-6507 > URL: https://issues.apache.org/jira/browse/HDFS-6507 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Fix For: 2.5.0 > > Attachments: HDFS-6507.1.patch, HDFS-6507.2.patch, HDFS-6507.3.patch, > HDFS-6507.4-inprogress.patch, HDFS-6507.4.patch, HDFS-6507.5.patch, > HDFS-6507.6.patch, HDFS-6507.7.patch, HDFS-6507.7.patch, HDFS-6507.8.patch > > > Currently, the commands supported in DFSAdmin can be classified into three > categories according to the protocol used: > 1. ClientProtocol > Commands in this category generally implement by calling the corresponding > function of the DFSClient class, and will call the corresponding remote > implementation function at the NN side finally. At the NN side, all these > operations are classified into five categories: UNCHECKED, READ, WRITE, > CHECKPOINT, JOURNAL. Active NN will allow all operations, and Standby NN only > allows UNCHECKED operations. In the current implementation of DFSClient, it > will connect one NN first, if the first NN is not Active and the operation is > not allowed, it will failover to the second NN. So here comes the problem, > some of the commands(setSafeMode, saveNameSpace, restoreFailedStorage, > refreshNodes, setBalancerBandwidth, metaSave) in DFSAdmin are classified as > UNCHECKED operations, and when executing these commands in the DFSAdmin > command line, they will be sent to a definite NN, no matter it is Active or > Standby. This may result in two problems: > a. If the first tried NN is standby, and the operation takes effect only on > Standby NN, which is not the expected result. > b. If the operation needs to take effect on both NN, but it takes effect on > only one NN. In the future, when there is a NN failover, there may have > problems. > Here I propose the following improvements: > a. If the command can be classified as one of READ/WRITE/CHECKPOINT/JOURNAL > operations, we should classify it clearly. > b. If the command can not be classified as one of the above four operations, > or if the command needs to take effect on both NN, we should send the request > to both Active and Standby NNs. > 2. Refresh protocols: RefreshAuthorizationPolicyProtocol, > RefreshUserMappingsProtocol, RefreshUserMappingsProtocol, > RefreshCallQueueProtocol > Commands in this category, including refreshServiceAcl, > refreshUserToGroupMapping, refreshSuperUserGroupsConfiguration and > refreshCallQueue, are implemented by creating a corresponding RPC proxy and > sending the request to remote NN. In the current implementation, these > requests will be sent to a definite NN, no matter it is Active or Standby. > Here I propose that we sent these requests to both NNs. > 3. ClientDatanodeProtocol > Commands in this category are handled correctly, no need to improve. -- This message was sent by Atlassian JIRA (v6.2#6252)
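The refresh-protocol part of the description proposes fanning each admin request out to every NameNode instead of only whichever one the proxy happens to reach first. A generic sketch of that fan-out (the {{AdminOp}} interface is an illustrative stand-in for the per-protocol refresh proxies, not the HDFS-6507 patch):
{code}
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.List;

// Illustrative fan-out: run the same admin operation against every NameNode address
// and report per-NN success/failure, instead of stopping at the first NN reached.
public final class AllNameNodesRunner {
  public interface AdminOp {
    void run(InetSocketAddress nnAddr) throws IOException;  // e.g. a refreshServiceAcl proxy call
  }

  public static int runOnAll(List<InetSocketAddress> nnAddrs, AdminOp op) {
    int failures = 0;
    for (InetSocketAddress addr : nnAddrs) {
      try {
        op.run(addr);
        System.out.println("Refresh successful for " + addr);
      } catch (IOException e) {
        failures++;
        System.err.println("Refresh failed for " + addr + ": " + e.getMessage());
      }
    }
    return failures;   // a non-zero count would let the shell exit with an error status
  }

  private AllNameNodesRunner() {}
}
{code}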
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038235#comment-14038235 ] Zesheng Wu commented on HDFS-6507: -- Hi [~vinayrpet], it seems that folks do not have other opinions. Can you help to commit the patch? > Improve DFSAdmin to support HA cluster better > - > > Key: HDFS-6507 > URL: https://issues.apache.org/jira/browse/HDFS-6507 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6507.1.patch, HDFS-6507.2.patch, HDFS-6507.3.patch, > HDFS-6507.4-inprogress.patch, HDFS-6507.4.patch, HDFS-6507.5.patch, > HDFS-6507.6.patch, HDFS-6507.7.patch, HDFS-6507.7.patch, HDFS-6507.8.patch > > > Currently, the commands supported in DFSAdmin can be classified into three > categories according to the protocol used: > 1. ClientProtocol > Commands in this category generally implement by calling the corresponding > function of the DFSClient class, and will call the corresponding remote > implementation function at the NN side finally. At the NN side, all these > operations are classified into five categories: UNCHECKED, READ, WRITE, > CHECKPOINT, JOURNAL. Active NN will allow all operations, and Standby NN only > allows UNCHECKED operations. In the current implementation of DFSClient, it > will connect one NN first, if the first NN is not Active and the operation is > not allowed, it will failover to the second NN. So here comes the problem, > some of the commands(setSafeMode, saveNameSpace, restoreFailedStorage, > refreshNodes, setBalancerBandwidth, metaSave) in DFSAdmin are classified as > UNCHECKED operations, and when executing these commands in the DFSAdmin > command line, they will be sent to a definite NN, no matter it is Active or > Standby. This may result in two problems: > a. If the first tried NN is standby, and the operation takes effect only on > Standby NN, which is not the expected result. > b. If the operation needs to take effect on both NN, but it takes effect on > only one NN. In the future, when there is a NN failover, there may have > problems. > Here I propose the following improvements: > a. If the command can be classified as one of READ/WRITE/CHECKPOINT/JOURNAL > operations, we should classify it clearly. > b. If the command can not be classified as one of the above four operations, > or if the command needs to take effect on both NN, we should send the request > to both Active and Standby NNs. > 2. Refresh protocols: RefreshAuthorizationPolicyProtocol, > RefreshUserMappingsProtocol, RefreshUserMappingsProtocol, > RefreshCallQueueProtocol > Commands in this category, including refreshServiceAcl, > refreshUserToGroupMapping, refreshSuperUserGroupsConfiguration and > refreshCallQueue, are implemented by creating a corresponding RPC proxy and > sending the request to remote NN. In the current implementation, these > requests will be sent to a definite NN, no matter it is Active or Standby. > Here I propose that we sent these requests to both NNs. > 3. ClientDatanodeProtocol > Commands in this category are handled correctly, no need to improve. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037078#comment-14037078 ] Zesheng Wu commented on HDFS-6507: -- Since it still has some potential value when used in clusters with multiple namespaces in the future, we keep it now? > Improve DFSAdmin to support HA cluster better > - > > Key: HDFS-6507 > URL: https://issues.apache.org/jira/browse/HDFS-6507 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6507.1.patch, HDFS-6507.2.patch, HDFS-6507.3.patch, > HDFS-6507.4-inprogress.patch, HDFS-6507.4.patch, HDFS-6507.5.patch, > HDFS-6507.6.patch, HDFS-6507.7.patch, HDFS-6507.7.patch, HDFS-6507.8.patch > > > Currently, the commands supported in DFSAdmin can be classified into three > categories according to the protocol used: > 1. ClientProtocol > Commands in this category generally implement by calling the corresponding > function of the DFSClient class, and will call the corresponding remote > implementation function at the NN side finally. At the NN side, all these > operations are classified into five categories: UNCHECKED, READ, WRITE, > CHECKPOINT, JOURNAL. Active NN will allow all operations, and Standby NN only > allows UNCHECKED operations. In the current implementation of DFSClient, it > will connect one NN first, if the first NN is not Active and the operation is > not allowed, it will failover to the second NN. So here comes the problem, > some of the commands(setSafeMode, saveNameSpace, restoreFailedStorage, > refreshNodes, setBalancerBandwidth, metaSave) in DFSAdmin are classified as > UNCHECKED operations, and when executing these commands in the DFSAdmin > command line, they will be sent to a definite NN, no matter it is Active or > Standby. This may result in two problems: > a. If the first tried NN is standby, and the operation takes effect only on > Standby NN, which is not the expected result. > b. If the operation needs to take effect on both NN, but it takes effect on > only one NN. In the future, when there is a NN failover, there may have > problems. > Here I propose the following improvements: > a. If the command can be classified as one of READ/WRITE/CHECKPOINT/JOURNAL > operations, we should classify it clearly. > b. If the command can not be classified as one of the above four operations, > or if the command needs to take effect on both NN, we should send the request > to both Active and Standby NNs. > 2. Refresh protocols: RefreshAuthorizationPolicyProtocol, > RefreshUserMappingsProtocol, RefreshUserMappingsProtocol, > RefreshCallQueueProtocol > Commands in this category, including refreshServiceAcl, > refreshUserToGroupMapping, refreshSuperUserGroupsConfiguration and > refreshCallQueue, are implemented by creating a corresponding RPC proxy and > sending the request to remote NN. In the current implementation, these > requests will be sent to a definite NN, no matter it is Active or Standby. > Here I propose that we sent these requests to both NNs. > 3. ClientDatanodeProtocol > Commands in this category are handled correctly, no need to improve. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037069#comment-14037069 ] Zesheng Wu commented on HDFS-6507: -- I eventually catch your point:) But I think ":8020" is still better than null, especially for cluster with multiple namespaces, how do you think? > Improve DFSAdmin to support HA cluster better > - > > Key: HDFS-6507 > URL: https://issues.apache.org/jira/browse/HDFS-6507 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6507.1.patch, HDFS-6507.2.patch, HDFS-6507.3.patch, > HDFS-6507.4-inprogress.patch, HDFS-6507.4.patch, HDFS-6507.5.patch, > HDFS-6507.6.patch, HDFS-6507.7.patch, HDFS-6507.7.patch, HDFS-6507.8.patch > > > Currently, the commands supported in DFSAdmin can be classified into three > categories according to the protocol used: > 1. ClientProtocol > Commands in this category generally implement by calling the corresponding > function of the DFSClient class, and will call the corresponding remote > implementation function at the NN side finally. At the NN side, all these > operations are classified into five categories: UNCHECKED, READ, WRITE, > CHECKPOINT, JOURNAL. Active NN will allow all operations, and Standby NN only > allows UNCHECKED operations. In the current implementation of DFSClient, it > will connect one NN first, if the first NN is not Active and the operation is > not allowed, it will failover to the second NN. So here comes the problem, > some of the commands(setSafeMode, saveNameSpace, restoreFailedStorage, > refreshNodes, setBalancerBandwidth, metaSave) in DFSAdmin are classified as > UNCHECKED operations, and when executing these commands in the DFSAdmin > command line, they will be sent to a definite NN, no matter it is Active or > Standby. This may result in two problems: > a. If the first tried NN is standby, and the operation takes effect only on > Standby NN, which is not the expected result. > b. If the operation needs to take effect on both NN, but it takes effect on > only one NN. In the future, when there is a NN failover, there may have > problems. > Here I propose the following improvements: > a. If the command can be classified as one of READ/WRITE/CHECKPOINT/JOURNAL > operations, we should classify it clearly. > b. If the command can not be classified as one of the above four operations, > or if the command needs to take effect on both NN, we should send the request > to both Active and Standby NNs. > 2. Refresh protocols: RefreshAuthorizationPolicyProtocol, > RefreshUserMappingsProtocol, RefreshUserMappingsProtocol, > RefreshCallQueueProtocol > Commands in this category, including refreshServiceAcl, > refreshUserToGroupMapping, refreshSuperUserGroupsConfiguration and > refreshCallQueue, are implemented by creating a corresponding RPC proxy and > sending the request to remote NN. In the current implementation, these > requests will be sent to a definite NN, no matter it is Active or Standby. > Here I propose that we sent these requests to both NNs. > 3. ClientDatanodeProtocol > Commands in this category are handled correctly, no need to improve. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037058#comment-14037058 ] Zesheng Wu commented on HDFS-6507: -- If we remove this, the log message will like this "Refresh xx for null success", right? > Improve DFSAdmin to support HA cluster better > - > > Key: HDFS-6507 > URL: https://issues.apache.org/jira/browse/HDFS-6507 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6507.1.patch, HDFS-6507.2.patch, HDFS-6507.3.patch, > HDFS-6507.4-inprogress.patch, HDFS-6507.4.patch, HDFS-6507.5.patch, > HDFS-6507.6.patch, HDFS-6507.7.patch, HDFS-6507.7.patch, HDFS-6507.8.patch > > > Currently, the commands supported in DFSAdmin can be classified into three > categories according to the protocol used: > 1. ClientProtocol > Commands in this category generally implement by calling the corresponding > function of the DFSClient class, and will call the corresponding remote > implementation function at the NN side finally. At the NN side, all these > operations are classified into five categories: UNCHECKED, READ, WRITE, > CHECKPOINT, JOURNAL. Active NN will allow all operations, and Standby NN only > allows UNCHECKED operations. In the current implementation of DFSClient, it > will connect one NN first, if the first NN is not Active and the operation is > not allowed, it will failover to the second NN. So here comes the problem, > some of the commands(setSafeMode, saveNameSpace, restoreFailedStorage, > refreshNodes, setBalancerBandwidth, metaSave) in DFSAdmin are classified as > UNCHECKED operations, and when executing these commands in the DFSAdmin > command line, they will be sent to a definite NN, no matter it is Active or > Standby. This may result in two problems: > a. If the first tried NN is standby, and the operation takes effect only on > Standby NN, which is not the expected result. > b. If the operation needs to take effect on both NN, but it takes effect on > only one NN. In the future, when there is a NN failover, there may have > problems. > Here I propose the following improvements: > a. If the command can be classified as one of READ/WRITE/CHECKPOINT/JOURNAL > operations, we should classify it clearly. > b. If the command can not be classified as one of the above four operations, > or if the command needs to take effect on both NN, we should send the request > to both Active and Standby NNs. > 2. Refresh protocols: RefreshAuthorizationPolicyProtocol, > RefreshUserMappingsProtocol, RefreshUserMappingsProtocol, > RefreshCallQueueProtocol > Commands in this category, including refreshServiceAcl, > refreshUserToGroupMapping, refreshSuperUserGroupsConfiguration and > refreshCallQueue, are implemented by creating a corresponding RPC proxy and > sending the request to remote NN. In the current implementation, these > requests will be sent to a definite NN, no matter it is Active or Standby. > Here I propose that we sent these requests to both NNs. > 3. ClientDatanodeProtocol > Commands in this category are handled correctly, no need to improve. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037044#comment-14037044 ] Zesheng Wu commented on HDFS-6507:
You mean that we should revert the change to ProxyAndInfo and also remove the addresses from the log messages?
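For context, the change being debated is about keeping the resolved RPC address alongside each proxy so that user-facing messages can name their target. The class below is a simplified stand-in written for this example; it is not the real {{NameNodeProxies.ProxyAndInfo}}.

{code:java}
import java.net.InetSocketAddress;

public class ProxyAndAddress<PROXYTYPE> {
  private final PROXYTYPE proxy;
  private final InetSocketAddress address; // null when only a logical URI was known

  public ProxyAndAddress(PROXYTYPE proxy, InetSocketAddress address) {
    this.proxy = proxy;
    this.address = address;
  }

  public PROXYTYPE getProxy() {
    return proxy;
  }

  /** Used only for user-facing messages such as "Refresh ... for <host:port>". */
  public String describeTarget() {
    return address == null ? "unknown address"
        : address.getHostName() + ":" + address.getPort();
  }
}
{code}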
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036720#comment-14036720 ] Zesheng Wu commented on HDFS-6507:
Thanks for the feedback, [~jingzhao]. I still think that we should send requests to all NNs. If people think some commands are heavy (as you mentioned), they can use the {{-fs}} option to specify a single NN for the operation. Since the current default behavior of these commands (setSafeMode/metaSave, etc.) doesn't meet our expectations (you also think so), we should correct it. The way you suggested is a compromise: it is friendlier for people, but not for other tools that use the DFSAdmin shell. What do you think?
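As a rough illustration of the {{-fs}} suggestion above: a tool that wraps the DFSAdmin shell can pin a command to one specific NameNode by passing the generic {{-fs}} option, as in the sketch below. The host name and port are placeholders, not real cluster values.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.tools.DFSAdmin;
import org.apache.hadoop.util.ToolRunner;

public class PinnedSafeModeCheck {
  public static void main(String[] args) throws Exception {
    // -fs is handled by the generic options parser, so only this NameNode is contacted.
    String[] dfsAdminArgs = {
        "-fs", "hdfs://nn1.example.com:8020",  // placeholder NameNode address
        "-safemode", "get"
    };
    int exitCode = ToolRunner.run(new Configuration(), new DFSAdmin(), dfsAdminArgs);
    System.exit(exitCode);
  }
}
{code}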
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034891#comment-14034891 ] Zesheng Wu commented on HDFS-6507:
The failed test is not related to this issue.
[jira] [Updated] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6507:
Attachment: HDFS-6507.8.patch
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034856#comment-14034856 ] Zesheng Wu commented on HDFS-6507:
Yes, I will figure it out immediately.
[jira] [Updated] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6507:
Attachment: HDFS-6507.7.patch
Failed tests are not related to this issue; just resubmitting the #7 patch to trigger Jenkins.
[jira] [Updated] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6507:
Attachment: HDFS-6507.7.patch
Fix broken test.
[jira] [Updated] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6507:
Attachment: HDFS-6507.6.patch
The failed test {{TestCrcCorruption}} is not related to this issue; I ran it on my local machine and it passed. The javadoc warning is also weird: I ran the mvn command on my local machine and there is no such warning. Just resubmitting the patch to trigger Jenkins.
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033725#comment-14033725 ] Zesheng Wu commented on HDFS-6507:
It seems that finalizeUpgrade() doesn't output any prompt messages. Do you mean that we should remove the printed messages?
[jira] [Updated] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6507:
Attachment: HDFS-6507.5.patch
Some minor polishes.
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033754#comment-14033754 ] Zesheng Wu commented on HDFS-6507:
Mmm, maybe in this case the user can use the {{-fs}} generic option to specify which NN to operate on?
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033738#comment-14033738 ] Zesheng Wu commented on HDFS-6507:
bq. One more query I have. Using this implementation we can execute commands successfully when all namenodes of a nameservice are up and running. But what if standby nodes are down for maintenance and these come first in the configuration...?
From my understanding, in this case we can just fail the operation, and users can retry after the standby nodes are up.
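A minimal sketch of the fail-and-retry policy described here: if any per-NameNode action fails (for example because a standby is down for maintenance), the whole command aborts so the admin can retry once every NameNode is reachable. The types and messages are illustrative only, not taken from the patch.

{code:java}
import java.io.IOException;
import java.util.List;
import java.util.concurrent.Callable;

public class FailFastPolicy {

  /** Each callable wraps one per-NameNode RPC; order follows the configuration. */
  public static void runAll(List<Callable<Void>> perNameNodeActions) throws IOException {
    int done = 0;
    for (Callable<Void> action : perNameNodeActions) {
      try {
        action.call();
        done++;
      } catch (Exception e) {
        // e.g. a standby that is down for maintenance: abort and let the admin retry later.
        throw new IOException("Failed after " + done
            + " successful NameNode call(s); retry when all NameNodes are up.", e);
      }
    }
  }
}
{code}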
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033737#comment-14033737 ] Zesheng Wu commented on HDFS-6507:
OK, let me figure these out too :)
[jira] [Updated] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6507:
Attachment: HDFS-6507.4.patch
Updated according to Vinay's comments.
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033693#comment-14033693 ] Zesheng Wu commented on HDFS-6507:
Oh, it seems that NameNode also has a {{getAddress(URI filesystemURI)}}; will this work?
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033688#comment-14033688 ] Zesheng Wu commented on HDFS-6507:
I checked the related code. ProxyAndInfo is used in 3 places: {{NameNodeProxies#createProxyWithLossyRetryHandler}}, {{NameNodeProxies#createProxy}}, {{NameNodeProxies#createProxyWithLossyRetryHandler}}. In the first place we can obtain the NN address directly, but in the last two places we cannot; we only have the NN's URI. In the non-HA case we can get the NN address via {{NameNode.getAddress(nameNodeUri)}}, but in the HA case it seems not easy. What do you think?
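One possible way to recover concrete RPC addresses from a logical HA URI is to read the standard HA configuration keys ({{dfs.ha.namenodes.<nsId>}} and {{dfs.namenode.rpc-address.<nsId>.<nnId>}}) directly, as sketched below. This is only an illustration of the idea raised in the comment, not necessarily the approach taken by the committed patch.

{code:java}
import java.net.InetSocketAddress;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.net.NetUtils;

public class HaAddressLookup {

  /** Resolves every configured NN RPC address of the nameservice named in the URI. */
  public static List<InetSocketAddress> rpcAddressesFor(URI nameNodeUri, Configuration conf) {
    String nsId = nameNodeUri.getAuthority();  // logical nameservice id, e.g. "mycluster"
    List<InetSocketAddress> addrs = new ArrayList<InetSocketAddress>();
    for (String nnId : conf.getTrimmedStringCollection("dfs.ha.namenodes." + nsId)) {
      String hostPort = conf.get("dfs.namenode.rpc-address." + nsId + "." + nnId);
      if (hostPort != null) {
        addrs.add(NetUtils.createSocketAddr(hostPort));
      }
    }
    return addrs;  // empty for a non-HA URI, where NameNode.getAddress(uri) already works
  }
}
{code}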
[jira] [Updated] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6507:
Attachment: HDFS-6507.3.patch
New patch addressed Vinay's review comments.
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033480#comment-14033480 ] Zesheng Wu commented on HDFS-6507:
Thanks [~vinayrpet] for reviewing the patch. All comments are reasonable to me; I will generate a new patch soon to address them.
[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032082#comment-14032082 ] Zesheng Wu commented on HDFS-6382: --

[~ste...@apache.org] Thanks for your feedback. We have discussed whether to use an MR job or a standalone daemon, and most people upstream have agreed that a standalone daemon is reasonable and acceptable. You can go through the earlier discussion.

[~aw] Thanks for your feedback. Your suggestion is really valuable and strengthens our confidence to implement it as a standalone daemon.

> HDFS File/Directory TTL
> Key: HDFS-6382
> URL: https://issues.apache.org/jira/browse/HDFS-6382
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client, namenode
> Affects Versions: 2.4.0
> Reporter: Zesheng Wu
> Assignee: Zesheng Wu
> Attachments: HDFS-TTL-Design -2.pdf, HDFS-TTL-Design.pdf
>
> In production environments, we often have a scenario like this: we want to back up files on HDFS for some time and then have them deleted automatically. For example, we keep only 1 day's logs on local disk due to limited disk space, but we need to keep about 1 month's logs in order to debug program bugs, so we keep all the logs on HDFS and delete logs that are older than 1 month. This is a typical scenario for HDFS TTL, so here we propose that HDFS support TTL.
> Following are some details of this proposal:
> 1. HDFS can support TTL on a specified file or directory
> 2. If a TTL is set on a file, the file will be deleted automatically after the TTL expires
> 3. If a TTL is set on a directory, the child files and directories will be deleted automatically after the TTL expires
> 4. The child file/directory's TTL configuration should override its parent directory's
> 5. A global configuration option is needed to control whether the deleted files/directories should go to the trash or not
> 6. A global configuration option is needed to control whether a directory with a TTL should be deleted when it is emptied by the TTL mechanism or not

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6507: - Attachment: HDFS-6507.2.patch

Fix broken tests.
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030507#comment-14030507 ] Zesheng Wu commented on HDFS-6507: --

Mmm, it seems that some tests failed; I will figure it out soon.
[jira] [Updated] (HDFS-6526) Implement HDFS TtlManager
[ https://issues.apache.org/jira/browse/HDFS-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6526: - Description: This issue is used to track development of HDFS TtlManager, for details see HDFS-6382. (was: This issue is used to track development of HDFS TtlManager, for details see HDFS -6382.) > Implement HDFS TtlManager > - > > Key: HDFS-6526 > URL: https://issues.apache.org/jira/browse/HDFS-6526 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, namenode >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > > This issue is used to track development of HDFS TtlManager, for details see > HDFS-6382. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030366#comment-14030366 ] Zesheng Wu commented on HDFS-6382: --

I filed two sub-tasks to track the development of this feature.
[jira] [Created] (HDFS-6526) Implement HDFS TtlManager
Zesheng Wu created HDFS-6526: Summary: Implement HDFS TtlManager Key: HDFS-6526 URL: https://issues.apache.org/jira/browse/HDFS-6526 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu This issue is used to track development of HDFS TtlManager, for details see HDFS -6382. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6525) FsShell supports HDFS TTL
Zesheng Wu created HDFS-6525: Summary: FsShell supports HDFS TTL Key: HDFS-6525 URL: https://issues.apache.org/jira/browse/HDFS-6525 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, tools Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu This issue is used to track development of supporting HDFS TTL for FsShell, for details see HDFS-6382. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6507: - Tags: dfsadmin Target Version/s: 3.0.0, 2.5.0 (was: 3.0.0) Status: Patch Available (was: Open)
[jira] [Updated] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6507: - Attachment: HDFS-6507.1.patch

Attached initial version of the implementation.
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029025#comment-14029025 ] Zesheng Wu commented on HDFS-6507: --

OK, I'll go ahead with the implementation.
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028986#comment-14028986 ] Zesheng Wu commented on HDFS-6507: --

[~vinayrpet], Thanks for your feedback.

bq. Only worry is, if commands pass on one namenode and fails on another namenode, how to handle the failures.? whether just log the errors or rollback?

From what I know so far, all the operations mentioned in the description are idempotent, so when a command fails on one (or some) of the NNs, users can simply retry the command. What we need to do is report the error clearly to users, either through logs or exit codes. Rollback is too complex.

bq. If manual operations required before command execution, then its user's responsibility to make sure that configurations updated at both namenodes before command execution.

The manual operations required before command execution can be done by tools in batches. For example, if users want to do refreshNodes, they can upload the new excludes/includes files to all of the NNs without caring which NN is Active or Standby. After uploading the files, they just run the refreshNodes command; that's enough.

bq. Only Active leave safemode, which on failover again puts the cluster into safemode.

If we decide to send setSafeMode to both NNs, we should clearly let the user know whether any of them failed. If there is a failure, users should retry until it succeeds on all NNs.
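The failure-handling behavior discussed above (send the command to every NN, report per-NN errors, rely on idempotence for retries instead of rollback) could look roughly like the following sketch. The {{AdminCall}} interface and the address list are hypothetical stand-ins, not part of the attached patches; the point is only the per-NN error reporting and the non-zero exit code when any NN fails.

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

public class FanOutAdmin {

  /** One admin operation (e.g. refreshNodes, setBalancerBandwidth) against one NN. */
  interface AdminCall {
    void run(String nnAddress) throws Exception;
  }

  /** Runs the call against every NN and returns a non-zero exit code if any failed. */
  static int runOnAllNameNodes(Iterable<String> nnAddresses, AdminCall call) {
    Map<String, Exception> failures = new LinkedHashMap<>();
    for (String nn : nnAddresses) {
      try {
        call.run(nn);
        System.out.println("Succeeded on " + nn);
      } catch (Exception e) {
        failures.put(nn, e);
        System.err.println("Failed on " + nn + ": " + e.getMessage());
      }
    }
    // No rollback: the operations are assumed idempotent, so the user simply
    // re-runs the command until it succeeds on every NN.
    return failures.isEmpty() ? 0 : 1;
  }

  public static void main(String[] args) {
    int exit = runOnAllNameNodes(
        java.util.Arrays.asList("nn1:8020", "nn2:8020"),
        nn -> System.out.println("pretending to refreshNodes on " + nn));
    System.exit(exit);
  }
}
{code}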
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028870#comment-14028870 ] Zesheng Wu commented on HDFS-6507: --

Fix one mistake:

bq. The existing refresh commands do not accept a host:port parameter.

As [~jingzhao] mentioned, the generic option {{-fs}} can specify host:port for those commands.

To summarize, the main purposes of this jira are listed below:
# Make the default behavior correct and natural:
a. Commands which should take effect on the ANN will take effect on the ANN by default;
b. Commands which should take effect on both the ANN and the SNN will take effect on both by default.
# Improve usability:
a. Users do not need to care which NN is active or what the NN's host:port is; they just run the command;
b. Commands that should take effect on both the ANN and the SNN need not be run twice with the respective host:port.
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028843#comment-14028843 ] Zesheng Wu commented on HDFS-6507: --

[~jingzhao], Thanks for the feedback. I've checked HDFS-5147. I think {{-fs}} can only solve part of the problems, not all of them. This jira is intended to solve the problem thoroughly.
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028814#comment-14028814 ] Zesheng Wu commented on HDFS-6507: --

bq. I don't see a significant advantage of having a --all or similar flag to the refresh commands, as replacing configs is still manual. Do the existing refresh commands accept a host:port parameter? If not we can probably add one.

The existing refresh commands do not accept a host:port parameter. I do not propose to add a {{--all}} or similar flag, but rather to send the refresh commands to all NNs. This modification is transparent to current usage; it is just an improvement.
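For illustration only, sending a refresh to all NNs first requires discovering every NN's RPC address. The sketch below reads the standard {{dfs.nameservices}}, {{dfs.ha.namenodes.*}} and {{dfs.namenode.rpc-address.*}} keys directly; the actual DFSAdmin change may instead reuse existing helper utilities, so treat this as an assumption-laden sketch rather than the patch's approach.

{code:java}
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;

public class ListAllNameNodes {

  /** Collects the RPC address of every NN declared in an HA configuration. */
  static List<String> nnRpcAddresses(Configuration conf) {
    List<String> addrs = new ArrayList<>();
    for (String ns : conf.getTrimmedStrings("dfs.nameservices")) {
      for (String nnId : conf.getTrimmedStrings("dfs.ha.namenodes." + ns)) {
        String addr = conf.get("dfs.namenode.rpc-address." + ns + "." + nnId);
        if (addr != null) {
          addrs.add(addr);
        }
      }
    }
    return addrs;
  }

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.set("dfs.nameservices", "mycluster");
    conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
    conf.set("dfs.namenode.rpc-address.mycluster.nn1", "host1:8020");
    conf.set("dfs.namenode.rpc-address.mycluster.nn2", "host2:8020");
    // A refresh command would then create one RPC proxy per address and call
    // e.g. refreshServiceAcl() on each, reporting per-NN results to the user.
    System.out.println(nnRpcAddresses(conf));
  }
}
{code}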
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028777#comment-14028777 ] Zesheng Wu commented on HDFS-6507: --

[~chrili], Thanks for your feedback. Waiting for [~wheat9] and [~arpitagarwal]'s opinions. If other folks have any suggestions, please feel free to give feedback :)
[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028768#comment-14028768 ] Zesheng Wu commented on HDFS-6382: --

[~szetszwo], Thanks for your valuable suggestions.

bq. Using xattrs for TTL is a good idea. Do we really need ttl in milliseconds? Do you think that the daemon could guarantee such accuracy? We don't want to waste namenode memory space to store trailing zeros/digits for each ttl. How about supporting symbolic ttl notation, e.g. 10h, 5d?

Yes, I agree with you that the daemon can't guarantee millisecond accuracy, and in fact there's no need to guarantee such accuracy. As you suggested, we can use encoded bytes to save the NN's memory.

bq. The name "Supervisor" sounds too general. How about calling it "TtlManager" for the moment? If there are more new features added to the tool, we may change the name later.

OK, "TtlManager" is more suitable for the moment.

bq. For setting ttl on a directory foo, write permission permission on the parent directory of foo is not enough. Namenode also checks rwx for all subdirectories of foo for recursive delete.

Nice catch. If we want to conform to the delete semantics mentioned by Colin, we should check the subdirectories recursively.

bq. BTW, permission could be changed from time to time. A user may be able to delete a file/dir at the time of setting TTL but the same user may not have permission to delete the same file/dir when the ttl expires.

The deletion work will be done by a super user (which the "TtlManager" runs as), so this seems not to be a problem?

bq. I suggest not to check additional permission requirement on setting ttl but run as the particular user when deleting the file. Then we need to add username to the ttl xattr.

Good point, but adding the username to the ttl xattr requires more NN memory; we should weigh whether it's worth doing.
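The symbolic-notation and encoded-bytes idea discussed above could look roughly like the following client-side sketch. The xattr name {{user.ttl}}, the seconds-granularity unit and the fixed 8-byte encoding are assumptions made for illustration; they are not the design's final choice.

{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SetTtl {

  /** Parses "30m", "10h", "5d" (or plain seconds) into seconds. */
  static long parseTtlSeconds(String ttl) {
    char unit = ttl.charAt(ttl.length() - 1);
    if (Character.isDigit(unit)) {
      return Long.parseLong(ttl);
    }
    long value = Long.parseLong(ttl.substring(0, ttl.length() - 1));
    switch (unit) {
      case 'm': return value * 60;
      case 'h': return value * 3600;
      case 'd': return value * 86400;
      default: throw new IllegalArgumentException("Unknown TTL unit: " + unit);
    }
  }

  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path target = new Path("/backup/logs/2014-06-10"); // example path

    long ttlSeconds = parseTtlSeconds("30d");
    // Store 8 raw bytes instead of a human-readable string,
    // to keep the NameNode's memory footprint small.
    byte[] encoded = ByteBuffer.allocate(Long.BYTES).putLong(ttlSeconds).array();
    fs.setXAttr(target, "user.ttl", encoded);
  }
}
{code}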
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028748#comment-14028748 ] Zesheng Wu commented on HDFS-6507: --

Since [~arpitagarwal] and [~chrili] are leading the development of HADOOP-10376, I would like to ask for your ideas about (2). Any suggestions are welcome and will be greatly appreciated. Thanks in advance.
[jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028732#comment-14028732 ] Zesheng Wu commented on HDFS-6507: --

HADOOP-10376 implements a generic refresh command, {{-refresh \[arg1..argn\]}}, and users must specify the {{host:ipc_port}} when they run it. From the functionality point of view, that's enough; but from the usability point of view, refreshing both NNs transparently is friendlier for users. Because HADOOP-10376 just supplies a new way of refreshing, and the old way still exists and will exist for a long time for backward compatibility, (2) is still worth doing. What do you think, [~wheat9]?
[jira] [Commented] (HDFS-6503) Fix typo of DFSAdmin restoreFailedStorage
[ https://issues.apache.org/jira/browse/HDFS-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028646#comment-14028646 ] Zesheng Wu commented on HDFS-6503: -- [~wheat9] Thank you for reviewing the patch:) > Fix typo of DFSAdmin restoreFailedStorage > - > > Key: HDFS-6503 > URL: https://issues.apache.org/jira/browse/HDFS-6503 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu >Priority: Minor > Fix For: 2.5.0 > > Attachments: HDFS-6503.patch > > > Fix typo: restoreFaileStorage should be restoreFailedStorage -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6382: - Attachment: HDFS-TTL-Design -2.pdf

Updated the documents to address Colin's suggestions. Thanks Colin for your valuable suggestions:)
[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14027325#comment-14027325 ] Zesheng Wu commented on HDFS-6382: --

bq. Even if it's not implemented at first, we should think about the configuration required here. I think we want the ability to email the admins when things go wrong. Possibly the notifier could be pluggable or have several policies. There was nothing in the doc about configuration in general, which I think we need to fix. For example, how is rate limiting configurable? How do we notify admins that the rate is too slow to finish in the time given?

OK, I will update the document and post a new version soon.

bq. You can't delete a file in HDFS unless you have write permission on the containing directory. Whether you have write permission on the file itself is not relevant. So I would expect the same semantics here (probably enforced by setfacl itself).

That's reasonable; I'll spell it out clearly in the document.
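For illustration, the TtlManager-style scan implied by the discussion above could be sketched as below. The {{user.ttl}} xattr name, the 8-byte seconds encoding and the trash flag are assumptions carried over from the earlier sketch; rate limiting, recursive permission checks and admin notification are deliberately omitted.

{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Trash;

public class TtlScan {

  /** Walks a subtree and removes entries whose mtime + TTL has passed. */
  static void scan(FileSystem fs, Path dir, boolean useTrash, Configuration conf)
      throws IOException {
    for (FileStatus status : fs.listStatus(dir)) {
      Map<String, byte[]> xattrs = fs.getXAttrs(status.getPath());
      byte[] raw = xattrs.get("user.ttl"); // assumed 8-byte TTL in seconds
      if (raw != null && raw.length == Long.BYTES) {
        long ttlMillis = ByteBuffer.wrap(raw).getLong() * 1000L;
        boolean expired =
            status.getModificationTime() + ttlMillis < System.currentTimeMillis();
        if (expired) {
          if (useTrash) {
            // Global config decides whether expired paths go to the trash.
            Trash.moveToAppropriateTrash(fs, status.getPath(), conf);
          } else {
            fs.delete(status.getPath(), true);
          }
          continue;
        }
      }
      if (status.isDirectory()) {
        // Descend so that children carrying their own TTL are handled too.
        scan(fs, status.getPath(), useTrash, conf);
      }
    }
  }
}
{code}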
[jira] [Updated] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6507: - Description: Currently, the commands supported in DFSAdmin can be classified into three categories according to the protocol used: 1. ClientProtocol Commands in this category are generally implemented by calling the corresponding function of the DFSClient class, which in turn calls the corresponding remote implementation on the NN side. At the NN side, all these operations are classified into five categories: UNCHECKED, READ, WRITE, CHECKPOINT, JOURNAL. The Active NN allows all operations, while the Standby NN only allows UNCHECKED operations. In the current implementation, DFSClient connects to one NN first; if that NN is not Active and the operation is not allowed, it fails over to the second NN. The problem is that some of the commands (setSafeMode, saveNameSpace, restoreFailedStorage, refreshNodes, setBalancerBandwidth, metaSave) in DFSAdmin are classified as UNCHECKED operations, so when these commands are executed from the DFSAdmin command line they are sent to a fixed NN, regardless of whether it is Active or Standby. This may result in two problems: a. If the first NN tried is the Standby, the operation takes effect only on the Standby NN, which is not the expected result. b. If the operation needs to take effect on both NNs, it takes effect on only one; when an NN failover happens later, this may cause problems. Here I propose the following improvements: a. If a command can be classified as a READ/WRITE/CHECKPOINT/JOURNAL operation, we should classify it explicitly. b. If a command cannot be classified as one of the above four operations, or if it needs to take effect on both NNs, we should send the request to both the Active and Standby NNs. 2. Refresh protocols: RefreshAuthorizationPolicyProtocol, RefreshUserMappingsProtocol, RefreshUserMappingsProtocol, RefreshCallQueueProtocol Commands in this category, including refreshServiceAcl, refreshUserToGroupMapping, refreshSuperUserGroupsConfiguration and refreshCallQueue, are implemented by creating a corresponding RPC proxy and sending the request to the remote NN. In the current implementation, these requests are sent to a fixed NN, regardless of whether it is Active or Standby. Here I propose that we send these requests to both NNs. 3. ClientDatanodeProtocol Commands in this category are handled correctly; no improvement is needed. was: Currently, the commands supported in DFSAdmin can be classified into three categories according to the protocol used: # ClientProtocol Commands in this category are generally implemented by calling the corresponding function of the DFSClient class, which in turn calls the corresponding remote implementation on the NN side. At the NN side, all these operations are classified into five categories: UNCHECKED, READ, WRITE, CHECKPOINT, JOURNAL. The Active NN allows all operations, while the Standby NN only allows UNCHECKED operations. In the current implementation, DFSClient connects to one NN first; if that NN is not Active and the operation is not allowed, it fails over to the second NN. The problem is that some of the commands (setSafeMode, saveNameSpace, restoreFailedStorage, refreshNodes, setBalancerBandwidth, metaSave) in DFSAdmin are classified as UNCHECKED operations, so when these commands are executed from the DFSAdmin command line they are sent to a fixed NN, regardless of whether it is Active or Standby.
This may result in two problems: #* If the first NN tried is the Standby, the operation takes effect only on the Standby NN, which is not the expected result. #* If the operation needs to take effect on both NNs, it takes effect on only one; when an NN failover happens later, this may cause problems. Here I propose the following improvements: a. If a command can be classified as a READ/WRITE/CHECKPOINT/JOURNAL operation, we should classify it explicitly. b. If a command cannot be classified as one of the above four operations, or if it needs to take effect on both NNs, we should send the request to both the Active and Standby NNs. # Refresh protocols: RefreshAuthorizationPolicyProtocol, RefreshUserMappingsProtocol, RefreshUserMappingsProtocol, RefreshCallQueueProtocol Commands in this category, including refreshServiceAcl, refreshUserToGroupMapping, refreshSuperUserGroupsConfiguration and refreshCallQueue, are implemented by creating a corresponding RPC proxy and sending the request to the remote NN. In the current implementation, these requests are sent to a fixed NN, regardless of whether it is Active or Standby. Here I propose that we send these requests to both NNs. # ClientDatanodeProtocol Commands in this category are handled correctly; no improvement is needed. > Improve DFSAdmin to support HA cluster better >
[jira] [Updated] (HDFS-6507) Improve DFSAdmin to support HA cluster better
[ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6507: - Description: Currently, the commands supported in DFSAdmin can be classified into three categories according to the protocol used: # ClientProtocol Commands in this category are generally implemented by calling the corresponding function of the DFSClient class, which in turn calls the corresponding remote implementation on the NN side. At the NN side, all these operations are classified into five categories: UNCHECKED, READ, WRITE, CHECKPOINT, JOURNAL. The Active NN allows all operations, while the Standby NN only allows UNCHECKED operations. In the current implementation, DFSClient connects to one NN first; if that NN is not Active and the operation is not allowed, it fails over to the second NN. The problem is that some of the commands (setSafeMode, saveNameSpace, restoreFailedStorage, refreshNodes, setBalancerBandwidth, metaSave) in DFSAdmin are classified as UNCHECKED operations, so when these commands are executed from the DFSAdmin command line they are sent to a fixed NN, regardless of whether it is Active or Standby. This may result in two problems: #* If the first NN tried is the Standby, the operation takes effect only on the Standby NN, which is not the expected result. #* If the operation needs to take effect on both NNs, it takes effect on only one; when an NN failover happens later, this may cause problems. Here I propose the following improvements: a. If a command can be classified as a READ/WRITE/CHECKPOINT/JOURNAL operation, we should classify it explicitly. b. If a command cannot be classified as one of the above four operations, or if it needs to take effect on both NNs, we should send the request to both the Active and Standby NNs. # Refresh protocols: RefreshAuthorizationPolicyProtocol, RefreshUserMappingsProtocol, RefreshUserMappingsProtocol, RefreshCallQueueProtocol Commands in this category, including refreshServiceAcl, refreshUserToGroupMapping, refreshSuperUserGroupsConfiguration and refreshCallQueue, are implemented by creating a corresponding RPC proxy and sending the request to the remote NN. In the current implementation, these requests are sent to a fixed NN, regardless of whether it is Active or Standby. Here I propose that we send these requests to both NNs. # ClientDatanodeProtocol Commands in this category are handled correctly; no improvement is needed. was: Currently, the commands supported in DFSAdmin can be classified into three categories according to the protocol used: 1. ClientProtocol Commands in this category are generally implemented by calling the corresponding function of the DFSClient class, which in turn calls the corresponding remote implementation on the NN side. At the NN side, all these operations are classified into five categories: UNCHECKED, READ, WRITE, CHECKPOINT, JOURNAL. The Active NN allows all operations, while the Standby NN only allows UNCHECKED operations. In the current implementation, DFSClient connects to one NN first; if that NN is not Active and the operation is not allowed, it fails over to the second NN. The problem is that some of the commands (setSafeMode, saveNameSpace, restoreFailedStorage, refreshNodes, setBalancerBandwidth, metaSave) in DFSAdmin are classified as UNCHECKED operations, so when these commands are executed from the DFSAdmin command line they are sent to a fixed NN, regardless of whether it is Active or Standby.
This may result in two problems: a. If the first NN tried is the Standby, the operation takes effect only on the Standby NN, which is not the expected result. b. If the operation needs to take effect on both NNs, it takes effect on only one; when an NN failover happens later, this may cause problems. Here I propose the following improvements: a. If a command can be classified as a READ/WRITE/CHECKPOINT/JOURNAL operation, we should classify it explicitly. b. If a command cannot be classified as one of the above four operations, or if it needs to take effect on both NNs, we should send the request to both the Active and Standby NNs. 2. Refresh protocols: RefreshAuthorizationPolicyProtocol, RefreshUserMappingsProtocol, RefreshUserMappingsProtocol, RefreshCallQueueProtocol Commands in this category, including refreshServiceAcl, refreshUserToGroupMapping, refreshSuperUserGroupsConfiguration and refreshCallQueue, are implemented by creating a corresponding RPC proxy and sending the request to the remote NN. In the current implementation, these requests are sent to a fixed NN, regardless of whether it is Active or Standby. Here I propose that we send these requests to both NNs. 3. ClientDatanodeProtocol Commands in this category are handled correctly; no improvement is needed. > Improve DFSAdmin to support HA cluster better > ---
[jira] [Created] (HDFS-6507) Improve DFSAdmin to support HA cluster better
Zesheng Wu created HDFS-6507: Summary: Improve DFSAdmin to support HA cluster better Key: HDFS-6507 URL: https://issues.apache.org/jira/browse/HDFS-6507 Project: Hadoop HDFS Issue Type: Improvement Components: tools Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu Currently, the commands supported in DFSAdmin can be classified into three categories according to the protocol used: 1. ClientProtocol Commands in this category are generally implemented by calling the corresponding function of the DFSClient class, which in turn calls the corresponding remote implementation on the NN side. At the NN side, all these operations are classified into five categories: UNCHECKED, READ, WRITE, CHECKPOINT, JOURNAL. The Active NN allows all operations, while the Standby NN only allows UNCHECKED operations. In the current implementation, DFSClient connects to one NN first; if that NN is not Active and the operation is not allowed, it fails over to the second NN. The problem is that some of the commands (setSafeMode, saveNameSpace, restoreFailedStorage, refreshNodes, setBalancerBandwidth, metaSave) in DFSAdmin are classified as UNCHECKED operations, so when these commands are executed from the DFSAdmin command line they are sent to a fixed NN, regardless of whether it is Active or Standby. This may result in two problems: a. If the first NN tried is the Standby, the operation takes effect only on the Standby NN, which is not the expected result. b. If the operation needs to take effect on both NNs, it takes effect on only one; when an NN failover happens later, this may cause problems. Here I propose the following improvements: a. If a command can be classified as a READ/WRITE/CHECKPOINT/JOURNAL operation, we should classify it explicitly. b. If a command cannot be classified as one of the above four operations, or if it needs to take effect on both NNs, we should send the request to both the Active and Standby NNs. 2. Refresh protocols: RefreshAuthorizationPolicyProtocol, RefreshUserMappingsProtocol, RefreshUserMappingsProtocol, RefreshCallQueueProtocol Commands in this category, including refreshServiceAcl, refreshUserToGroupMapping, refreshSuperUserGroupsConfiguration and refreshCallQueue, are implemented by creating a corresponding RPC proxy and sending the request to the remote NN. In the current implementation, these requests are sent to a fixed NN, regardless of whether it is Active or Standby. Here I propose that we send these requests to both NNs. 3. ClientDatanodeProtocol Commands in this category are handled correctly; no improvement is needed. -- This message was sent by Atlassian JIRA (v6.2#6252)
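As a rough illustration of the "send the request to both the Active and Standby NNs" part of proposal 2, discovering every configured NN of an HA nameservice from the standard dfs.ha.namenodes.* and dfs.namenode.rpc-address.* keys could look like the sketch below. This is only an illustration, not code from any attached patch: the nameservice id mycluster is an assumed example, and the refresh call itself is reduced to a placeholder println where a real implementation would create the appropriate RPC proxy (e.g. for RefreshUserMappingsProtocol) per address and invoke the refresh method on it.
{code:java}
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class RefreshAllNameNodes {
  // Collect the RPC addresses of every NameNode configured for one HA
  // nameservice, so an admin request can be issued to each of them rather
  // than to whichever NN the client happens to reach first.
  static List<String> nnRpcAddresses(Configuration conf, String nameservice) {
    List<String> addrs = new ArrayList<>();
    for (String nnId : conf.getTrimmedStringCollection("dfs.ha.namenodes." + nameservice)) {
      String addr = conf.get("dfs.namenode.rpc-address." + nameservice + "." + nnId);
      if (addr != null) {
        addrs.add(addr);
      }
    }
    return addrs;
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new HdfsConfiguration();
    for (String addr : nnRpcAddresses(conf, "mycluster")) {
      // Placeholder: a real patch would build the refresh-protocol proxy for
      // this address and call it, instead of just printing the target.
      System.out.println("Would send refresh request to NN at " + addr);
    }
  }
}
{code}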
[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026047#comment-14026047 ] Zesheng Wu commented on HDFS-6382: -- Thanks [~cmccabe] for your feedback. bq. For the MR strategy, it seems like this could be parallelized fairly easily. For example, if you have 5 MR tasks, you can calculate the hash of each path, and then task 1 can do all the paths that are 0 mod 5, task 2 can do all the paths that are 1 mod 5, and so forth. MR also doesn't introduce extra dependencies since HDFS and MR are packaged together. Do you mean that we scan the whole namespace first and then split it into 5 pieces according to the hash of each path? If so, why don't we just complete the work during the first scanning pass? If I have misunderstood your meaning, please point it out. bq. I don't understand what you mean by "the mapreduce strategy will have additional overheads." What overheads are you foreseeing? Possible overheads: starting a MapReduce job needs to split the input, start an AppMaster, and collect results from arbitrary machines (perhaps 'overheads' is not the right word here). bq. I don't understand what you mean by this. What will be done automatically? Here "automatically" means we do not have to rely on external tools; the daemon itself can manage the work well. bq. How are you going to implement HA for the standalone daemon? Good point. As you suggested, one approach is to save the state in HDFS and simply restart the daemon when it fails. But managing the state is complex, and I am considering how to simplify this. A simpler approach is to treat the daemon as stateless and simply restart it when it fails: we need no checkpoint and just rescan from the beginning on restart. Because we can require that the work the daemon does is idempotent, starting from the beginning is harmless. Possible drawbacks of the latter approach are that it may waste some time and delay the work, but these are acceptable. bq. I don't see a lot of discussion of logging and monitoring in general. How is the user going to become aware that a file was deleted because of a TTL? Or if there is an error during the delete, how will the user know? For simplicity, in the initial version we will use logs to record which files/directories are deleted by TTL, as well as errors during the deletion process. bq. Does this need to be an administrator command? It doesn't need to be an administrator command: users can only setTtl on files/directories they have write permission on, and can only getTtl on files/directories they have read permission on. > HDFS File/Directory TTL > --- > > Key: HDFS-6382 > URL: https://issues.apache.org/jira/browse/HDFS-6382 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client, namenode >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-TTL-Design.pdf > > > In production environments we often have scenarios like this: we want to > back up files on HDFS for some time and then have them deleted automatically. > For example, we keep only 1 day's logs on local disk due to limited disk > space, but we need to keep about 1 month's logs in order to debug program > bugs, so we keep all the logs on HDFS and delete logs that are older than 1 > month. This is a typical scenario for HDFS TTL. So here we propose that HDFS > support TTL. > Following are some details of this proposal: > 1. HDFS can support TTL on a specified file or directory > 2.
If a TTL is set on a file, the file will be deleted automatically after > the TTL expires > 3. If a TTL is set on a directory, the child files and directories will be > deleted automatically after the TTL expires > 4. The child file/directory's TTL configuration should override its parent > directory's > 5. A global configuration is needed to specify whether the deleted > files/directories should go to the trash or not > 6. A global configuration is needed to specify whether a directory with a > TTL should be deleted when it is emptied by the TTL mechanism or not. -- This message was sent by Atlassian JIRA (v6.2#6252)
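To make the hash-mod partitioning and the "stateless, idempotent restart" idea from the comment above concrete, here is a minimal sketch of one scanning pass. The user.ttl xattr name, the millisecond encoding, and the /backup scan root are assumptions for illustration only; none of them come from the design doc.
{code:java}
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TtlScanSketch {
  // One pass over a directory tree: delete any file whose (hypothetical)
  // "user.ttl" xattr has expired. Deleting an already-expired file twice is
  // harmless, so a restarted daemon can simply rescan from the beginning.
  static void scan(FileSystem fs, Path dir, int taskId, int numTasks) throws Exception {
    for (FileStatus st : fs.listStatus(dir)) {
      Path p = st.getPath();
      if (st.isDirectory()) {
        scan(fs, p, taskId, numTasks);
        continue;
      }
      // Hash-mod partitioning: task i only handles paths whose hash is i mod numTasks.
      if ((p.toString().hashCode() & Integer.MAX_VALUE) % numTasks != taskId) {
        continue;
      }
      Map<String, byte[]> xattrs = fs.getXAttrs(p);
      byte[] ttlBytes = xattrs.get("user.ttl");   // hypothetical xattr name
      if (ttlBytes == null) {
        continue;                                 // no TTL set on this file
      }
      long ttlMs = Long.parseLong(new String(ttlBytes, "UTF-8"));
      if (System.currentTimeMillis() - st.getModificationTime() > ttlMs) {
        fs.delete(p, false);                      // expired: remove the file
      }
    }
  }

  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    scan(fs, new Path("/backup"), 0, 1);          // single-task run over /backup
  }
}
{code}
Because the delete of an expired file is idempotent, a crashed scanner can be restarted without any checkpoint, which is exactly the trade-off (some wasted rescanning in exchange for no state management) discussed in the comment.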
[jira] [Commented] (HDFS-6503) Fix typo of DFSAdmin restoreFailedStorage
[ https://issues.apache.org/jira/browse/HDFS-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025285#comment-14025285 ] Zesheng Wu commented on HDFS-6503: -- This just fixes a typo, so no new tests are needed. > Fix typo of DFSAdmin restoreFailedStorage > - > > Key: HDFS-6503 > URL: https://issues.apache.org/jira/browse/HDFS-6503 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu >Priority: Minor > Attachments: HDFS-6503.patch > > > Fix typo: restoreFaileStorage should be restoreFailedStorage -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6503) Fix typo of DFSAdmin restoreFailedStorage
[ https://issues.apache.org/jira/browse/HDFS-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6503: - Attachment: HDFS-6503.patch > Fix typo of DFSAdmin restoreFailedStorage > - > > Key: HDFS-6503 > URL: https://issues.apache.org/jira/browse/HDFS-6503 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu >Priority: Minor > Attachments: HDFS-6503.patch > > > Fix typo: restoreFaileStorage should be restoreFailedStorage -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6503) Fix typo of DFSAdmin restoreFailedStorage
[ https://issues.apache.org/jira/browse/HDFS-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6503: - Status: Patch Available (was: Open) > Fix typo of DFSAdmin restoreFailedStorage > - > > Key: HDFS-6503 > URL: https://issues.apache.org/jira/browse/HDFS-6503 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu >Priority: Minor > Attachments: HDFS-6503.patch > > > Fix typo: restoreFaileStorage should be restoreFailedStorage -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6503) Fix typo of DFSAdmin restoreFailedStorage
Zesheng Wu created HDFS-6503: Summary: Fix typo of DFSAdmin restoreFailedStorage Key: HDFS-6503 URL: https://issues.apache.org/jira/browse/HDFS-6503 Project: Hadoop HDFS Issue Type: Improvement Components: tools Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu Priority: Minor Fix typo: restoreFaileStorage should be restoreFailedStorage -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6382: - Attachment: HDFS-TTL-Design.pdf An initial version of the design doc. > HDFS File/Directory TTL > --- > > Key: HDFS-6382 > URL: https://issues.apache.org/jira/browse/HDFS-6382 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client, namenode >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-TTL-Design.pdf > > > In production environments we often have scenarios like this: we want to > back up files on HDFS for some time and then have them deleted automatically. > For example, we keep only 1 day's logs on local disk due to limited disk > space, but we need to keep about 1 month's logs in order to debug program > bugs, so we keep all the logs on HDFS and delete logs that are older than 1 > month. This is a typical scenario for HDFS TTL. So here we propose that HDFS > support TTL. > Following are some details of this proposal: > 1. HDFS can support TTL on a specified file or directory > 2. If a TTL is set on a file, the file will be deleted automatically after > the TTL expires > 3. If a TTL is set on a directory, the child files and directories will be > deleted automatically after the TTL expires > 4. The child file/directory's TTL configuration should override its parent > directory's > 5. A global configuration is needed to specify whether the deleted > files/directories should go to the trash or not > 6. A global configuration is needed to specify whether a directory with a > TTL should be deleted when it is emptied by the TTL mechanism or not. -- This message was sent by Atlassian JIRA (v6.2#6252)
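Rule 4 above (a child's TTL configuration overrides its parent directory's) could be resolved roughly as in the following sketch: the effective TTL of a path is taken from the closest ancestor, or the path itself, that carries a TTL. Again this is only an illustration; user.ttl is an assumed xattr name, not something specified in the design doc.
{code:java}
import java.util.Map;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class EffectiveTtl {
  // Resolve the TTL that applies to a path: the nearest path component
  // (starting from the path itself and walking up to the root) carrying the
  // hypothetical "user.ttl" xattr wins, so a child's TTL overrides its
  // parent directory's. Returns -1 if no TTL is set anywhere on the path.
  static long effectiveTtl(FileSystem fs, Path path) throws Exception {
    for (Path p = path; p != null; p = p.getParent()) {
      Map<String, byte[]> xattrs = fs.getXAttrs(p);
      byte[] ttl = xattrs.get("user.ttl");
      if (ttl != null) {
        return Long.parseLong(new String(ttl, "UTF-8"));
      }
    }
    return -1L; // no TTL configured for this path or any ancestor
  }
}
{code}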
[jira] [Commented] (HDFS-6442) Fix TestEditLogAutoroll and TestStandbyCheckpoints failure caused by port conflicts
[ https://issues.apache.org/jira/browse/HDFS-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011937#comment-14011937 ] Zesheng Wu commented on HDFS-6442: -- Thanks [~arpitagarwal]. > Fix TestEditLogAutoroll and TestStandbyCheckpoints failure caused by port > conflicts > -- > > Key: HDFS-6442 > URL: https://issues.apache.org/jira/browse/HDFS-6442 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu >Priority: Minor > Fix For: 3.0.0, 2.5.0 > > Attachments: HDFS-6442.1.patch, HDFS-6442.patch > > > TestEditLogAutoroll and TestStandbyCheckpoints both use ports 10061 and 10062 > to set up the mini-cluster, which may result in occasional test failures when > running tests with -Pparallel-tests. -- This message was sent by Atlassian JIRA (v6.2#6252)
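One common way to avoid hard-coded test ports such as 10061/10062 is to let the OS hand out a currently free port and feed it into the mini-cluster configuration in the test's setup. The sketch below is only an illustration of that general technique, not the approach taken in the attached patches.
{code:java}
import java.net.ServerSocket;

public class FreePortExample {
  // Ask the OS for a free ephemeral port instead of hard-coding a port,
  // so two test JVMs started by -Pparallel-tests do not collide.
  static int pickFreePort() throws Exception {
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    }
  }

  public static void main(String[] args) throws Exception {
    // The returned port would then be used when building the mini-cluster's
    // NameNode/HTTP addresses in the test's setup method.
    System.out.println("Free port: " + pickFreePort());
  }
}
{code}
Note that this only narrows the window for conflicts: the port is released before the test reuses it, so another process could still grab it in between, which is why keeping the two test suites on disjoint port ranges is also a reasonable fix.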