[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14349867#comment-14349867 ] Liang Xie commented on HDFS-6450: - Thanks [~zhz] for the comments; I have not been active in this area recently, so please go ahead :)
Support non-positional hedged reads in HDFS --- Key: HDFS-6450 URL: https://issues.apache.org/jira/browse/HDFS-6450 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Liang Xie Attachments: HDFS-6450-like-pread.txt
HDFS-5776 added support for hedged positional reads. We should also support hedged non-positional reads (aka regular reads). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
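To make the hedged-read idea above concrete, here is a minimal sketch (not the DFSInputStream implementation; the executor, the hedge delay, and the two Callable readers are assumptions for illustration): the first replica gets a head start, and a second replica is asked only if the first has not answered within the hedge delay.
{code}
import java.util.concurrent.*;

// Illustrative sketch of a hedged read; not the actual HDFS client code.
class HedgedReadSketch {
  private final ExecutorService pool = Executors.newCachedThreadPool();

  byte[] read(Callable<byte[]> primaryReplica, Callable<byte[]> alternateReplica,
              long hedgeDelayMs) throws Exception {
    CompletionService<byte[]> cs = new ExecutorCompletionService<>(pool);
    cs.submit(primaryReplica);                      // read from the first chosen datanode
    Future<byte[]> first = cs.poll(hedgeDelayMs, TimeUnit.MILLISECONDS);
    if (first != null) {
      return first.get();                           // fast path: primary answered in time
    }
    cs.submit(alternateReplica);                    // hedge: ask a second replica
    // A real implementation would also cancel the losing read once one completes.
    return cs.take().get();                         // use whichever replica finishes first
  }
}
{code}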
[jira] [Created] (HDFS-7802) Reach xceiver limit once the watcherThread dies
Liang Xie created HDFS-7802: --- Summary: Reach xceiver limit once the watcherThread dies Key: HDFS-7802 URL: https://issues.apache.org/jira/browse/HDFS-7802 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Critical
Our production cluster hit the xceiver limit even with HADOOP-10404 and HADOOP-11333 applied; I found it was caused by the DomainSocketWatcher.watcherThread having died. Attached is a possible fix, please review, thanks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-7802) Reach xceiver limit once the watcherThread dies
[ https://issues.apache.org/jira/browse/HDFS-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie resolved HDFS-7802. - Resolution: Duplicate Sorry, closing this JIRA; it seems it belongs in the HADOOP category rather than HDFS...
Reach xceiver limit once the watcherThread dies -- Key: HDFS-7802 URL: https://issues.apache.org/jira/browse/HDFS-7802 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Critical
Our production cluster hit the xceiver limit even with HADOOP-10404 and HADOOP-11333 applied; I found it was caused by the DomainSocketWatcher.watcherThread having died. Attached is a possible fix, please review, thanks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
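As background for why a dead watcher thread leads to the symptom above, one generic guard (purely illustrative, not the HADOOP-side fix this issue was closed as a duplicate of) is to give such a critical thread an uncaught-exception handler that fails the process fast instead of letting it limp along and leak xceivers:
{code}
// Illustrative only: ensure a critical watcher-style thread cannot die silently.
public class WatcherGuardSketch {
  public static void main(String[] args) throws InterruptedException {
    Thread watcher = new Thread(() -> {
      // ... poll domain sockets and dispatch events (elided) ...
      throw new IllegalStateException("simulated watcher failure");
    }, "watcherThread");
    watcher.setDaemon(true);
    watcher.setUncaughtExceptionHandler((t, e) -> {
      // Once this thread is gone the process cannot make progress, so fail fast
      // and let a supervisor restart it rather than hit the xceiver limit later.
      System.err.println("Fatal: " + t.getName() + " died: " + e);
      Runtime.getRuntime().halt(1);
    });
    watcher.start();
    watcher.join();
  }
}
{code}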
[jira] [Updated] (HDFS-7763) fix zkfc hang issue due to not catching exception in a corner case
[ https://issues.apache.org/jira/browse/HDFS-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-7763: Attachment: HDFS-7763-002.txt How about v2, Colin?
fix zkfc hang issue due to not catching exception in a corner case -- Key: HDFS-7763 URL: https://issues.apache.org/jira/browse/HDFS-7763 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.6.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-7763-001.txt, HDFS-7763-002.txt, jstack.4936
In our production cluster, both of the two zkfc processes hung after a ZooKeeper network outage. The zkfc log said:
{code}
2015-02-07,17:40:11,875 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 3334ms for sessionid 0x4a61bacdd9dfb2, closing socket connection and attempting reconnect
2015-02-07,17:40:11,977 FATAL org.apache.hadoop.ha.ActiveStandbyElector: Received stat error from Zookeeper. code:CONNECTIONLOSS. Not retrying further znode monitoring connection errors.
2015-02-07,17:40:12,425 INFO org.apache.zookeeper.ZooKeeper: Session: 0x4a61bacdd9dfb2 closed
2015-02-07,17:40:12,425 FATAL org.apache.hadoop.ha.ZKFailoverController: Fatal error occurred:Received stat error from Zookeeper. code:CONNECTIONLOSS. Not retrying further znode monitoring connection errors.
2015-02-07,17:40:12,425 INFO org.apache.hadoop.ipc.Server: Stopping server on 11300
2015-02-07,17:40:12,425 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
2015-02-07,17:40:12,426 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2015-02-07,17:40:12,426 INFO org.apache.hadoop.ha.ActiveStandbyElector: Yielding from election
2015-02-07,17:40:12,426 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2015-02-07,17:40:12,426 INFO org.apache.hadoop.ha.HealthMonitor: Stopping HealthMonitor thread
2015-02-07,17:40:12,426 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 11300
{code}
The thread dump has also been uploaded as an attachment. From the dump, we can see that because of the unknown non-daemon threads (pool-*-thread-*) the process did not exit, but the critical threads, such as the health monitor and the RPC threads, had already been stopped, so our watchdog (supervisord) did not observe that the zkfc process was down or abnormal, and the subsequent namenode failover could not happen as expected. There are two possible fixes here: 1) figure out where the unnamed threads such as pool-7-thread-1 come from and close them or set their daemon property; I tried to track this down but have found nothing so far. 2) catch the exception from ZKFailoverController.run() so that we can continue on to the System.exit call; the attached patch implements 2). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
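A minimal sketch of fix option 2) described above, assuming a hypothetical wrapper around the failover controller entry point rather than the actual contents of the attached patch: any throwable escaping run() is logged and the JVM is forced down, so a watchdog such as supervisord can see the process die and restart it.
{code}
// Hypothetical wrapper illustrating option 2); not the contents of HDFS-7763-00x.txt.
public class ZkfcRunnerSketch {
  public static void main(String[] args) {
    int rc = 1;
    try {
      rc = runFailoverController(args);   // stands in for ZKFailoverController.run()
    } catch (Throwable t) {
      // Without this catch, an escaping exception leaves stray non-daemon pool threads
      // keeping the JVM alive, which is exactly the hang described in this issue.
      t.printStackTrace();
    } finally {
      System.exit(rc);                    // guarantee the watchdog sees the process exit
    }
  }

  private static int runFailoverController(String[] args) throws Exception {
    return 0;                             // placeholder for the real election loop
  }
}
{code}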
[jira] [Updated] (HDFS-7763) fix zkfc hang issue due to not catching exception in a corner case
[ https://issues.apache.org/jira/browse/HDFS-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-7763: Attachment: (was: HDFS-7763.txt)
fix zkfc hang issue due to not catching exception in a corner case -- Key: HDFS-7763 URL: https://issues.apache.org/jira/browse/HDFS-7763 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.6.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: jstack.4936
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7763) fix zkfc hang issue due to not catching exception in a corner case
[ https://issues.apache.org/jira/browse/HDFS-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-7763: Attachment: HDFS-7763-001.txt
fix zkfc hang issue due to not catching exception in a corner case -- Key: HDFS-7763 URL: https://issues.apache.org/jira/browse/HDFS-7763 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.6.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-7763-001.txt, jstack.4936
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7763) fix zkfc hang issue due to not catching exception in a corner case
[ https://issues.apache.org/jira/browse/HDFS-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315692#comment-14315692 ] Liang Xie commented on HDFS-7763: - The findbugs warning is not related to the current fix: Inconsistent synchronization of org.apache.hadoop.hdfs.server.namenode.BackupImage.namesystem; locked 60% of time
fix zkfc hang issue due to not catching exception in a corner case -- Key: HDFS-7763 URL: https://issues.apache.org/jira/browse/HDFS-7763 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.6.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-7763-001.txt, jstack.4936
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7763) fix zkfc hang issue due to not catching exception in a corner case
Liang Xie created HDFS-7763: --- Summary: fix zkfc hang issue due to not catching exception in a corner case Key: HDFS-7763 URL: https://issues.apache.org/jira/browse/HDFS-7763 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.6.0 Reporter: Liang Xie Assignee: Liang Xie
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7763) fix zkfc hang issue due to not catching exception in a corner case
[ https://issues.apache.org/jira/browse/HDFS-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-7763: Attachment: jstack.4936 HDFS-7763.txt
fix zkfc hang issue due to not catching exception in a corner case -- Key: HDFS-7763 URL: https://issues.apache.org/jira/browse/HDFS-7763 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.6.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-7763.txt, jstack.4936
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7523) Setting a socket receive buffer size in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263788#comment-14263788 ] Liang Xie commented on HDFS-7523: - Hi [~saint@gmail.com], yes, let's just hold on for now. I can make a new patch with the buffer size configurable, so that we keep the behaviour exactly the same as before and there is no big surprise for anyone who does not care about this :)
Setting a socket receive buffer size in DFSClient - Key: HDFS-7523 URL: https://issues.apache.org/jira/browse/HDFS-7523 Project: Hadoop HDFS Issue Type: Improvement Components: dfsclient Affects Versions: 2.6.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-7523-001 (1).txt, HDFS-7523-001 (1).txt, HDFS-7523-001.txt, HDFS-7523-001.txt, HDFS-7523-001.txt
It would be nice to be able to set a socket receive buffer size when creating the socket from the client (HBase) point of view. In older versions this would be in DFSInputStream; in trunk it seems it should be at:
{code}
  @Override // RemotePeerFactory
  public Peer newConnectedPeer(InetSocketAddress addr,
      Token<BlockTokenIdentifier> blockToken, DatanodeID datanodeId)
      throws IOException {
    Peer peer = null;
    boolean success = false;
    Socket sock = null;
    try {
      sock = socketFactory.createSocket();
      NetUtils.connect(sock, addr, getRandomLocalInterfaceAddr(),
          dfsClientConf.socketTimeout);
      peer = TcpPeerServer.peerFromSocketAndKey(saslClient, sock, this,
          blockToken, datanodeId);
      peer.setReadTimeout(dfsClientConf.socketTimeout);
{code}
e.g.: sock.setReceiveBufferSize(HdfsConstants.DEFAULT_DATA_SOCKET_SIZE); the default socket buffer size on Linux+JDK7 seems to be 8k if I am not wrong, and that value is sometimes too small for HBase's 64k block reads on a 10G network (at the very least it means more system calls). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7523) Setting a socket receive buffer size in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14260901#comment-14260901 ] Liang Xie commented on HDFS-7523: - Let me report numbers once I get a chance, and I think we also need to make the buffer size configurable. For example, we hit a TCP incast issue in one of our Hadoop clusters where the rack switch buffer overflowed; in a case like that it would be great to have a small enough buffer size to relieve that network issue :)
Setting a socket receive buffer size in DFSClient - Key: HDFS-7523 URL: https://issues.apache.org/jira/browse/HDFS-7523 Project: Hadoop HDFS Issue Type: Improvement Components: dfsclient Affects Versions: 2.6.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-7523-001 (1).txt, HDFS-7523-001 (1).txt, HDFS-7523-001.txt, HDFS-7523-001.txt, HDFS-7523-001.txt
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
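A small sketch of what the configurable receive buffer discussed in the two comments above could look like. The config key name and the zero-means-OS-default convention are assumptions for illustration, not the committed patch: leaving the value at 0 keeps today's behaviour, while a small value could help in the incast scenario and a large one helps 64k reads on a 10G network.
{code}
import java.net.InetSocketAddress;
import java.net.Socket;
import javax.net.SocketFactory;

// Illustrative only; the key name and default below are assumptions, not the final patch.
class ReceiveBufferSketch {
  static final String RECV_BUF_KEY = "dfs.client.socket.receive.buffer.size"; // hypothetical key
  static final int DEFAULT_RECV_BUF = 0;  // 0 = keep the OS default, i.e. no behaviour change

  static Socket connectToDatanode(SocketFactory factory, InetSocketAddress dnAddr,
                                  int configuredRecvBuf, int timeoutMs) throws Exception {
    Socket sock = factory.createSocket();
    if (configuredRecvBuf > 0) {
      // Must be set before connect() so TCP window scaling can take it into account.
      sock.setReceiveBufferSize(configuredRecvBuf);
    }
    sock.connect(dnAddr, timeoutMs);
    sock.setSoTimeout(timeoutMs);
    return sock;
  }
}
{code}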
[jira] [Updated] (HDFS-7561) TestFetchImage should write fetched-image-dir under target.
[ https://issues.apache.org/jira/browse/HDFS-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-7561: Attachment: HDFS-7561-002.txt
TestFetchImage should write fetched-image-dir under target. --- Key: HDFS-7561 URL: https://issues.apache.org/jira/browse/HDFS-7561 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Attachments: HDFS-7561-001.txt, HDFS-7561-002.txt
{{TestFetchImage}} creates the directory {{fetched-image-dir}} under hadoop-hdfs, which is then never cleaned up. The problem is that it uses the build.test.dir property, which seems to be invalid. It should probably use {{MiniDFSCluster.getBaseDirectory()}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-7561) TestFetchImage should write fetched-image-dir under target.
[ https://issues.apache.org/jira/browse/HDFS-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie reassigned HDFS-7561: --- Assignee: Liang Xie
TestFetchImage should write fetched-image-dir under target. --- Key: HDFS-7561 URL: https://issues.apache.org/jira/browse/HDFS-7561 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Assignee: Liang Xie Attachments: HDFS-7561-001.txt, HDFS-7561-002.txt
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7561) TestFetchImage should write fetched-image-dir under target.
[ https://issues.apache.org/jira/browse/HDFS-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-7561: Attachment: HDFS-7561-001.txt Yes, the property should be test.build.dir instead of build.test.dir :) How about the attached patch, [~shv]? I verified locally that the directory can now be removed successfully.
TestFetchImage should write fetched-image-dir under target. --- Key: HDFS-7561 URL: https://issues.apache.org/jira/browse/HDFS-7561 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Attachments: HDFS-7561-001.txt
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
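A minimal sketch of the property fix discussed above, assuming the test simply derives its download directory from the correct system property (the fallback path below is an assumption); Konstantin's suggestion of MiniDFSCluster.getBaseDirectory() would be the other way to land the files under target/.
{code}
import java.io.File;

// Sketch: resolve the fetched-image directory under the test build dir, not the module root.
class FetchImageDirSketch {
  static File fetchedImageDir() {
    // "test.build.dir" is the property named in the discussion above; "build.test.dir"
    // resolves to nothing, which is why the directory ended up under hadoop-hdfs/.
    String base = System.getProperty("test.build.dir", "target/test-dir"); // fallback assumed
    File dir = new File(base, "fetched-image-dir");
    if (!dir.mkdirs() && !dir.isDirectory()) {
      throw new IllegalStateException("could not create " + dir);
    }
    return dir;
  }
}
{code}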
[jira] [Updated] (HDFS-7561) TestFetchImage should write fetched-image-dir under target.
[ https://issues.apache.org/jira/browse/HDFS-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-7561: Status: Patch Available (was: Open)
TestFetchImage should write fetched-image-dir under target. --- Key: HDFS-7561 URL: https://issues.apache.org/jira/browse/HDFS-7561 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Attachments: HDFS-7561-001.txt
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7553) fix the TestDFSUpgradeWithHA failure due to BindException
Liang Xie created HDFS-7553: --- Summary: fix the TestDFSUpgradeWithHA failure due to BindException Key: HDFS-7553 URL: https://issues.apache.org/jira/browse/HDFS-7553 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 2.7.0 Reporter: Liang Xie Assignee: Liang Xie
see https://builds.apache.org/job/PreCommit-HDFS-Build/9092//testReport/org.apache.hadoop.hdfs.server.namenode.ha/TestDFSUpgradeWithHA/testNfsUpgrade/ :
Error Message: Port in use: localhost:57896
Stacktrace:
java.net.BindException: Port in use: localhost:57896
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:444)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:868)
at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:809)
at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:142)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:704)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:591)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:763)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:747)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1443)
at org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:1815)
at org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:1796)
at org.apache.hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA.testNfsUpgrade(TestDFSUpgradeWithHA.java:285)
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7553) fix the TestDFSUpgradeWithHA failure due to BindException
[ https://issues.apache.org/jira/browse/HDFS-7553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-7553: Attachment: HDFS-7553-001.txt After this change, we ensure that the HttpServer2.Builder calls setFindPort(true), so it will retry once a BindException is thrown. Please see DFSUtil.java:
{code}
    if (policy.isHttpEnabled()) {
      if (httpAddr.getPort() == 0) {
        builder.setFindPort(true);
      }
{code}
fix the TestDFSUpgradeWithHA failure due to BindException - Key: HDFS-7553 URL: https://issues.apache.org/jira/browse/HDFS-7553 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 2.7.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-7553-001.txt
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
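To illustrate the mechanism quoted above: if a test configures the NameNode HTTP address with port 0, DFSUtil makes HttpServer2 look for a free port, so a NameNode restart inside MiniDFSCluster cannot collide with a previously used port. The snippet below is a simplified sketch of such a test configuration, not the actual content of HDFS-7553-001.txt.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;

// Simplified sketch: an ephemeral HTTP port makes DFSUtil call builder.setFindPort(true).
class EphemeralHttpPortSketch {
  static Configuration testConf() {
    Configuration conf = new Configuration();
    // Port 0 means "pick any free port", which avoids the BindException on NN restart.
    conf.set(DFSConfigKeys.DFS_NAMENODE_HTTP_ADDRESS_KEY, "localhost:0");
    return conf;
  }
}
{code}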
[jira] [Updated] (HDFS-7553) fix the TestDFSUpgradeWithHA failure due to BindException
[ https://issues.apache.org/jira/browse/HDFS-7553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-7553: Status: Patch Available (was: Open)
fix the TestDFSUpgradeWithHA failure due to BindException - Key: HDFS-7553 URL: https://issues.apache.org/jira/browse/HDFS-7553 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 2.7.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-7553-001.txt
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7551) Fix the new findbugs warning from TransferFsImage
Liang Xie created HDFS-7551: --- Summary: Fix the new findbugs warning from TransferFsImage Key: HDFS-7551 URL: https://issues.apache.org/jira/browse/HDFS-7551 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.7.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor
It seems to me it came from the change in https://issues.apache.org/jira/browse/HDFS-7373 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7551) Fix the new findbugs warning from TransferFsImage
[ https://issues.apache.org/jira/browse/HDFS-7551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-7551: Attachment: HDFS-7551-001.txt I just noticed it while working on HDFS-7523.
Fix the new findbugs warning from TransferFsImage - Key: HDFS-7551 URL: https://issues.apache.org/jira/browse/HDFS-7551 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.7.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Attachments: HDFS-7551-001.txt
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7551) Fix the new findbugs warning from TransferFsImage
[ https://issues.apache.org/jira/browse/HDFS-7551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-7551: Status: Patch Available (was: Open)
Fix the new findbugs warning from TransferFsImage - Key: HDFS-7551 URL: https://issues.apache.org/jira/browse/HDFS-7551 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.7.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Attachments: HDFS-7551-001.txt
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7373) Clean up temporary files after fsimage transfer failures
[ https://issues.apache.org/jira/browse/HDFS-7373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14252821#comment-14252821 ] Liang Xie commented on HDFS-7373: - [~kihwal], if I am not wrong, there really is a new findbugs warning. I filed HDFS-7551; could you please take a look at it? Thanks!
Clean up temporary files after fsimage transfer failures Key: HDFS-7373 URL: https://issues.apache.org/jira/browse/HDFS-7373 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HDFS-7373.patch
When an fsimage (e.g. checkpoint) transfer fails, a temporary file is left in each storage directory. If the namespace is large, these files can take up quite a bit of space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7523) Setting a socket receive buffer size in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14252865#comment-14252865 ] Liang Xie commented on HDFS-7523: - The findbugs warning and the failed test case are not related to the current JIRA; I have filed new JIRAs to track them. Thank you [~stack]
Setting a socket receive buffer size in DFSClient - Key: HDFS-7523 URL: https://issues.apache.org/jira/browse/HDFS-7523 Project: Hadoop HDFS Issue Type: Improvement Components: dfsclient Affects Versions: 2.6.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-7523-001.txt, HDFS-7523-001.txt
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7551) Fix the new findbugs warning from TransferFsImage
[ https://issues.apache.org/jira/browse/HDFS-7551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14252872#comment-14252872 ] Liang Xie commented on HDFS-7551: - Oops, it seems I used another project's style; I will fix it soon. Thank you, Akira.
Fix the new findbugs warning from TransferFsImage - Key: HDFS-7551 URL: https://issues.apache.org/jira/browse/HDFS-7551 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.7.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Attachments: HDFS-7551-001.txt
There is a findbugs warning in https://builds.apache.org/job/PreCommit-HDFS-Build/9080//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html :
{code}
Bug type RV_RETURN_VALUE_IGNORED_BAD_PRACTICE
In class org.apache.hadoop.hdfs.server.namenode.TransferFsImage
In method org.apache.hadoop.hdfs.server.namenode.TransferFsImage.deleteTmpFiles(List)
Called method java.io.File.delete()
At TransferFsImage.java:[line 577]
{code}
It seems to me it came from the change in https://issues.apache.org/jira/browse/HDFS-7373 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
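A minimal sketch of one way to address the RV_RETURN_VALUE_IGNORED_BAD_PRACTICE warning quoted above: check the boolean returned by File.delete() and log when it fails. Whether the actual HDFS-7551 patch logs, retries, or uses another idiom is not shown here, and the logger choice below is only for illustration.
{code}
import java.io.File;
import java.util.List;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Illustrative fix for the findbugs warning: do not ignore File.delete()'s return value.
class DeleteTmpFilesSketch {
  private static final Logger LOG = LoggerFactory.getLogger(DeleteTmpFilesSketch.class);

  static void deleteTmpFiles(List<File> files) {
    for (File f : files) {
      if (!f.delete()) {
        // Surface the failure instead of silently leaving large temporary
        // fsimage copies behind in the storage directory.
        LOG.warn("Failed to delete temporary file {}", f.getAbsolutePath());
      }
    }
  }
}
{code}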
[jira] [Updated] (HDFS-7551) Fix the new findbugs warning from TransferFsImage
[ https://issues.apache.org/jira/browse/HDFS-7551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-7551: Attachment: HDFS-7551-002.txt Manually edited the patch; hopefully it's OK now :)
Fix the new findbugs warning from TransferFsImage - Key: HDFS-7551 URL: https://issues.apache.org/jira/browse/HDFS-7551 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.7.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Attachments: HDFS-7551-001.txt, HDFS-7551-002.txt
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7523) Setting a socket receive buffer size in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14252877#comment-14252877 ] Liang Xie commented on HDFS-7523: - Oops, I forgot to mention: the TestDFSClientRetries failure is still not tracked by a new JIRA, but I could not reproduce it on my box.
Setting a socket receive buffer size in DFSClient - Key: HDFS-7523 URL: https://issues.apache.org/jira/browse/HDFS-7523 Project: Hadoop HDFS Issue Type: Improvement Components: dfsclient Affects Versions: 2.6.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-7523-001.txt, HDFS-7523-001.txt
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7552) change FsVolumeList toString() to fix TestDataNodeVolumeFailureToleration
Liang Xie created HDFS-7552: --- Summary: change FsVolumeList toString() to fix TestDataNodeVolumeFailureToleration Key: HDFS-7552 URL: https://issues.apache.org/jira/browse/HDFS-7552 Project: Hadoop HDFS Issue Type: Bug Components: datanode, test Affects Versions: 2.7.0 Reporter: Liang Xie Assignee: Liang Xie
see https://builds.apache.org/job/PreCommit-HDFS-Build/9088//testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailureToleration/testValidVolumesAtStartup/
Per my understanding, it was due to:
commit 3b173d95171d01ab55042b1162569d1cf14a8d43
Author: Colin Patrick Mccabe cmcc...@cloudera.com
Date: Wed Dec 17 16:41:59 2014 -0800
HDFS-7531. Improve the concurrent access on FsVolumeList (Lei Xu via Colin P. McCabe)
- volatile List<FsVolumeImpl> volumes = null;
+ private final AtomicReference<FsVolumeImpl[]> volumes =
+     new AtomicReference(new FsVolumeImpl[0]);
so the old test case now complains here:
{code}
assertTrue("The DN should have started with this directory",
    si.contains(dataDir1Actual.getPath()));
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
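A rough sketch of the kind of toString() change the summary suggests, assuming the goal is simply that the volume paths show up in the string again now that the field is an AtomicReference over an array (plain strings stand in for FsVolumeImpl here; the actual patch may differ):
{code}
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicReference;

// Sketch of an FsVolumeList-style toString() after the HDFS-7531 change to an array reference.
class VolumeListSketch {
  private final AtomicReference<String[]> volumes =
      new AtomicReference<>(new String[0]);   // paths stand in for FsVolumeImpl

  void addVolume(String path) {
    while (true) {
      String[] cur = volumes.get();
      String[] next = Arrays.copyOf(cur, cur.length + 1);
      next[cur.length] = path;
      if (volumes.compareAndSet(cur, next)) {
        return;                               // lock-free add, in the spirit of the new FsVolumeList
      }
    }
  }

  @Override
  public String toString() {
    // An array's default toString() hides the paths; joining the elements is what lets
    // the test's si.contains(dataDir1Actual.getPath()) assertion pass again.
    return Arrays.toString(volumes.get());
  }
}
{code}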
[jira] [Updated] (HDFS-7552) change FsVolumeList toString() to fix TestDataNodeVolumeFailureToleration
[ https://issues.apache.org/jira/browse/HDFS-7552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-7552: Attachment: HDFS-7552-001.txt
change FsVolumeList toString() to fix TestDataNodeVolumeFailureToleration -- Key: HDFS-7552 URL: https://issues.apache.org/jira/browse/HDFS-7552 Project: Hadoop HDFS Issue Type: Bug Components: datanode, test Affects Versions: 2.7.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-7552-001.txt
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7552) change FsVolumeList toString() to fix TestDataNodeVolumeFailureToleration
[ https://issues.apache.org/jira/browse/HDFS-7552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-7552: Status: Patch Available (was: Open) [~eddyxu] [~cmccabe] [~wheat9], mind taking a look at it? Thanks.
change FsVolumeList toString() to fix TestDataNodeVolumeFailureToleration -- Key: HDFS-7552 URL: https://issues.apache.org/jira/browse/HDFS-7552 Project: Hadoop HDFS Issue Type: Bug Components: datanode, test Affects Versions: 2.7.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-7552-001.txt
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7443) Datanode upgrade to BLOCKID_BASED_LAYOUT fails if duplicate block files are present in the same volume
[ https://issues.apache.org/jira/browse/HDFS-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14253104#comment-14253104 ] Liang Xie commented on HDFS-7443: - HDFS-6931 introduced resolveDuplicateReplicas to handle duplicated blocks across different volumes; this JIRA is to handle duplicates within the same volume, am I right?
Datanode upgrade to BLOCKID_BASED_LAYOUT fails if duplicate block files are present in the same volume -- Key: HDFS-7443 URL: https://issues.apache.org/jira/browse/HDFS-7443 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Kihwal Lee Assignee: Colin Patrick McCabe Priority: Blocker Attachments: HDFS-7443.001.patch, HDFS-7443.002.patch
When we did an upgrade from 2.5 to 2.6 in a medium-size cluster, about 4% of the datanodes were not coming up. They tried the data file layout upgrade for BLOCKID_BASED_LAYOUT introduced in HDFS-6482, but failed. All failures were caused by {{NativeIO.link()}} throwing IOException saying {{EEXIST}}. The datanodes didn't die right away, but the upgrade was soon retried when the block pool initialization was retried whenever {{BPServiceActor}} was registering with the namenode. After many retries, datanodes terminated. This would leave {{previous.tmp}} and {{current}} with no {{VERSION}} file in the block pool slice storage directory. Although {{previous.tmp}} contained the old {{VERSION}} file, the content was in the new layout and the subdirs were all newly created ones. This shouldn't have happened because the upgrade-recovery logic in {{Storage}} removes {{current}} and renames {{previous.tmp}} to {{current}} before retrying. All successfully upgraded volumes had the old state preserved in their {{previous}} directory. In summary there were two observed issues:
- Upgrade failure with {{link()}} failing with {{EEXIST}}
- {{previous.tmp}} contained not the content of the original {{current}}, but a half-upgraded one.
We did not see this in smaller-scale test clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7523) Setting a socket receive buffer size in DFSClient
Liang Xie created HDFS-7523: --- Summary: Setting a socket receive buffer size in DFSClient Key: HDFS-7523 URL: https://issues.apache.org/jira/browse/HDFS-7523 Project: Hadoop HDFS Issue Type: Improvement Components: dfsclient Affects Versions: 2.6.0 Reporter: Liang Xie Assignee: Liang Xie It would be nice to set a socket receive buffer size while creating the socket from the client (HBase) side. In old versions that would be in DFSInputStream; in trunk it seems it should be at: {code} @Override // RemotePeerFactory public Peer newConnectedPeer(InetSocketAddress addr, Token<BlockTokenIdentifier> blockToken, DatanodeID datanodeId) throws IOException { Peer peer = null; boolean success = false; Socket sock = null; try { sock = socketFactory.createSocket(); NetUtils.connect(sock, addr, getRandomLocalInterfaceAddr(), dfsClientConf.socketTimeout); peer = TcpPeerServer.peerFromSocketAndKey(saslClient, sock, this, blockToken, datanodeId); peer.setReadTimeout(dfsClientConf.socketTimeout); {code} e.g.: sock.setReceiveBufferSize(HdfsConstants.DEFAULT_DATA_SOCKET_SIZE); the default socket receive buffer size on Linux+JDK7 seems to be 8k if I am not wrong, and that value is sometimes too small for HBase 64k block reads over a 10G network (at the least it means more system calls) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
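As a rough standalone illustration of the proposed change (not the patch; the host, port and timeout are made-up values, and 128 * 1024 is used here because that is, if I recall correctly, what HdfsConstants.DEFAULT_DATA_SOCKET_SIZE carries): the receive buffer has to be enlarged before connect() so that TCP window scaling can be negotiated during the handshake.
{code}
import java.net.InetSocketAddress;
import java.net.Socket;

public class ReceiveBufferDemo {
  public static void main(String[] args) throws Exception {
    try (Socket sock = new Socket()) {
      System.out.println("default SO_RCVBUF: " + sock.getReceiveBufferSize());
      // Must be set before connect() for a >64k window to take effect on the handshake.
      sock.setReceiveBufferSize(128 * 1024);
      sock.connect(new InetSocketAddress("example.com", 80), 3000);
      System.out.println("SO_RCVBUF after connect: " + sock.getReceiveBufferSize());
    }
  }
}
{code}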
[jira] [Updated] (HDFS-7523) Setting a socket receive buffer size in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-7523: Status: Patch Available (was: Open) Setting a socket receive buffer size in DFSClient - Key: HDFS-7523 URL: https://issues.apache.org/jira/browse/HDFS-7523 Project: Hadoop HDFS Issue Type: Improvement Components: dfsclient Affects Versions: 2.6.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-7523-001.txt It would be nice to set a socket receive buffer size while creating the socket from the client (HBase) side. In old versions that would be in DFSInputStream; in trunk it seems it should be at: {code} @Override // RemotePeerFactory public Peer newConnectedPeer(InetSocketAddress addr, Token<BlockTokenIdentifier> blockToken, DatanodeID datanodeId) throws IOException { Peer peer = null; boolean success = false; Socket sock = null; try { sock = socketFactory.createSocket(); NetUtils.connect(sock, addr, getRandomLocalInterfaceAddr(), dfsClientConf.socketTimeout); peer = TcpPeerServer.peerFromSocketAndKey(saslClient, sock, this, blockToken, datanodeId); peer.setReadTimeout(dfsClientConf.socketTimeout); {code} e.g.: sock.setReceiveBufferSize(HdfsConstants.DEFAULT_DATA_SOCKET_SIZE); the default socket receive buffer size on Linux+JDK7 seems to be 8k if I am not wrong, and that value is sometimes too small for HBase 64k block reads over a 10G network (at the least it means more system calls) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7523) Setting a socket receive buffer size in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-7523: Attachment: HDFS-7523-001.txt Attached a small patch, similar to the write-side code in DFSOutputStream: {code} /** * Create a socket for a write pipeline * @param first the first datanode * @param length the pipeline length * @param client client * @return the socket connected to the first datanode */ static Socket createSocketForPipeline(final DatanodeInfo first, final int length, final DFSClient client) throws IOException { final String dnAddr = first.getXferAddr( client.getConf().connectToDnViaHostname); if (DFSClient.LOG.isDebugEnabled()) { DFSClient.LOG.debug("Connecting to datanode " + dnAddr); } final InetSocketAddress isa = NetUtils.createSocketAddr(dnAddr); final Socket sock = client.socketFactory.createSocket(); final int timeout = client.getDatanodeReadTimeout(length); NetUtils.connect(sock, isa, client.getRandomLocalInterfaceAddr(), client.getConf().socketTimeout); sock.setSoTimeout(timeout); sock.setSendBufferSize(HdfsConstants.DEFAULT_DATA_SOCKET_SIZE); {code} Setting a socket receive buffer size in DFSClient - Key: HDFS-7523 URL: https://issues.apache.org/jira/browse/HDFS-7523 Project: Hadoop HDFS Issue Type: Improvement Components: dfsclient Affects Versions: 2.6.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-7523-001.txt It would be nice to set a socket receive buffer size while creating the socket from the client (HBase) side. In old versions that would be in DFSInputStream; in trunk it seems it should be at: {code} @Override // RemotePeerFactory public Peer newConnectedPeer(InetSocketAddress addr, Token<BlockTokenIdentifier> blockToken, DatanodeID datanodeId) throws IOException { Peer peer = null; boolean success = false; Socket sock = null; try { sock = socketFactory.createSocket(); NetUtils.connect(sock, addr, getRandomLocalInterfaceAddr(), dfsClientConf.socketTimeout); peer = TcpPeerServer.peerFromSocketAndKey(saslClient, sock, this, blockToken, datanodeId); peer.setReadTimeout(dfsClientConf.socketTimeout); {code} e.g.: sock.setReceiveBufferSize(HdfsConstants.DEFAULT_DATA_SOCKET_SIZE); the default socket receive buffer size on Linux+JDK7 seems to be 8k if I am not wrong, and that value is sometimes too small for HBase 64k block reads over a 10G network (at the least it means more system calls) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6827) Both NameNodes could be in STANDBY State due to HealthMonitor not aware of the target's status changing sometimes
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6827: Summary: Both NameNodes could be in STANDBY State due to HealthMonitor not aware of the target's status changing sometimes (was: NameNode double standby) Both NameNodes could be in STANDBY State due to HealthMonitor not aware of the target's status changing sometimes - Key: HDFS-6827 URL: https://issues.apache.org/jira/browse/HDFS-6827 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.1 Reporter: Zesheng Wu Assignee: Zesheng Wu Priority: Critical Attachments: HDFS-6827.1.patch In our production cluster, we encountered a scenario like this: the ANN crashed due to a write journal timeout and was restarted by the watchdog automatically, but after restarting both of the NNs were standby. Following are the logs of the scenario: # NN1 is down due to write journal timeout: {color:red}2014-08-03,23:02:02,219{color} INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG # ZKFC1 detected connection reset by peer {color:red}2014-08-03,23:02:02,560{color} ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:xx@xx.HADOOP (auth:KERBEROS) cause:java.io.IOException: {color:red}Connection reset by peer{color} # NN1 was restarted successfully by the watchdog: 2014-08-03,23:02:07,884 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at: xx:13201 2014-08-03,23:02:07,884 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting {color:red}2014-08-03,23:02:07,884{color} INFO org.apache.hadoop.ipc.Server: IPC Server listener on 13200: starting 2014-08-03,23:02:08,742 INFO org.apache.hadoop.ipc.Server: RPC server clean thread started! 2014-08-03,23:02:08,743 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Registered DFSClientInformation MBean 2014-08-03,23:02:08,744 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: NameNode up at: xx/xx:13200 2014-08-03,23:02:08,744 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for standby state # ZKFC1 retried the connection and considered NN1 healthy {color:red}2014-08-03,23:02:08,292{color} INFO org.apache.hadoop.ipc.Client: Retrying connect to server: xx/xx:13200. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1 SECONDS) # ZKFC1 still considered NN1 a healthy Active NN and didn't trigger the failover; as a result, both NNs were standby. The root cause of this bug is that the NN is restarted too quickly and the ZKFC health monitor doesn't realize that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6800) Determine how Datanode layout changes should interact with rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6800: Summary: Determine how Datanode layout changes should interact with rolling upgrade (was: Detemine how Datanode layout changes should interact with rolling upgrade) Determine how Datanode layout changes should interact with rolling upgrade -- Key: HDFS-6800 URL: https://issues.apache.org/jira/browse/HDFS-6800 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe We need to handle attempts to rolling-upgrade the DataNode to a new storage directory layout. One approach is to disallow such upgrades. If we choose this approach, we should make sure that the system administrator gets a helpful error message and a clean failure when trying to use rolling upgrade to a version that doesn't support it. Based on the compatibility guarantees described in HDFS-5535, this would mean that *any* future DataNode layout changes would require a major version upgrade. Another approach would be to support rolling upgrade from an old DN storage layout to a new layout. This approach requires us to change our documentation to explain to users that they should supply the {{\-rollback}} command on the command-line when re-starting the DataNodes during rolling rollback. Currently the documentation just says to restart the DataNode normally. Another issue here is that the DataNode's usage message describes rollback options that no longer exist. The help text says that the DN supports {{\-rollingupgrade rollback}}, but this option was removed by HDFS-6005. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-289) HDFS should blacklist datanodes that are not performing well
[ https://issues.apache.org/jira/browse/HDFS-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079048#comment-14079048 ] Liang Xie commented on HDFS-289: We just hit an HBase write performance degradation several days ago; the root cause turned out to be a slow network to/from a particular datanode due to a switch buffer problem. I am now interested in implementing a simple heuristic for excluding DNs inside DFSOutputStream. Will put more here later:) HDFS should blacklist datanodes that are not performing well Key: HDFS-289 URL: https://issues.apache.org/jira/browse/HDFS-289 Project: Hadoop HDFS Issue Type: Improvement Reporter: dhruba borthakur On a large cluster, a few datanodes could be under-performing. There were cases when the network connectivity of a few of these bad datanodes was degraded, resulting in very long times (on the order of two hours) to transfer blocks to and from these datanodes. A similar issue arises when a single disk on a datanode fails or changes to read-only mode: in this case the entire datanode shuts down. HDFS should detect and handle network and disk performance degradation more gracefully. One option would be to blacklist these datanodes, de-prioritise their use and alert the administrator. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6766) optimize ack notify mechanism to avoid thundering herd issue
Liang Xie created HDFS-6766: --- Summary: optimize ack notify mechanism to avoid thundering herd issue Key: HDFS-6766 URL: https://issues.apache.org/jira/browse/HDFS-6766 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Currently, DFSOutputStream uses wait/notifyAll to coordinate ack receiving and ack waiting. Say there are 5 threads (t1,t2,t3,t4,t5) waiting for ack seq nos 1,2,3,4,5; once the no. 1 ack arrives, notifyAll is called, so t2/t3/t4/t5 can do nothing except wait again. We can rewrite it with the Condition class: with a fair policy (FIFO) we can make only t1 be notified, so a number of context switches are saved. It's possible for more than one thread to wait on the same ack seq no (e.g. no more data was written between two flush operations), so once that happens we need to notify all of those threads; that's why I introduced a set to remember the seq no. In a simple HBase YCSB test, the context switches per second were reduced by about 15%, and sys cpu% by about 6% (my HBase has the new write model patch; I think the benefit will be higher without it) -- This message was sent by Atlassian JIRA (v6.2#6252)
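A minimal sketch of the idea (illustrative only, not the attached DFSOutputStream patch; class and method names are made up): keep one Condition per distinct waited-for seqno under a fair lock, and on each ack signal only the waiters whose seqno is now satisfied, instead of notifyAll() waking everyone.
{code}
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Toy ack notifier: threads waiting on the same seqno share one Condition. */
public class AckNotifier {
  private final ReentrantLock lock = new ReentrantLock(true); // fair: FIFO hand-off
  private final TreeMap<Long, Condition> waiters = new TreeMap<>();
  private long lastAckedSeqno = -1;

  public void waitForAck(long seqno) throws InterruptedException {
    lock.lock();
    try {
      while (lastAckedSeqno < seqno) {
        Condition c = waiters.computeIfAbsent(seqno, s -> lock.newCondition());
        c.await();
      }
    } finally {
      lock.unlock();
    }
  }

  public void onAckReceived(long seqno) {
    lock.lock();
    try {
      lastAckedSeqno = seqno;
      // Wake only the waiters whose seqno is now acknowledged; everyone else
      // stays parked instead of waking up just to go back to sleep.
      Map<Long, Condition> ready = waiters.headMap(seqno, true);
      for (Condition c : ready.values()) {
        c.signalAll();
      }
      ready.clear();
    } finally {
      lock.unlock();
    }
  }
}
{code}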
[jira] [Updated] (HDFS-6766) optimize ack notify mechanism to avoid thundering herd issue
[ https://issues.apache.org/jira/browse/HDFS-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6766: Attachment: HDFS-6766.txt optimize ack notify mechanism to avoid thundering herd issue Key: HDFS-6766 URL: https://issues.apache.org/jira/browse/HDFS-6766 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6766.txt Currently, DFSOutputStream uses wait/notifyAll to coordinate ack receiving and ack waiting, etc.. say there're 5 threads(t1,t2,t3,t4,t5) wait for ack seq no: 1,2,3,4,5, once the no. 1 ack arrived, the notifyAll be called, so t2/t3/t4/t5 could do nothing except wait again. we can rewrite it with Condition class, with a fair policy(fifo), we can just make t1 be notified, so a number of context switch be saved. It's possible more than one thread waiting on the same ack seq no(e.g. no more data be written between two flush operations), so once it happened, we need to notify those threads, so i introduced a set to remember this seq no. In a simple HBase ycsb testing, the context switch number per second was reduced about 15%, and reduced sys cpu% about 6%(My HBase has new write model patch, i think the benefit will be higher if w/o it) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6766) optimize ack notify mechanism to avoid thundering herd issue
[ https://issues.apache.org/jira/browse/HDFS-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6766: Status: Patch Available (was: Open) optimize ack notify mechanism to avoid thundering herd issue Key: HDFS-6766 URL: https://issues.apache.org/jira/browse/HDFS-6766 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6766.txt Currently, DFSOutputStream uses wait/notifyAll to coordinate ack receiving and ack waiting, etc.. say there're 5 threads(t1,t2,t3,t4,t5) wait for ack seq no: 1,2,3,4,5, once the no. 1 ack arrived, the notifyAll be called, so t2/t3/t4/t5 could do nothing except wait again. we can rewrite it with Condition class, with a fair policy(fifo), we can just make t1 be notified, so a number of context switch be saved. It's possible more than one thread waiting on the same ack seq no(e.g. no more data be written between two flush operations), so once it happened, we need to notify those threads, so i introduced a set to remember this seq no. In a simple HBase ycsb testing, the context switch number per second was reduced about 15%, and reduced sys cpu% about 6%(My HBase has new write model patch, i think the benefit will be higher if w/o it) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6766) optimize ack notify mechanism to avoid thundering herd issue
[ https://issues.apache.org/jira/browse/HDFS-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078790#comment-14078790 ] Liang Xie commented on HDFS-6766: - The failed TestBalancer was not related with current patch optimize ack notify mechanism to avoid thundering herd issue Key: HDFS-6766 URL: https://issues.apache.org/jira/browse/HDFS-6766 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6766.txt Currently, DFSOutputStream uses wait/notifyAll to coordinate ack receiving and ack waiting, etc.. say there're 5 threads(t1,t2,t3,t4,t5) wait for ack seq no: 1,2,3,4,5, once the no. 1 ack arrived, the notifyAll be called, so t2/t3/t4/t5 could do nothing except wait again. we can rewrite it with Condition class, with a fair policy(fifo), we can just make t1 be notified, so a number of context switch be saved. It's possible more than one thread waiting on the same ack seq no(e.g. no more data be written between two flush operations), so once it happened, we need to notify those threads, so i introduced a set to remember this seq no. In a simple HBase ycsb testing, the context switch number per second was reduced about 15%, and reduced sys cpu% about 6%(My HBase has new write model patch, i think the benefit will be higher if w/o it) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6766) optimize ack notify mechanism to avoid thundering herd issue
[ https://issues.apache.org/jira/browse/HDFS-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078791#comment-14078791 ] Liang Xie commented on HDFS-6766: - bq. should be the condition timeout configurable Per my knowledge, those fixed timeout should be fine, because all of them are covered by a while loop. optimize ack notify mechanism to avoid thundering herd issue Key: HDFS-6766 URL: https://issues.apache.org/jira/browse/HDFS-6766 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6766.txt Currently, DFSOutputStream uses wait/notifyAll to coordinate ack receiving and ack waiting, etc.. say there're 5 threads(t1,t2,t3,t4,t5) wait for ack seq no: 1,2,3,4,5, once the no. 1 ack arrived, the notifyAll be called, so t2/t3/t4/t5 could do nothing except wait again. we can rewrite it with Condition class, with a fair policy(fifo), we can just make t1 be notified, so a number of context switch be saved. It's possible more than one thread waiting on the same ack seq no(e.g. no more data be written between two flush operations), so once it happened, we need to notify those threads, so i introduced a set to remember this seq no. In a simple HBase ycsb testing, the context switch number per second was reduced about 15%, and reduced sys cpu% about 6%(My HBase has new write model patch, i think the benefit will be higher if w/o it) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071409#comment-14071409 ] Liang Xie commented on HDFS-6735: - bq. We'd not check in the test since it does not assert anything? We'd just check it in as a utility testing concurrent pread throughput? At the end of the testing code snippet there are assertions, see: {code} assertTrue(readLatency.readMs > readLatency.preadMs); //because we issued a pread already, so the second one should not hit //disk, even considering running on a slow VM, 1 second should be fine? assertTrue(readLatency.preadMs < 1000); {code} Per assertTrue(readLatency.preadMs < 1000); we can know whether the pread() is blocked by read() or not :) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream - Key: HDFS-6735 URL: https://issues.apache.org/jira/browse/HDFS-6735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6735.txt In the current DFSInputStream impl, there are a couple of coarse-grained locks in the read/pread path, and it has become an HBase read latency pain point. In HDFS-6698, I made a minor patch against the first encountered lock, around getFileLength; indeed, after reading code and testing, it shows there are still other locks we could improve. In this jira, I'll make a patch against the other locks, plus a simple test case to show the issue and the improved result. This is important for the HBase application, since in the current HFile read path we issue all read()/pread() requests on the same DFSInputStream for one HFile. (A multi-stream solution is another story I had planned to do, but it will probably take more time than I expected) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071440#comment-14071440 ] Liang Xie commented on HDFS-6735: - bq. I'd think you'd want a comment at least in LocatedBlocks#underConstruction warning an upper layer is dependent on it being final in case LocatedBlocks changes and starts to allow blocks complete under a stream. done. bq. locatedBlocks.insertRange(targetBlockIdx, newBlocks.getLocatedBlocks()); ... be inside a synchronization too? Could two threads be updating block locations at same time? Yes, it's possible, but we cannot put a synchronization there. It's different from {code} synchronized (this) { pos = offset; blockEnd = blk.getStartOffset() + blk.getBlockSize() - 1; currentLocatedBlock = blk; } {code} because in the pread scenario updatePosition is false, so pread never enters that synchronized block. And if we did put a synchronization there, a pread reaching it would still be blocked by whoever else holds the monitor, e.g. read() :) But we can have a synchronized block or an rwLock inside the LocatedBlocks class. Let me try. A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream - Key: HDFS-6735 URL: https://issues.apache.org/jira/browse/HDFS-6735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6735-v2.txt, HDFS-6735.txt In the current DFSInputStream impl, there are a couple of coarse-grained locks in the read/pread path, and it has become an HBase read latency pain point. In HDFS-6698, I made a minor patch against the first encountered lock, around getFileLength; indeed, after reading code and testing, it shows there are still other locks we could improve. In this jira, I'll make a patch against the other locks, plus a simple test case to show the issue and the improved result. This is important for the HBase application, since in the current HFile read path we issue all read()/pread() requests on the same DFSInputStream for one HFile. (A multi-stream solution is another story I had planned to do, but it will probably take more time than I expected) -- This message was sent by Atlassian JIRA (v6.2#6252)
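For the "rwLock in LocatedBlocks" idea mentioned above, a rough sketch under assumed names (this is not the real LocatedBlocks class): readers such as pread() take the read lock and can proceed concurrently, while an insertRange()-style update takes the write lock, so location refreshes stay consistent without funnelling preads through the DFSInputStream monitor.
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical block map guarded by a read/write lock instead of the stream monitor. */
class BlockMap<B> {
  private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
  private final List<B> blocks = new ArrayList<>();

  /** Concurrent readers (read() and pread()) only need the shared read lock. */
  List<B> snapshotRange(int from, int to) {
    rw.readLock().lock();
    try {
      return new ArrayList<>(blocks.subList(from, to));
    } finally {
      rw.readLock().unlock();
    }
  }

  /** Location refreshes are rare and take the exclusive write lock. */
  void insertRange(int at, List<B> newBlocks) {
    rw.writeLock().lock();
    try {
      blocks.addAll(at, newBlocks);
    } finally {
      rw.writeLock().unlock();
    }
  }
}
{code}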
[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6735: Attachment: HDFS-6735-v2.txt A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream - Key: HDFS-6735 URL: https://issues.apache.org/jira/browse/HDFS-6735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6735-v2.txt, HDFS-6735.txt In current DFSInputStream impl, there're a couple of coarser-grained locks in read/pread path, and it has became a HBase read latency pain point so far. In HDFS-6698, i made a minor patch against the first encourtered lock, around getFileLength, in deed, after reading code and testing, it shows still other locks we could improve. In this jira, i'll make a patch against other locks, and a simple test case to show the issue and the improved result. This is important for HBase application, since in current HFile read path, we issue all read()/pread() requests in the same DFSInputStream for one HFile. (Multi streams solution is another story i had a plan to do, but probably will take more time than i expected) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6698) try to optimize DFSInputStream.getFileLength()
[ https://issues.apache.org/jira/browse/HDFS-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071475#comment-14071475 ] Liang Xie commented on HDFS-6698: - TestPipelinesFailover probably is due to ulimit setting, i had seen several recent reports failed on it with too many open files try to optimize DFSInputStream.getFileLength() -- Key: HDFS-6698 URL: https://issues.apache.org/jira/browse/HDFS-6698 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6698.txt, HDFS-6698.txt HBase prefers to invoke read() serving scan request, and invoke pread() serving get reqeust. Because pread() almost holds no lock. Let's image there's a read() running, because the definition is: {code} public synchronized int read {code} so no other read() request could run concurrently, this is known, but pread() also could not run... because: {code} public int read(long position, byte[] buffer, int offset, int length) throws IOException { // sanity checks dfsClient.checkOpen(); if (closed) { throw new IOException(Stream closed); } failures = 0; long filelen = getFileLength(); {code} the getFileLength() also needs lock. so we need to figure out a no lock impl for getFileLength() before HBase multi stream feature done. [~saint@gmail.com] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071655#comment-14071655 ] Liang Xie commented on HDFS-6735: - The failed TestPipelinesFailover and TestNamenodeCapacityReport were not related to the current patch (I saw them in other recent reports) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream - Key: HDFS-6735 URL: https://issues.apache.org/jira/browse/HDFS-6735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6735-v2.txt, HDFS-6735.txt In the current DFSInputStream impl, there are a couple of coarse-grained locks in the read/pread path, and it has become an HBase read latency pain point. In HDFS-6698, I made a minor patch against the first encountered lock, around getFileLength; indeed, after reading code and testing, it shows there are still other locks we could improve. In this jira, I'll make a patch against the other locks, plus a simple test case to show the issue and the improved result. This is important for the HBase application, since in the current HFile read path we issue all read()/pread() requests on the same DFSInputStream for one HFile. (A multi-stream solution is another story I had planned to do, but it will probably take more time than I expected) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
Liang Xie created HDFS-6735: --- Summary: A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream Key: HDFS-6735 URL: https://issues.apache.org/jira/browse/HDFS-6735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie In current DFSInputStream impl, there're a couple of coarser-grained locks in read/pread path, and it has became a HBase read latency pain point so far. In HDFS-6698, i made a minor patch against the first encourtered lock, around getFileLength, in deed, after reading code and testing, it shows still other locks we could improve. In this jira, i'll make a patch against other locks, and a simple test case to show the issue and the improved result. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6698) try to optimize DFSInputStream.getFileLength()
[ https://issues.apache.org/jira/browse/HDFS-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6698: Issue Type: Sub-task (was: Improvement) Parent: HDFS-6735 try to optimize DFSInputStream.getFileLength() -- Key: HDFS-6698 URL: https://issues.apache.org/jira/browse/HDFS-6698 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6698.txt HBase prefers to invoke read() serving scan request, and invoke pread() serving get reqeust. Because pread() almost holds no lock. Let's image there's a read() running, because the definition is: {code} public synchronized int read {code} so no other read() request could run concurrently, this is known, but pread() also could not run... because: {code} public int read(long position, byte[] buffer, int offset, int length) throws IOException { // sanity checks dfsClient.checkOpen(); if (closed) { throw new IOException(Stream closed); } failures = 0; long filelen = getFileLength(); {code} the getFileLength() also needs lock. so we need to figure out a no lock impl for getFileLength() before HBase multi stream feature done. [~saint@gmail.com] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6698) try to optimize DFSInputStream.getFileLength()
[ https://issues.apache.org/jira/browse/HDFS-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071316#comment-14071316 ] Liang Xie commented on HDFS-6698: - Any thoughts, [~stack]? try to optimize DFSInputStream.getFileLength() -- Key: HDFS-6698 URL: https://issues.apache.org/jira/browse/HDFS-6698 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6698.txt HBase prefers to invoke read() serving scan request, and invoke pread() serving get reqeust. Because pread() almost holds no lock. Let's image there's a read() running, because the definition is: {code} public synchronized int read {code} so no other read() request could run concurrently, this is known, but pread() also could not run... because: {code} public int read(long position, byte[] buffer, int offset, int length) throws IOException { // sanity checks dfsClient.checkOpen(); if (closed) { throw new IOException(Stream closed); } failures = 0; long filelen = getFileLength(); {code} the getFileLength() also needs lock. so we need to figure out a no lock impl for getFileLength() before HBase multi stream feature done. [~saint@gmail.com] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6735: Attachment: HDFS-6735.txt A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream - Key: HDFS-6735 URL: https://issues.apache.org/jira/browse/HDFS-6735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6735.txt In current DFSInputStream impl, there're a couple of coarser-grained locks in read/pread path, and it has became a HBase read latency pain point so far. In HDFS-6698, i made a minor patch against the first encourtered lock, around getFileLength, in deed, after reading code and testing, it shows still other locks we could improve. In this jira, i'll make a patch against other locks, and a simple test case to show the issue and the improved result. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6735: Status: Patch Available (was: Open) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream - Key: HDFS-6735 URL: https://issues.apache.org/jira/browse/HDFS-6735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6735.txt In current DFSInputStream impl, there're a couple of coarser-grained locks in read/pread path, and it has became a HBase read latency pain point so far. In HDFS-6698, i made a minor patch against the first encourtered lock, around getFileLength, in deed, after reading code and testing, it shows still other locks we could improve. In this jira, i'll make a patch against other locks, and a simple test case to show the issue and the improved result. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071335#comment-14071335 ] Liang Xie commented on HDFS-6735: - This patch includes HDFS-6698 already. With the source change, the added testPreadBlockedbyRead result on my box: {code} grep cost hadoop-hdfs-project/hadoop-hdfs/target/surefire-reports/org.apache.hadoop.hdfs.TestPread-output.txt read() cost:5008ms, pread() cost:5ms {code} Without the source change (of course, you still need to keep the readDelay fault injection in the source code), the added testPreadBlockedbyRead result on my box: {code} read() cost:5009ms, pread() cost:4912ms {code} A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream - Key: HDFS-6735 URL: https://issues.apache.org/jira/browse/HDFS-6735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6735.txt In the current DFSInputStream impl, there are a couple of coarse-grained locks in the read/pread path, and it has become an HBase read latency pain point. In HDFS-6698, I made a minor patch against the first encountered lock, around getFileLength; indeed, after reading code and testing, it shows there are still other locks we could improve. In this jira, I'll make a patch against the other locks, plus a simple test case to show the issue and the improved result. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6735: Description: In current DFSInputStream impl, there're a couple of coarser-grained locks in read/pread path, and it has became a HBase read latency pain point so far. In HDFS-6698, i made a minor patch against the first encourtered lock, around getFileLength, in deed, after reading code and testing, it shows still other locks we could improve. In this jira, i'll make a patch against other locks, and a simple test case to show the issue and the improved result. This is important for HBase application, since in current HFile read path, we issue all read()/pread() requests in the same DFSInputStream for one HFile. (Multi streams solution is another story i had a plan to do, but probably will take more time than i expected) was: In current DFSInputStream impl, there're a couple of coarser-grained locks in read/pread path, and it has became a HBase read latency pain point so far. In HDFS-6698, i made a minor patch against the first encourtered lock, around getFileLength, in deed, after reading code and testing, it shows still other locks we could improve. In this jira, i'll make a patch against other locks, and a simple test case to show the issue and the improved result. A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream - Key: HDFS-6735 URL: https://issues.apache.org/jira/browse/HDFS-6735 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6735.txt In current DFSInputStream impl, there're a couple of coarser-grained locks in read/pread path, and it has became a HBase read latency pain point so far. In HDFS-6698, i made a minor patch against the first encourtered lock, around getFileLength, in deed, after reading code and testing, it shows still other locks we could improve. In this jira, i'll make a patch against other locks, and a simple test case to show the issue and the improved result. This is important for HBase application, since in current HFile read path, we issue all read()/pread() requests in the same DFSInputStream for one HFile. (Multi streams solution is another story i had a plan to do, but probably will take more time than i expected) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6698) try to optimize DFSInputStream.getFileLength()
[ https://issues.apache.org/jira/browse/HDFS-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6698: Attachment: HDFS-6698.txt try to optimize DFSInputStream.getFileLength() -- Key: HDFS-6698 URL: https://issues.apache.org/jira/browse/HDFS-6698 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6698.txt HBase prefers to invoke read() serving scan request, and invoke pread() serving get reqeust. Because pread() almost holds no lock. Let's image there's a read() running, because the definition is: {code} public synchronized int read {code} so no other read() request could run concurrently, this is known, but pread() also could not run... because: {code} public int read(long position, byte[] buffer, int offset, int length) throws IOException { // sanity checks dfsClient.checkOpen(); if (closed) { throw new IOException(Stream closed); } failures = 0; long filelen = getFileLength(); {code} the getFileLength() also needs lock. so we need to figure out a no lock impl for getFileLength() before HBase multi stream feature done. [~saint@gmail.com] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6698) try to optimize DFSInputStream.getFileLength()
[ https://issues.apache.org/jira/browse/HDFS-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6698: Status: Patch Available (was: Open) try to optimize DFSInputStream.getFileLength() -- Key: HDFS-6698 URL: https://issues.apache.org/jira/browse/HDFS-6698 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6698.txt HBase prefers to invoke read() serving scan request, and invoke pread() serving get reqeust. Because pread() almost holds no lock. Let's image there's a read() running, because the definition is: {code} public synchronized int read {code} so no other read() request could run concurrently, this is known, but pread() also could not run... because: {code} public int read(long position, byte[] buffer, int offset, int length) throws IOException { // sanity checks dfsClient.checkOpen(); if (closed) { throw new IOException(Stream closed); } failures = 0; long filelen = getFileLength(); {code} the getFileLength() also needs lock. so we need to figure out a no lock impl for getFileLength() before HBase multi stream feature done. [~saint@gmail.com] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6698) try to optimize DFSInputStream.getFileLength()
[ https://issues.apache.org/jira/browse/HDFS-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14068361#comment-14068361 ] Liang Xie commented on HDFS-6698: - In the normal situation, the HFiles that all of the HBase (p)reads go against should be immutable, so I assume the attached patch, per [~saint@gmail.com]'s suggestion, is enough to relieve the HBase issue of preads being blocked by a read request. Let's see the QA result... try to optimize DFSInputStream.getFileLength() -- Key: HDFS-6698 URL: https://issues.apache.org/jira/browse/HDFS-6698 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6698.txt HBase prefers to invoke read() to serve scan requests, and invoke pread() to serve get requests, because pread() holds almost no lock. Let's imagine there's a read() running; because the definition is: {code} public synchronized int read {code} no other read() request can run concurrently, this is known, but pread() also cannot run... because: {code} public int read(long position, byte[] buffer, int offset, int length) throws IOException { // sanity checks dfsClient.checkOpen(); if (closed) { throw new IOException("Stream closed"); } failures = 0; long filelen = getFileLength(); {code} getFileLength() also needs the lock. So we need to figure out a no-lock impl for getFileLength() before the HBase multi-stream feature is done. [~saint@gmail.com] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6698) try to optimize DFSInputStream.getFileLength()
[ https://issues.apache.org/jira/browse/HDFS-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14068467#comment-14068467 ] Liang Xie commented on HDFS-6698: - Those three failure cases are not related with current patch. try to optimize DFSInputStream.getFileLength() -- Key: HDFS-6698 URL: https://issues.apache.org/jira/browse/HDFS-6698 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6698.txt HBase prefers to invoke read() serving scan request, and invoke pread() serving get reqeust. Because pread() almost holds no lock. Let's image there's a read() running, because the definition is: {code} public synchronized int read {code} so no other read() request could run concurrently, this is known, but pread() also could not run... because: {code} public int read(long position, byte[] buffer, int offset, int length) throws IOException { // sanity checks dfsClient.checkOpen(); if (closed) { throw new IOException(Stream closed); } failures = 0; long filelen = getFileLength(); {code} the getFileLength() also needs lock. so we need to figure out a no lock impl for getFileLength() before HBase multi stream feature done. [~saint@gmail.com] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14066126#comment-14066126 ] Liang Xie commented on HDFS-6450: - bq. How is testHedgedReadBasic exercising the new functionality? Or is it just verifying nothing broke when it is turned on? It was not finished yet, this patch is a preliminary one, and i hold on due to the block reader stuff, i think i have got a feasible direction per Colin's last comments, i'll continue to do the rest asap:) Support non-positional hedged reads in HDFS --- Key: HDFS-6450 URL: https://issues.apache.org/jira/browse/HDFS-6450 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Liang Xie Attachments: HDFS-6450-like-pread.txt HDFS-5776 added support for hedged positional reads. We should also support hedged non-position reads (aka regular reads). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6698) try to optimize DFSInputStream.getFileLength()
[ https://issues.apache.org/jira/browse/HDFS-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065990#comment-14065990 ] Liang Xie commented on HDFS-6698: - bq. Makes sense Liang Xie You see this as prob in prod? [~stack], yes, we had observed a couple of times in our internal product clusters... try to optimize DFSInputStream.getFileLength() -- Key: HDFS-6698 URL: https://issues.apache.org/jira/browse/HDFS-6698 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie HBase prefers to invoke read() serving scan request, and invoke pread() serving get reqeust. Because pread() almost holds no lock. Let's image there's a read() running, because the definition is: {code} public synchronized int read {code} so no other read() request could run concurrently, this is known, but pread() also could not run... because: {code} public int read(long position, byte[] buffer, int offset, int length) throws IOException { // sanity checks dfsClient.checkOpen(); if (closed) { throw new IOException(Stream closed); } failures = 0; long filelen = getFileLength(); {code} the getFileLength() also needs lock. so we need to figure out a no lock impl for getFileLength() before HBase multi stream feature done. [~saint@gmail.com] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6695) Investigate using Java 7's nonblocking file I/O in BlockReaderLocal to implement read timeouts
[ https://issues.apache.org/jira/browse/HDFS-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064550#comment-14064550 ] Liang Xie commented on HDFS-6695: - I did not look through the AsynchronousFileChannel impl, but the above link says: "This class also defines read and write methods that initiate asynchronous operations, returning a Future to represent the pending result of the operation. The Future may be used to check if the operation has completed, wait for its completion, and retrieve the result." That seems to be the same idea as my pseudo code at https://issues.apache.org/jira/browse/HDFS-6286?focusedCommentId=14008621page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14008621 Investigate using Java 7's nonblocking file I/O in BlockReaderLocal to implement read timeouts -- Key: HDFS-6695 URL: https://issues.apache.org/jira/browse/HDFS-6695 Project: Hadoop HDFS Issue Type: Improvement Reporter: Colin Patrick McCabe In BlockReaderLocal, the read system call could block for a long time if the disk drive is having problems, or there is a huge amount of I/O contention. This might cause poor latency performance. In the remote block readers, we have implemented a read timeout, but we don't have one for the local block reader, since {{FileChannel#read}} doesn't support this. Once we move to JDK 7, we should investigate the {{java.nio.file}} nonblocking file I/O package to see if it could be used to implement read timeouts. -- This message was sent by Atlassian JIRA (v6.2#6252)
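For reference, the Future-based shape the quoted Javadoc describes looks roughly like this (a sketch only; the block file path and the 3 second timeout are made-up values, and real fallback logic is elided):
{code}
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class AsyncReadWithTimeout {
  public static void main(String[] args) throws Exception {
    try (AsynchronousFileChannel ch = AsynchronousFileChannel.open(
        Paths.get("/tmp/blockfile"), StandardOpenOption.READ)) {
      ByteBuffer buf = ByteBuffer.allocate(64 * 1024);
      Future<Integer> pending = ch.read(buf, 0);      // returns immediately
      try {
        int n = pending.get(3, TimeUnit.SECONDS);     // the read timeout lives here
        System.out.println("read " + n + " bytes");
      } catch (TimeoutException te) {
        pending.cancel(true);                         // give up on the slow local disk
        System.err.println("local read timed out; a caller could fall back to a remote reader");
      }
    }
  }
}
{code}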
[jira] [Created] (HDFS-6698) try to optimize DFSInputStream.getFileLength()
Liang Xie created HDFS-6698: --- Summary: try to optimize DFSInputStream.getFileLength() Key: HDFS-6698 URL: https://issues.apache.org/jira/browse/HDFS-6698 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie HBase prefers to invoke read() to serve scan requests, and invoke pread() to serve get requests, because pread() holds almost no lock. Let's imagine there's a read() running; because the definition is: {code} public synchronized int read {code} no other read() request can run concurrently, this is known, but pread() also cannot run... because: {code} public int read(long position, byte[] buffer, int offset, int length) throws IOException { // sanity checks dfsClient.checkOpen(); if (closed) { throw new IOException("Stream closed"); } failures = 0; long filelen = getFileLength(); {code} getFileLength() also needs the lock. So we need to figure out a no-lock impl for getFileLength() before the HBase multi-stream feature is done. [~saint@gmail.com] -- This message was sent by Atlassian JIRA (v6.2#6252)
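As a toy illustration of the "no-lock impl" being asked for (not the eventual patch; class and field names are invented): for a closed, immutable file the length can be published once through an atomic field and read without taking the stream's monitor, so a pread() no longer queues behind an in-flight synchronized read().
{code}
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: publish the file length once, then read it lock-free. */
class LengthSnapshot {
  private final AtomicLong fileLength = new AtomicLong(-1);

  /** Called once, after the block locations have been fetched. */
  void publish(long length) {
    fileLength.set(length);
  }

  /** Lock-free read; for an immutable HFile this value never changes. */
  long getFileLength() {
    long len = fileLength.get();
    if (len < 0) {
      throw new IllegalStateException("file length not yet initialized");
    }
    return len;
  }
}
{code}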
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064564#comment-14064564 ] Liang Xie commented on HDFS-6450: - bq. I think in the short term, applications will have to turn on non-positional hedged reads explicitly, and accept some small loss in throughput for a major reduction in long-tail latency. yes, it's a trade-off :) Support non-positional hedged reads in HDFS --- Key: HDFS-6450 URL: https://issues.apache.org/jira/browse/HDFS-6450 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Liang Xie Attachments: HDFS-6450-like-pread.txt HDFS-5776 added support for hedged positional reads. We should also support hedged non-position reads (aka regular reads). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14061760#comment-14061760 ] Liang Xie commented on HDFS-6450: - After a deeper look, it's kind of hard to reuse/maintain the block reader as before. In pread() we don't have this trouble, because we always create a new block reader. In read(), if we want to support the hedged read ability, in general: 1) the first read (r1) uses the old block reader if possible, then waits for the hedged read timeout setting 2) the second read (r2) must create a new block reader and be submitted into the thread pool 3) wait for the first completed task and return the final read result to the client side. Here we need to set (remember) the winning task's block reader as the DFSInputStream's block reader variable and keep it open, but we also need to close the other block reader to avoid a leak. Another thing to note: if we remember the faster block reader and it's a remote block reader, then the following read() calls will bypass the local read in their r1 operations... Any thoughts? [~cmccabe], [~saint@gmail.com] ... Support non-positional hedged reads in HDFS --- Key: HDFS-6450 URL: https://issues.apache.org/jira/browse/HDFS-6450 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Liang Xie Attachments: HDFS-6450-like-pread.txt HDFS-5776 added support for hedged positional reads. We should also support hedged non-position reads (aka regular reads). -- This message was sent by Atlassian JIRA (v6.2#6252)
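The shape described in steps 1)-3) above, sketched with an ExecutorCompletionService (illustrative pseudo-structure, not the attached patch; block reader construction, cancellation of the loser and cleanup are elided to comments):
{code}
import java.nio.ByteBuffer;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Toy shape of a hedged sequential read. */
class HedgedRead {
  ByteBuffer hedgedRead(ExecutorService pool,
                        Callable<ByteBuffer> currentReader,   // r1: reuse the old block reader
                        Callable<ByteBuffer> freshReader,     // r2: a newly created block reader
                        long hedgeTimeoutMs) throws Exception {
    ExecutorCompletionService<ByteBuffer> svc = new ExecutorCompletionService<>(pool);
    svc.submit(currentReader);
    // Wait up to the hedged-read threshold for the first attempt.
    Future<ByteBuffer> first = svc.poll(hedgeTimeoutMs, TimeUnit.MILLISECONDS);
    if (first != null) {
      return first.get();
    }
    // Threshold elapsed: spawn the second read and take whichever finishes first.
    svc.submit(freshReader);
    Future<ByteBuffer> winner = svc.take();
    ByteBuffer result = winner.get();
    // The losing attempt must still be cancelled and its block reader closed to
    // avoid a leak; the winner's reader is the one the stream keeps for later read()s.
    return result;
  }
}
{code}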
[jira] [Updated] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6450: Attachment: (was: HDFS-6450-like-pread.txt) Support non-positional hedged reads in HDFS --- Key: HDFS-6450 URL: https://issues.apache.org/jira/browse/HDFS-6450 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Liang Xie Attachments: HDFS-6450-like-pread.txt HDFS-5776 added support for hedged positional reads. We should also support hedged non-position reads (aka regular reads). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6450: Attachment: HDFS-6450-like-pread.txt Support non-positional hedged reads in HDFS --- Key: HDFS-6450 URL: https://issues.apache.org/jira/browse/HDFS-6450 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Liang Xie Attachments: HDFS-6450-like-pread.txt HDFS-5776 added support for hedged positional reads. We should also support hedged non-position reads (aka regular reads). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6656) flakey TestProcessCorruptBlocks#testByAddingAnExtraDataNode
Liang Xie created HDFS-6656: --- Summary: flakey TestProcessCorruptBlocks#testByAddingAnExtraDataNode Key: HDFS-6656 URL: https://issues.apache.org/jira/browse/HDFS-6656 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0 Reporter: Liang Xie failed in https://builds.apache.org/job/PreCommit-HDFS-Build/7320//testReport/org.apache.hadoop.hdfs.server.namenode/TestProcessCorruptBlocks/testByAddingAnExtraDataNode/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6656) flakey TestProcessCorruptBlocks#testByAddingAnExtraDataNode
[ https://issues.apache.org/jira/browse/HDFS-6656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6656: Attachment: PreCommit-HDFS-Build #7320 test - testByAddingAnExtraDataNode [Jenkins].html flakey TestProcessCorruptBlocks#testByAddingAnExtraDataNode --- Key: HDFS-6656 URL: https://issues.apache.org/jira/browse/HDFS-6656 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0 Reporter: Liang Xie Attachments: PreCommit-HDFS-Build #7320 test - testByAddingAnExtraDataNode [Jenkins].html failed in https://builds.apache.org/job/PreCommit-HDFS-Build/7320//testReport/org.apache.hadoop.hdfs.server.namenode/TestProcessCorruptBlocks/testByAddingAnExtraDataNode/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6631) TestPread#testHedgedReadLoopTooManyTimes fails intermittently.
[ https://issues.apache.org/jira/browse/HDFS-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056006#comment-14056006 ] Liang Xie commented on HDFS-6631: - I see. Compared with my dev box logfile, I found that the attached org.apache.hadoop.hdfs.TestPread-output.txt file did not trigger a real hedged read; I could only find a log line like "Waited 50ms to read from 127.0.0.1:x; spawning hedged read" in my own logfile. In your file, the execution sequence is: -read from 127.0.0.1:53908 (here the counter is 1) -throw Checksum Exception -read from 127.0.0.1:53919 (here the counter is 2) -return result, line 1127. That means both read paths went into the if (futures.isEmpty()) { flow (L1112). So the root question is: if we set hedged.read.threshold = 50ms, and Mockito.doAnswer has a Thread.sleep(50+1), what does this statement do: {code} Future<ByteBuffer> future = hedgedService.poll( dfsClient.getHedgedReadTimeout(), TimeUnit.MILLISECONDS); {code} On my dev box, it did just what the Javadoc says: {code} Retrieves and removes the Future representing the next completed task, waiting if necessary up to the specified wait time if none are yet present. Parameters: timeout how long to wait before giving up, in units of unit unit a TimeUnit determining how to interpret the timeout parameter Returns: the Future representing the next completed task or null if the specified waiting time elapses before one is present Throws: InterruptedException - if interrupted while waiting {code} so the future will be null. But on Chris's box, the exception from the thread pool jumps out first, so it goes to L1140 directly: catch (ExecutionException e). So per my current understanding, it should be related to OS thread scheduling (granularity); we probably need to enlarge the Mockito sleep interval. TestPread#testHedgedReadLoopTooManyTimes fails intermittently. -- Key: HDFS-6631 URL: https://issues.apache.org/jira/browse/HDFS-6631 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client, test Affects Versions: 3.0.0, 2.5.0 Reporter: Chris Nauroth Attachments: org.apache.hadoop.hdfs.TestPread-output.txt {{TestPread#testHedgedReadLoopTooManyTimes}} fails intermittently. It looks like a race condition on counting the expected number of loop iterations. I can repro the test failure more consistently on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
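A small self-contained demo of the race described above (the 50ms threshold and 51ms sleep mirror the test's values; this is not the test code itself):
{code}
import java.io.IOException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class PollRaceDemo {
  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    ExecutorCompletionService<byte[]> svc = new ExecutorCompletionService<>(pool);
    // The mocked first read sleeps just past the hedge threshold and then fails.
    svc.submit(() -> { Thread.sleep(51); throw new IOException("checksum error"); });

    Future<byte[]> f = svc.poll(50, TimeUnit.MILLISECONDS);
    if (f == null) {
      // Usual case: the threshold elapses first, so the caller spawns a hedged
      // read and the "spawning hedged read" counter goes up.
      System.out.println("would spawn hedged read");
    } else {
      // Coarse timer/scheduling case: the failed future is already available,
      // f.get() would throw ExecutionException and the hedged path is skipped,
      // which is what makes the loop counter in the test come out wrong.
      System.out.println("future completed (exceptionally) within the threshold");
    }
    pool.shutdownNow();
  }
}
{code}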
[jira] [Updated] (HDFS-6631) TestPread#testHedgedReadLoopTooManyTimes fails intermittently.
[ https://issues.apache.org/jira/browse/HDFS-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6631: Attachment: HDFS-6631.txt TestPread#testHedgedReadLoopTooManyTimes fails intermittently. -- Key: HDFS-6631 URL: https://issues.apache.org/jira/browse/HDFS-6631 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client, test Affects Versions: 3.0.0, 2.5.0 Reporter: Chris Nauroth Attachments: HDFS-6631.txt, org.apache.hadoop.hdfs.TestPread-output.txt {{TestPread#testHedgedReadLoopTooManyTimes}} fails intermittently. It looks like a race condition on counting the expected number of loop iterations. I can repro the test failure more consistently on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6631) TestPread#testHedgedReadLoopTooManyTimes fails intermittently.
[ https://issues.apache.org/jira/browse/HDFS-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6631: Status: Patch Available (was: Open) TestPread#testHedgedReadLoopTooManyTimes fails intermittently. -- Key: HDFS-6631 URL: https://issues.apache.org/jira/browse/HDFS-6631 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client, test Affects Versions: 3.0.0, 2.5.0 Reporter: Chris Nauroth Attachments: HDFS-6631.txt, org.apache.hadoop.hdfs.TestPread-output.txt {{TestPread#testHedgedReadLoopTooManyTimes}} fails intermittently. It looks like a race condition on counting the expected number of loop iterations. I can repro the test failure more consistently on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6631) TestPread#testHedgedReadLoopTooManyTimes fails intermittently.
[ https://issues.apache.org/jira/browse/HDFS-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056019#comment-14056019 ] Liang Xie commented on HDFS-6631: - I just uploaded a tentative patch, [~cnauroth]; could you try it in the environment where you can reproduce the failure easily? Thank you very much! TestPread#testHedgedReadLoopTooManyTimes fails intermittently. -- Key: HDFS-6631 URL: https://issues.apache.org/jira/browse/HDFS-6631 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client, test Affects Versions: 3.0.0, 2.5.0 Reporter: Chris Nauroth Attachments: HDFS-6631.txt, org.apache.hadoop.hdfs.TestPread-output.txt {{TestPread#testHedgedReadLoopTooManyTimes}} fails intermittently. It looks like a race condition on counting the expected number of loop iterations. I can repro the test failure more consistently on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HDFS-6631) TestPread#testHedgedReadLoopTooManyTimes fails intermittently.
[ https://issues.apache.org/jira/browse/HDFS-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie reassigned HDFS-6631: --- Assignee: Liang Xie TestPread#testHedgedReadLoopTooManyTimes fails intermittently. -- Key: HDFS-6631 URL: https://issues.apache.org/jira/browse/HDFS-6631 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client, test Affects Versions: 3.0.0, 2.5.0 Reporter: Chris Nauroth Assignee: Liang Xie Attachments: HDFS-6631.txt, org.apache.hadoop.hdfs.TestPread-output.txt {{TestPread#testHedgedReadLoopTooManyTimes}} fails intermittently. It looks like a race condition on counting the expected number of loop iterations. I can repro the test failure more consistently on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6631) TestPread#testHedgedReadLoopTooManyTimes fails intermittently.
[ https://issues.apache.org/jira/browse/HDFS-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6631: Attachment: (was: HDFS-6631.txt) TestPread#testHedgedReadLoopTooManyTimes fails intermittently. -- Key: HDFS-6631 URL: https://issues.apache.org/jira/browse/HDFS-6631 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client, test Affects Versions: 3.0.0, 2.5.0 Reporter: Chris Nauroth Assignee: Liang Xie Attachments: org.apache.hadoop.hdfs.TestPread-output.txt {{TestPread#testHedgedReadLoopTooManyTimes}} fails intermittently. It looks like a race condition on counting the expected number of loop iterations. I can repro the test failure more consistently on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6631) TestPread#testHedgedReadLoopTooManyTimes fails intermittently.
[ https://issues.apache.org/jira/browse/HDFS-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6631: Attachment: HDFS-6631.txt Hmm, why does QA complain that it couldn't compile? I verified again that it works. Let me remove the old one and re-upload. TestPread#testHedgedReadLoopTooManyTimes fails intermittently. -- Key: HDFS-6631 URL: https://issues.apache.org/jira/browse/HDFS-6631 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client, test Affects Versions: 3.0.0, 2.5.0 Reporter: Chris Nauroth Assignee: Liang Xie Attachments: HDFS-6631.txt, org.apache.hadoop.hdfs.TestPread-output.txt {{TestPread#testHedgedReadLoopTooManyTimes}} fails intermittently. It looks like a race condition on counting the expected number of loop iterations. I can repro the test failure more consistently on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6631) TestPread#testHedgedReadLoopTooManyTimes fails intermittently.
[ https://issues.apache.org/jira/browse/HDFS-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14054655#comment-14054655 ] Liang Xie commented on HDFS-6631: - I could not repro it on my dev box so far. It is probably related to this code in the test case: Thread.sleep(hedgedReadTimeoutMillis + 1). If we are almost at the hedged read threshold (50ms) and a slowdown occurs, e.g. a GC pause or scheduler-related activity, then when the test continues running after 50+1 ms and the CPU is scheduled again, the counter can be 2. It's just a guess; a complete test log file would help us diagnose it. Chris, do you still have a copy of that log output file? PS: if we enable log level ALL in this test, I think it will be much easier to understand what is going on :) TestPread#testHedgedReadLoopTooManyTimes fails intermittently. -- Key: HDFS-6631 URL: https://issues.apache.org/jira/browse/HDFS-6631 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client, test Affects Versions: 3.0.0, 2.5.0 Reporter: Chris Nauroth {{TestPread#testHedgedReadLoopTooManyTimes}} fails intermittently. It looks like a race condition on counting the expected number of loop iterations. I can repro the test failure more consistently on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
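Regarding the log-level suggestion above, a minimal sketch of what raising a client logger to ALL inside a test could look like (assuming the log4j 1.x API that Hadoop 2.x tests use; the logger name is illustrative, not taken from the patch):
{code}
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class VerboseClientLogging {
  public static void main(String[] args) {
    // Raise the DFSClient logger to ALL so hedged-read decisions such as
    // "Waited ... spawning hedged read" appear in the test output.
    Logger.getLogger("org.apache.hadoop.hdfs.DFSClient").setLevel(Level.ALL);
  }
}
{code}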
[jira] [Created] (HDFS-6638) shorten test run time with a smaller retry timeout setting
Liang Xie created HDFS-6638: --- Summary: shorten test run time with a smaller retry timeout setting Key: HDFS-6638 URL: https://issues.apache.org/jira/browse/HDFS-6638 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Similar to HDFS-6614, I think this is a general test-duration optimization tip, so I grepped for "IOException, will wait for" in the output of a full test run under the hdfs project, found several test cases that could be optimized, and made a simple patch. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6638) shorten test run time with a smaller retry timeout setting
[ https://issues.apache.org/jira/browse/HDFS-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6638: Attachment: HDFS-6638.txt shorten test run time with a smaller retry timeout setting -- Key: HDFS-6638 URL: https://issues.apache.org/jira/browse/HDFS-6638 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6638.txt Similar to HDFS-6614, I think this is a general test-duration optimization tip, so I grepped for "IOException, will wait for" in the output of a full test run under the hdfs project, found several test cases that could be optimized, and made a simple patch. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6638) shorten test run time with a smaller retry timeout setting
[ https://issues.apache.org/jira/browse/HDFS-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6638: Status: Patch Available (was: Open) shorten test run time with a smaller retry timeout setting -- Key: HDFS-6638 URL: https://issues.apache.org/jira/browse/HDFS-6638 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6638.txt Similar to HDFS-6614, I think this is a general test-duration optimization tip, so I grepped for "IOException, will wait for" in the output of a full test run under the hdfs project, found several test cases that could be optimized, and made a simple patch. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6638) shorten test run time with a smaller retry timeout setting
[ https://issues.apache.org/jira/browse/HDFS-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14054813#comment-14054813 ] Liang Xie commented on HDFS-6638: - With the attached patch we can save more than 2.5 minutes. I set the retry value to 10ms instead of 0, considering that I am not familiar with the modified tests and a value of 0 could result in a busy retry loop :) PS: TestEncryptedTransfer#testLongLivedReadClientAfterRestart seems to fail after tuning the setting to 10ms, so I omitted it from the current patch. shorten test run time with a smaller retry timeout setting -- Key: HDFS-6638 URL: https://issues.apache.org/jira/browse/HDFS-6638 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6638.txt Similar to HDFS-6614, I think this is a general test-duration optimization tip, so I grepped for "IOException, will wait for" in the output of a full test run under the hdfs project, found several test cases that could be optimized, and made a simple patch. -- This message was sent by Atlassian JIRA (v6.2#6252)
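As an illustration of the knob being tuned (this is my own sketch, not the attached patch, and it assumes the standard HDFS client key dfs.client.retry.window.base exposed through DFSConfigKeys), shrinking the retry window in a test configuration might look like this:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class SmallRetryWindowForTests {
  public static Configuration newTestConf() {
    Configuration conf = new HdfsConfiguration();
    // 10 ms instead of 0, as the comment explains, to keep failed-read retries
    // cheap while still avoiding a tight retry loop in tests that are not fully understood.
    conf.setInt(DFSConfigKeys.DFS_CLIENT_RETRY_WINDOW_BASE, 10);
    return conf;
  }
}
{code}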
[jira] [Commented] (HDFS-6614) shorten TestPread run time with a smaller retry timeout setting
[ https://issues.apache.org/jira/browse/HDFS-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14054814#comment-14054814 ] Liang Xie commented on HDFS-6614: - Filed HDFS-6638, since this tuning tips could optimize other cases also. [~cnauroth], mind taking a look at it if possible? thank you! shorten TestPread run time with a smaller retry timeout setting --- Key: HDFS-6614 URL: https://issues.apache.org/jira/browse/HDFS-6614 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 3.0.0, 2.5.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6614-addmium.txt, HDFS-6614.txt Just notice logs like this from TestPread: DFS chooseDataNode: got # 3 IOException, will wait for 9909.622860072854 msec so i tried to set a smaller retry window value. Before patch: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 154.812 sec - in org.apache.hadoop.hdfs.TestPread After the change: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 131.724 sec - in org.apache.hadoop.hdfs.TestPread -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6450: Attachment: HDFS-6450-like-pread.txt Attached is a preliminary patch that does something similar to pread(), i.e. it always creates a new block reader. It should be the simplest approach, but it is not very friendly for read(), since no block reader can be reused... Support non-positional hedged reads in HDFS --- Key: HDFS-6450 URL: https://issues.apache.org/jira/browse/HDFS-6450 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Liang Xie Attachments: HDFS-6450-like-pread.txt HDFS-5776 added support for hedged positional reads. We should also support hedged non-position reads (aka regular reads). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6450: Status: Patch Available (was: Open) Support non-positional hedged reads in HDFS --- Key: HDFS-6450 URL: https://issues.apache.org/jira/browse/HDFS-6450 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Liang Xie Attachments: HDFS-6450-like-pread.txt HDFS-5776 added support for hedged positional reads. We should also support hedged non-position reads (aka regular reads). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6614) shorten TestPread run time with a smaller retry timeout setting
[ https://issues.apache.org/jira/browse/HDFS-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6614: Attachment: (was: HDFS-6614-addmium.txt) shorten TestPread run time with a smaller retry timeout setting --- Key: HDFS-6614 URL: https://issues.apache.org/jira/browse/HDFS-6614 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 3.0.0, 2.5.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6614-addmium.txt, HDFS-6614.txt Just notice logs like this from TestPread: DFS chooseDataNode: got # 3 IOException, will wait for 9909.622860072854 msec so i tried to set a smaller retry window value. Before patch: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 154.812 sec - in org.apache.hadoop.hdfs.TestPread After the change: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 131.724 sec - in org.apache.hadoop.hdfs.TestPread -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6614) shorten TestPread run time with a smaller retry timeout setting
[ https://issues.apache.org/jira/browse/HDFS-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6614: Attachment: HDFS-6614-addmium.txt shorten TestPread run time with a smaller retry timeout setting --- Key: HDFS-6614 URL: https://issues.apache.org/jira/browse/HDFS-6614 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 3.0.0, 2.5.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6614-addmium.txt, HDFS-6614.txt Just notice logs like this from TestPread: DFS chooseDataNode: got # 3 IOException, will wait for 9909.622860072854 msec so i tried to set a smaller retry window value. Before patch: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 154.812 sec - in org.apache.hadoop.hdfs.TestPread After the change: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 131.724 sec - in org.apache.hadoop.hdfs.TestPread -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6614) shorten TestPread run time with a smaller retry timeout setting
[ https://issues.apache.org/jira/browse/HDFS-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14054487#comment-14054487 ] Liang Xie commented on HDFS-6614: - [~cnauroth] you are right; I remade an addendum patch and removed the unnecessary DFSClient stuff. The Mockito reordering just avoids the risk of a failure while closing the stream; it is not really related to the current JIRA though... :) shorten TestPread run time with a smaller retry timeout setting --- Key: HDFS-6614 URL: https://issues.apache.org/jira/browse/HDFS-6614 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 3.0.0, 2.5.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6614-addmium.txt, HDFS-6614.txt Just notice logs like this from TestPread: DFS chooseDataNode: got # 3 IOException, will wait for 9909.622860072854 msec so i tried to set a smaller retry window value. Before patch: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 154.812 sec - in org.apache.hadoop.hdfs.TestPread After the change: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 131.724 sec - in org.apache.hadoop.hdfs.TestPread -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053324#comment-14053324 ] Liang Xie commented on HDFS-6450: - [~cmccabe] [~saint@gmail.com] I have made a prototype already (still adding more test cases), but I think we need to settle one thing first: how to handle the block reader for read(). In the current implementation, read() maintains a blockReader that is reused by future read calls, while pread() always creates a new block reader on each call. So if we want to give read() hedged-read ability, should we assign the first completed read request's block reader to the maintained blockReader variable? (I guess in most situations this would effectively swap the local block reader for a remote one.) Support non-positional hedged reads in HDFS --- Key: HDFS-6450 URL: https://issues.apache.org/jira/browse/HDFS-6450 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Liang Xie HDFS-5776 added support for hedged positional reads. We should also support hedged non-position reads (aka regular reads). -- This message was sent by Atlassian JIRA (v6.2#6252)
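To make the trade-off in the question above easier to see, here is a small self-contained toy model (plain Java; the class and field names are invented for illustration and are not the DFSInputStream internals) of a stream that keeps one reusable reader and what adopting the hedged winner would mean:
{code}
public class HedgedReaderAdoption {

  // Toy stand-in for a block reader: records which datanode it talks to
  // and whether it is a local (e.g. short-circuit) reader.
  static final class ToyReader {
    final String datanode;
    final boolean local;
    ToyReader(String datanode, boolean local) { this.datanode = datanode; this.local = local; }
    @Override public String toString() { return datanode + (local ? " (local)" : " (remote)"); }
  }

  // Toy stand-in for the stream state that read() reuses across calls.
  static final class ToyStream {
    private ToyReader blockReader = new ToyReader("127.0.0.1:50010", true);

    // Option discussed in the comment: adopt the reader of whichever hedged
    // request finished first, so later read() calls stay on the faster node.
    void adoptWinner(ToyReader winner) {
      // Note the cost: this may silently replace a local reader with a remote one.
      blockReader = winner;
    }

    ToyReader current() { return blockReader; }
  }

  public static void main(String[] args) {
    ToyStream in = new ToyStream();
    System.out.println("before: " + in.current());           // 127.0.0.1:50010 (local)
    in.adoptWinner(new ToyReader("10.0.0.2:50010", false));   // a hedged request won remotely
    System.out.println("after:  " + in.current());           // 10.0.0.2:50010 (remote)
  }
}
{code}
The design cost is visible in the last line: a later sequential read() would now go to the remote node even if the original local reader is still healthy, which is the concern raised in the comment.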
[jira] [Updated] (HDFS-6614) shorten TestPread run time with a smaller retry timeout setting
[ https://issues.apache.org/jira/browse/HDFS-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6614: Attachment: HDFS-6614-addmium.txt shorten TestPread run time with a smaller retry timeout setting --- Key: HDFS-6614 URL: https://issues.apache.org/jira/browse/HDFS-6614 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 3.0.0, 2.5.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6614-addmium.txt, HDFS-6614.txt Just notice logs like this from TestPread: DFS chooseDataNode: got # 3 IOException, will wait for 9909.622860072854 msec so i tried to set a smaller retry window value. Before patch: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 154.812 sec - in org.apache.hadoop.hdfs.TestPread After the change: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 131.724 sec - in org.apache.hadoop.hdfs.TestPread -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6614) shorten TestPread run time with a smaller retry timeout setting
[ https://issues.apache.org/jira/browse/HDFS-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052313#comment-14052313 ] Liang Xie commented on HDFS-6614: - [~cnauroth], I found there are still several non-zero waits inside chooseDataNode in the TestPread log: grep "will wait for" hadoop-hdfs-project/hadoop-hdfs/target/surefire-reports/org.apache.hadoop.hdfs.TestPread-output.txt | grep -v "will wait for 0.0 msec" 2014-07-04 17:27:49,259 WARN hdfs.DFSClient (DFSInputStream.java:chooseDataNode(905)) - DFS chooseDataNode: got # 1 IOException, will wait for 429.349844455576 msec. 2014-07-04 17:27:49,692 WARN hdfs.DFSClient (DFSInputStream.java:chooseDataNode(905)) - DFS chooseDataNode: got # 2 IOException, will wait for 8001.653000168143 msec. 2014-07-04 17:27:57,697 WARN hdfs.DFSClient (DFSInputStream.java:chooseDataNode(905)) - DFS chooseDataNode: got # 3 IOException, will wait for 6062.193740931271 msec. I uploaded a minor patch to remove those unnecessary waits. After applying HDFS-6614-addmium.txt, the result is: grep "will wait for" hadoop-hdfs-project/hadoop-hdfs/target/surefire-reports/org.apache.hadoop.hdfs.TestPread-output.txt | wc -l 11 grep "will wait for" hadoop-hdfs-project/hadoop-hdfs/target/surefire-reports/org.apache.hadoop.hdfs.TestPread-output.txt | grep -v "will wait for 0.0 msec" | wc -l 0 shorten TestPread run time with a smaller retry timeout setting --- Key: HDFS-6614 URL: https://issues.apache.org/jira/browse/HDFS-6614 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 3.0.0, 2.5.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6614-addmium.txt, HDFS-6614.txt Just notice logs like this from TestPread: DFS chooseDataNode: got # 3 IOException, will wait for 9909.622860072854 msec so i tried to set a smaller retry window value. Before patch: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 154.812 sec - in org.apache.hadoop.hdfs.TestPread After the change: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 131.724 sec - in org.apache.hadoop.hdfs.TestPread -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6624) TestBlockTokenWithDFS#testEnd2End fails sometimes
[ https://issues.apache.org/jira/browse/HDFS-6624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051183#comment-14051183 ] Liang Xie commented on HDFS-6624: - It should be related to the chooseExcessReplicates behavior. From the log: {code} 2014-07-02 21:23:30,083 INFO balancer.Balancer (Balancer.java:logNodes(969)) - 0 over-utilized: [] 2014-07-02 21:23:30,083 INFO balancer.Balancer (Balancer.java:logNodes(969)) - 1 above-average: [Source[127.0.0.1:55922, utilization=28.0]] 2014-07-02 21:23:30,083 INFO balancer.Balancer (Balancer.java:logNodes(969)) - 1 below-average: [BalancerDatanode[127.0.0.1:57889, utilization=16.0]] 2014-07-02 21:23:30,083 INFO balancer.Balancer (Balancer.java:logNodes(969)) - 0 underutilized: [] The cluster is balanced. Exiting... {code} and {code} 2014-07-02 21:23:35,413 INFO hdfs.TestBalancer (TestBalancer.java:runBalancer(381)) - Rebalancing with default ctor.--- will call waitForBalancer immediately then. {code} we can see that the cluster should already have been balanced at that point, because avgUtilization=0.2, so 0.28 - 0.2 < BALANCE_ALLOWED_VARIANCE (which is 0.11) and |0.16 - 0.2| < BALANCE_ALLOWED_VARIANCE. But once we enter waitForBalancer, even after retrying to fetch a new DN report many times, the newly added DN reports a small nodeUtilization of 0.08, and since |0.08 - 0.2| > 0.11, it keeps retrying until the timeout and the test fails... From the log we know that the node removed a couple of blocks after balancing: {code} 2014-07-02 21:23:30,136 INFO BlockStateChange (BlockManager.java:addToInvalidates(1074)) - BLOCK* addToInvalidates: blk_1073741840_1016 127.0.0.1:55922 127.0.0.1:57889 2014-07-02 21:23:30,136 INFO BlockStateChange (BlockManager.java:addToInvalidates(1074)) - BLOCK* addToInvalidates: blk_1073741841_1017 127.0.0.1:57889 2014-07-02 21:23:30,136 INFO BlockStateChange (BlockManager.java:addToInvalidates(1074)) - BLOCK* addToInvalidates: blk_1073741842_1018 127.0.0.1:57889 2014-07-02 21:23:34,305 INFO BlockStateChange (BlockManager.java:invalidateWorkForOneNode(3262)) - BLOCK* BlockManager: ask 127.0.0.1:57889 to delete [blk_1073741840_1016, blk_1073741841_1017, blk_1073741842_1018] {code} so the root cause is that after balancing, the added block operations trigger the excess-replica check, and the resulting block removals change the used-space statistics, which fails the test. Is there any quick test setting that could bypass that check? :) (A small sketch of the variance arithmetic follows below.) TestBlockTokenWithDFS#testEnd2End fails sometimes - Key: HDFS-6624 URL: https://issues.apache.org/jira/browse/HDFS-6624 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Andrew Wang Attachments: PreCommit-HDFS-Build #7274 test - testEnd2End [Jenkins].html On a recent test-patch.sh run, saw this error which did not repro locally: {noformat} Error Message Rebalancing expected avg utilization to become 0.2, but on datanode 127.0.0.1:57889 it remains at 0.08 after more than 4 msec. Stacktrace java.util.concurrent.TimeoutException: Rebalancing expected avg utilization to become 0.2, but on datanode 127.0.0.1:57889 it remains at 0.08 after more than 4 msec. 
at org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForBalancer(TestBalancer.java:284) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.runBalancer(TestBalancer.java:382) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:359) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.oneNodeTest(TestBalancer.java:403) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.integrationTest(TestBalancer.java:416) at org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS.testEnd2End(TestBlockTokenWithDFS.java:588) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
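To spell out the variance arithmetic from the comment above, here is a tiny self-contained illustration (plain Java; the 0.11 constant is the value quoted in the comment, not read from the Hadoop sources):
{code}
public class BalanceCheck {
  static final double BALANCE_ALLOWED_VARIANCE = 0.11; // value quoted in the comment

  // A node counts as balanced when its utilization is within the allowed variance of the average.
  static boolean balanced(double nodeUtil, double avgUtil) {
    return Math.abs(nodeUtil - avgUtil) < BALANCE_ALLOWED_VARIANCE;
  }

  public static void main(String[] args) {
    double avg = 0.2;
    System.out.println(balanced(0.28, avg)); // true  : |0.28 - 0.2| = 0.08 < 0.11
    System.out.println(balanced(0.16, avg)); // true  : |0.16 - 0.2| = 0.04 < 0.11
    System.out.println(balanced(0.08, avg)); // false : |0.08 - 0.2| = 0.12 > 0.11, so the test keeps retrying
  }
}
{code}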
[jira] [Updated] (HDFS-6617) Flake TestDFSZKFailoverController.testManualFailoverWithDFSHAAdmin due to a long edit log sync op
[ https://issues.apache.org/jira/browse/HDFS-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6617: Attachment: HDFS-6617-v2.txt Flake TestDFSZKFailoverController.testManualFailoverWithDFSHAAdmin due to a long edit log sync op - Key: HDFS-6617 URL: https://issues.apache.org/jira/browse/HDFS-6617 Project: Hadoop HDFS Issue Type: Test Components: auto-failover, test Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Attachments: HDFS-6617-v2.txt, HDFS-6617.txt Just Hit a false alarm testing while working at HDFS-6614, see https://builds.apache.org/job/PreCommit-HDFS-Build/7259//testReport/org.apache.hadoop.hdfs.server.namenode.ha/TestDFSZKFailoverController/testManualFailoverWithDFSHAAdmin/ After a looking at the log, shows the failure came from a timeout at ZKFailoverController.doCedeActive(): localTarget.getProxy(conf, timeout).transitionToStandby(createReqInfo()); While stopping active service, see FSNamesystem.stopActiveServices(): void stopActiveServices() { LOG.info(Stopping services started for active state); this corelates with the log: 2014-07-01 08:12:50,615 INFO namenode.FSNamesystem (FSNamesystem.java:stopActiveServices(1167)) - Stopping services started for active state then stopActiveServices will call editLog.close(), which goes to endCurrentLogSegment(), see log: 2014-07-01 08:12:50,616 INFO namenode.FSEditLog (FSEditLog.java:endCurrentLogSegment(1216)) - Ending log segment 1 but this operation did not finish in 5 seconds, then triggered the timeout: 2014-07-01 08:12:55,624 WARN ha.ZKFailoverController (ZKFailoverController.java:doCedeActive(577)) - Unable to transition local node to standby: Call From asf001.sp2.ygridcore.net/67.195.138.31 to localhost:10021 failed on socket timeout exception: java.net.SocketTimeoutException: 5000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/127.0.0.1:53965 remote=localhost/127.0.0.1:10021]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout the logEdit/logSync finally done followed with printStatistics(true): 2014-07-01 08:13:05,243 INFO namenode.FSEditLog (FSEditLog.java:printStatistics(675)) - Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 14667 74 105 so obviously, this long sync contributed the timeout, maybe the QA box is very slow at that moment, so one possible fix here is setting the default fence timeout to a bigger one. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6617) Flake TestDFSZKFailoverController.testManualFailoverWithDFSHAAdmin due to a long edit log sync op
[ https://issues.apache.org/jira/browse/HDFS-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051286#comment-14051286 ] Liang Xie commented on HDFS-6617: - [~cnauroth], the above suggestion is definitely better; I made a patch v2. I confirmed the setting took effect by grepping "FSEditLog.java:printStatistics" in the log files with and without the patch, so I am pretty sure it will fix the test failure caused by the slow edit log sync operation :) Please help to review, thank you! Flake TestDFSZKFailoverController.testManualFailoverWithDFSHAAdmin due to a long edit log sync op - Key: HDFS-6617 URL: https://issues.apache.org/jira/browse/HDFS-6617 Project: Hadoop HDFS Issue Type: Test Components: auto-failover, test Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Attachments: HDFS-6617-v2.txt, HDFS-6617.txt Just Hit a false alarm testing while working at HDFS-6614, see https://builds.apache.org/job/PreCommit-HDFS-Build/7259//testReport/org.apache.hadoop.hdfs.server.namenode.ha/TestDFSZKFailoverController/testManualFailoverWithDFSHAAdmin/ After a looking at the log, shows the failure came from a timeout at ZKFailoverController.doCedeActive(): localTarget.getProxy(conf, timeout).transitionToStandby(createReqInfo()); While stopping active service, see FSNamesystem.stopActiveServices(): void stopActiveServices() { LOG.info(Stopping services started for active state); this corelates with the log: 2014-07-01 08:12:50,615 INFO namenode.FSNamesystem (FSNamesystem.java:stopActiveServices(1167)) - Stopping services started for active state then stopActiveServices will call editLog.close(), which goes to endCurrentLogSegment(), see log: 2014-07-01 08:12:50,616 INFO namenode.FSEditLog (FSEditLog.java:endCurrentLogSegment(1216)) - Ending log segment 1 but this operation did not finish in 5 seconds, then triggered the timeout: 2014-07-01 08:12:55,624 WARN ha.ZKFailoverController (ZKFailoverController.java:doCedeActive(577)) - Unable to transition local node to standby: Call From asf001.sp2.ygridcore.net/67.195.138.31 to localhost:10021 failed on socket timeout exception: java.net.SocketTimeoutException: 5000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/127.0.0.1:53965 remote=localhost/127.0.0.1:10021]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout the logEdit/logSync finally done followed with printStatistics(true): 2014-07-01 08:13:05,243 INFO namenode.FSEditLog (FSEditLog.java:printStatistics(675)) - Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 14667 74 105 so obviously, this long sync contributed the timeout, maybe the QA box is very slow at that moment, so one possible fix here is setting the default fence timeout to a bigger one. -- This message was sent by Atlassian JIRA (v6.2#6252)
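The thread does not show exactly which setting patch v2 changes, so the following is only a hedged sketch: one plausible knob is the failover controller's graceful-fence RPC timeout, whose 5000 ms default matches the SocketTimeoutException in the log; raising it in a test configuration might look like this (the key name is assumed from Hadoop common's HA settings, not taken from the patch):
{code}
import org.apache.hadoop.conf.Configuration;

public class RelaxGracefulFenceTimeout {
  public static Configuration newTestConf() {
    Configuration conf = new Configuration();
    // Assumed key: the ZKFC graceful-fence RPC timeout (default 5000 ms). With a
    // larger value, a slow edit-log sync during transitionToStandby no longer
    // trips the timeout seen in the test log.
    conf.setInt("ha.failover-controller.graceful-fence.rpc-timeout.ms", 30000);
    return conf;
  }
}
{code}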
[jira] [Commented] (HDFS-6614) shorten TestPread run time with a smaller retry timeout setting
[ https://issues.apache.org/jira/browse/HDFS-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051293#comment-14051293 ] Liang Xie commented on HDFS-6614: - bq. I did discover that the new testHedgedReadLoopTooManyTimes test is failing on my Windows VM on the assertion for the loop count Did your Windows VM run on an SSD or similarly fast storage? I think there is a possibility that if the second read plus the second block reader creation took less than 1ms, then our asserted loop count could be 2, not 3. Anyway, I assume replacing assertEquals(3, input.getHedgedReadOpsLoopNumForTesting()); with assertTrue(input.getHedgedReadOpsLoopNumForTesting() <= 3 && input.getHedgedReadOpsLoopNumForTesting() >= 2); would be better. shorten TestPread run time with a smaller retry timeout setting --- Key: HDFS-6614 URL: https://issues.apache.org/jira/browse/HDFS-6614 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 3.0.0, 2.5.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6614.txt Just notice logs like this from TestPread: DFS chooseDataNode: got # 3 IOException, will wait for 9909.622860072854 msec so i tried to set a smaller retry window value. Before patch: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 154.812 sec - in org.apache.hadoop.hdfs.TestPread After the change: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 131.724 sec - in org.apache.hadoop.hdfs.TestPread -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6627) getReplicaVisibleLength() should do checkReadAccess insteading of checkWriteAccess
Liang Xie created HDFS-6627: --- Summary: getReplicaVisibleLength() should do checkReadAccess insteading of checkWriteAccess Key: HDFS-6627 URL: https://issues.apache.org/jira/browse/HDFS-6627 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6627.txt Just read the getReplicaVisibleLength() code and noticed that DataNode.checkWriteAccess is only invoked by DataNode.getReplicaVisibleLength(); let's rename it to checkReadAccess to avoid confusion, since the real implementation checks AccessMode.READ: {code} blockPoolTokenSecretManager.checkAccess(id, null, block, BlockTokenSecretManager.AccessMode.READ); {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6627) getReplicaVisibleLength() should do checkReadAccess insteading of checkWriteAccess
[ https://issues.apache.org/jira/browse/HDFS-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6627: Status: Patch Available (was: Open) getReplicaVisibleLength() should do checkReadAccess insteading of checkWriteAccess -- Key: HDFS-6627 URL: https://issues.apache.org/jira/browse/HDFS-6627 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6627.txt Just read getReplicaVisibleLength() code and found it, DataNode.checkWriteAccess is only invoked by DataNode.getReplicaVisibleLength(), let's rename it to checkReadAccess to avoid confusing, since the real impl here is check AccessMode.READ: {code} blockPoolTokenSecretManager.checkAccess(id, null, block, BlockTokenSecretManager.AccessMode.READ); {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6627) getReplicaVisibleLength() should do checkReadAccess insteading of checkWriteAccess
[ https://issues.apache.org/jira/browse/HDFS-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6627: Attachment: HDFS-6627.txt getReplicaVisibleLength() should do checkReadAccess insteading of checkWriteAccess -- Key: HDFS-6627 URL: https://issues.apache.org/jira/browse/HDFS-6627 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6627.txt Just read getReplicaVisibleLength() code and found it, DataNode.checkWriteAccess is only invoked by DataNode.getReplicaVisibleLength(), let's rename it to checkReadAccess to avoid confusing, since the real impl here is check AccessMode.READ: {code} blockPoolTokenSecretManager.checkAccess(id, null, block, BlockTokenSecretManager.AccessMode.READ); {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6614) shorten TestPread run time with a smaller retry timeout setting
Liang Xie created HDFS-6614: --- Summary: shorten TestPread run time with a smaller retry timeout setting Key: HDFS-6614 URL: https://issues.apache.org/jira/browse/HDFS-6614 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Attachments: HDFS-6614.txt Just notice logs like this from TestPread: DFS chooseDataNode: got # 3 IOException, will wait for 9909.622860072854 msec so i tried to set a smaller retry window value. Before patch: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 154.812 sec - in org.apache.hadoop.hdfs.TestPread After the change: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 131.724 sec - in org.apache.hadoop.hdfs.TestPread -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6614) shorten TestPread run time with a smaller retry timeout setting
[ https://issues.apache.org/jira/browse/HDFS-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6614: Attachment: HDFS-6614.txt shorten TestPread run time with a smaller retry timeout setting --- Key: HDFS-6614 URL: https://issues.apache.org/jira/browse/HDFS-6614 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Attachments: HDFS-6614.txt Just notice logs like this from TestPread: DFS chooseDataNode: got # 3 IOException, will wait for 9909.622860072854 msec so i tried to set a smaller retry window value. Before patch: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 154.812 sec - in org.apache.hadoop.hdfs.TestPread After the change: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 131.724 sec - in org.apache.hadoop.hdfs.TestPread -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6614) shorten TestPread run time with a smaller retry timeout setting
[ https://issues.apache.org/jira/browse/HDFS-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6614: Status: Patch Available (was: Open) shorten TestPread run time with a smaller retry timeout setting --- Key: HDFS-6614 URL: https://issues.apache.org/jira/browse/HDFS-6614 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Attachments: HDFS-6614.txt Just notice logs like this from TestPread: DFS chooseDataNode: got # 3 IOException, will wait for 9909.622860072854 msec so i tried to set a smaller retry window value. Before patch: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 154.812 sec - in org.apache.hadoop.hdfs.TestPread After the change: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 131.724 sec - in org.apache.hadoop.hdfs.TestPread -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6617) Flake TestDFSZKFailoverController.testManualFailoverWithDFSHAAdmin due to a long edit log sync op
Liang Xie created HDFS-6617: --- Summary: Flake TestDFSZKFailoverController.testManualFailoverWithDFSHAAdmin due to a long edit log sync op Key: HDFS-6617 URL: https://issues.apache.org/jira/browse/HDFS-6617 Project: Hadoop HDFS Issue Type: Test Components: auto-failover, test Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Hit a false-alarm test failure while working on HDFS-6614, see https://builds.apache.org/job/PreCommit-HDFS-Build/7259//testReport/org.apache.hadoop.hdfs.server.namenode.ha/TestDFSZKFailoverController/testManualFailoverWithDFSHAAdmin/ After digging into the log, the failure came from a timeout at ZKFailoverController.doCedeActive(): localTarget.getProxy(conf, timeout).transitionToStandby(createReqInfo()); While stopping the active service, see FSNamesystem.stopActiveServices(): void stopActiveServices() { LOG.info("Stopping services started for active state"); this correlates with the log: 2014-07-01 08:12:50,615 INFO namenode.FSNamesystem (FSNamesystem.java:stopActiveServices(1167)) - Stopping services started for active state then stopActiveServices calls editLog.close(), which goes to endCurrentLogSegment(), see the log: 2014-07-01 08:12:50,616 INFO namenode.FSEditLog (FSEditLog.java:endCurrentLogSegment(1216)) - Ending log segment 1 but this operation did not finish within 5 seconds, which triggered the timeout: 2014-07-01 08:12:55,624 WARN ha.ZKFailoverController (ZKFailoverController.java:doCedeActive(577)) - Unable to transition local node to standby: Call From asf001.sp2.ygridcore.net/67.195.138.31 to localhost:10021 failed on socket timeout exception: java.net.SocketTimeoutException: 5000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/127.0.0.1:53965 remote=localhost/127.0.0.1:10021]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout the logEdit/logSync finally finished, followed by printStatistics(true): 2014-07-01 08:13:05,243 INFO namenode.FSEditLog (FSEditLog.java:printStatistics(675)) - Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 14667 74 105 so obviously this long sync contributed to the timeout; maybe the QA box was slow at that moment, so one possible fix here is to set the default fence timeout to a bigger value. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6614) shorten TestPread run time with a smaller retry timeout setting
[ https://issues.apache.org/jira/browse/HDFS-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048742#comment-14048742 ] Liang Xie commented on HDFS-6614: - The failed case is not related with current change, filed HDFS-6617 to track it. shorten TestPread run time with a smaller retry timeout setting --- Key: HDFS-6614 URL: https://issues.apache.org/jira/browse/HDFS-6614 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Attachments: HDFS-6614.txt Just notice logs like this from TestPread: DFS chooseDataNode: got # 3 IOException, will wait for 9909.622860072854 msec so i tried to set a smaller retry window value. Before patch: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 154.812 sec - in org.apache.hadoop.hdfs.TestPread After the change: T E S T S --- Running org.apache.hadoop.hdfs.TestPread Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 131.724 sec - in org.apache.hadoop.hdfs.TestPread -- This message was sent by Atlassian JIRA (v6.2#6252)