[jira] [Updated] (IMPALA-7122) Data load failure: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try
[ https://issues.apache.org/jira/browse/IMPALA-7122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe McDonnell updated IMPALA-7122:
----------------------------------
    Fix Version/s: Not Applicable

> Data load failure: Failed to replace a bad datanode on the existing pipeline
> due to no more good datanodes being available to try
> ----------------------------------------------------------------------------
>
>                 Key: IMPALA-7122
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7122
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>    Affects Versions: Impala 3.1.0
>            Reporter: Tim Armstrong
>            Assignee: Joe McDonnell
>            Priority: Critical
>              Labels: flaky
>             Fix For: Not Applicable
>
>         Attachments: data-load-functional-exhaustive.log, hdfs-logs.tar.gz, impalad.ec2-m2-4xlarge-centos-6-4-0570.vpc.cloudera.com.jenkins.log.INFO.20180604-205755.5587, load-functional-query.log
>
>
> {noformat}
> 20:58:29 Started Loading functional-query data in background; pid 6813.
> 20:58:29 Loading functional-query data (logging to /data/jenkins/workspace/impala-asf-master-core-data-load/repos/Impala/logs/data_loading/load-functional-query.log)...
> 20:58:29 Started Loading TPC-H data in background; pid 6814.
> 20:58:29 Loading TPC-H data (logging to /data/jenkins/workspace/impala-asf-master-core-data-load/repos/Impala/logs/data_loading/load-tpch.log)...
> 20:58:29 Started Loading TPC-DS data in background; pid 6815.
> 20:58:29 Loading TPC-DS data (logging to /data/jenkins/workspace/impala-asf-master-core-data-load/repos/Impala/logs/data_loading/load-tpcds.log)...
> 21:35:26 FAILED (Took: 36 min 57 sec)
> 21:35:26 'load-data functional-query exhaustive' failed.
> Tail of log:
> 21:35:26     at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:213)
> 21:35:26     at org.apache.hadoop.hdfs.DataStreamer$ResponseProcessor.run(DataStreamer.java:1086)
> 21:35:26 18/06/04 21:20:29 WARN hdfs.DataStreamer: Error Recovery for BP-1407206351-127.0.0.1-1528170335185:blk_1073743620_2799 in pipeline [DatanodeInfoWithStorage[127.0.0.1:31000,DS-37cfc57c-ab39-443c-80c9-e440cb18b63d,DISK], DatanodeInfoWithStorage[127.0.0.1:31001,DS-2bc41558-4f2c-460f-ae87-5d1a6acbf42f,DISK], DatanodeInfoWithStorage[127.0.0.1:31002,DS-4ba4d3a0-af31-4eaf-b43d-89b408231481,DISK]]: datanode 0(DatanodeInfoWithStorage[127.0.0.1:31000,DS-37cfc57c-ab39-443c-80c9-e440cb18b63d,DISK]) is bad.
> 21:35:26 18/06/04 21:21:29 INFO hdfs.DataStreamer: Exception in createBlockOutputStream blk_1073743620_2799
> 21:35:26 java.io.IOException: Got error, status=ERROR, status message , ack with firstBadLink as 127.0.0.1:31002
> 21:35:26     at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:134)
> 21:35:26     at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:110)
> 21:35:26     at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1778)
> 21:35:26     at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1507)
> 21:35:26     at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1481)
> 21:35:26     at org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1256)
> 21:35:26     at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:667)
> 21:35:26 18/06/04 21:21:29 WARN hdfs.DataStreamer: Error Recovery for BP-1407206351-127.0.0.1-1528170335185:blk_1073743620_2799 in pipeline [DatanodeInfoWithStorage[127.0.0.1:31001,DS-2bc41558-4f2c-460f-ae87-5d1a6acbf42f,DISK], DatanodeInfoWithStorage[127.0.0.1:31002,DS-4ba4d3a0-af31-4eaf-b43d-89b408231481,DISK]]: datanode 1(DatanodeInfoWithStorage[127.0.0.1:31002,DS-4ba4d3a0-af31-4eaf-b43d-89b408231481,DISK]) is bad.
> 21:35:26 18/06/04 21:21:29 WARN hdfs.DataStreamer: DataStreamer Exception
> 21:35:26 java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[127.0.0.1:31001,DS-2bc41558-4f2c-460f-ae87-5d1a6acbf42f,DISK]], original=[DatanodeInfoWithStorage[127.0.0.1:31001,DS-2bc41558-4f2c-460f-ae87-5d1a6acbf42f,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
> 21:35:26     at org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1304)
> 21:35:26     at
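The IOException above points at the HDFS client's datanode-replacement policy: under the DEFAULT policy the write fails once no replacement datanode can be found, which is easy to hit on a loopback minicluster with only three datanodes. A minimal sketch of the client-side `hdfs-site.xml` override the exception message refers to (the property names are the real HDFS client settings; the choice of values is an assumption for a small test cluster, not the fix this JIRA actually adopted):

```xml
<!-- Sketch: client-side hdfs-site.xml overrides for a small minicluster.
     With only 3 datanodes there is often no spare node to substitute into
     a pipeline, so DEFAULT policy turns a single slow/bad node into a
     hard write failure like the one in the log above. -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <!-- DEFAULT | ALWAYS | NEVER; NEVER keeps writing on the surviving nodes -->
  <value>NEVER</value>
</property>
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
  <value>true</value>
</property>
```

Relaxing the policy trades durability (fewer live replicas for in-flight blocks) for availability, which is usually acceptable for throwaway test data loads but not for production clusters.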
[jira] [Updated] (IMPALA-7122) Data load failure: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try
[ https://issues.apache.org/jira/browse/IMPALA-7122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong updated IMPALA-7122:
----------------------------------
    Attachment: hdfs-logs.tar.gz
[jira] [Updated] (IMPALA-7122) Data load failure: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try
[ https://issues.apache.org/jira/browse/IMPALA-7122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong updated IMPALA-7122:
----------------------------------
    Attachment: load-functional-query.log
[jira] [Updated] (IMPALA-7122) Data load failure: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try
[ https://issues.apache.org/jira/browse/IMPALA-7122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong updated IMPALA-7122:
----------------------------------
    Attachment: impalad.ec2-m2-4xlarge-centos-6-4-0570.vpc.cloudera.com.jenkins.log.INFO.20180604-205755.5587