Sahil Takiar created IMPALA-10072:
-------------------------------------

             Summary: Data load failures in ubuntu-16.04-from-scratch
                 Key: IMPALA-10072
                 URL: https://issues.apache.org/jira/browse/IMPALA-10072
             Project: IMPALA
          Issue Type: Bug
            Reporter: Sahil Takiar

Seems like there are consistent data load failures on several unrelated patches:

[https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11627/]
[https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11629/|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11629/#showFailuresLink]
[https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11631/|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11631/#showFailuresLink]
[https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11633/|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11633/#showFailuresLink]
[https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11635/]

Almost all seem to be failing with an error like this:
{code:java}
02:06:32 Loading nested parquet data (logging to /home/ubuntu/Impala/logs/data_loading/load-nested.log)...
02:08:06 FAILED (Took: 1 min 34 sec)
02:08:06 '/home/ubuntu/Impala/testdata/bin/load_nested.py -t tpch_nested_parquet -f parquet/none' failed. Tail of log:
02:08:06     at javax.security.auth.Subject.doAs(Subject.java:422)
02:08:06     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
02:08:06     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2882)
02:08:06
02:08:06     at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:220)
02:08:06     at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:1361)
02:08:06     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:732)
02:08:06     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:756)
02:08:06     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:756)
02:08:06     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:756)
02:08:06     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:471)
02:08:06     ... 17 more
02:08:06 Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /test-warehouse/tpch_nested_parquet.db/.hive-staging_hive_2020-08-11_02-07-45_902_3668710725192096563-193/_task_tmp.-ext-10004/_tmp.000000_3 could only be written to 0 of the 1 minReplication nodes. There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
02:08:06     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2259)
02:08:06     at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
02:08:06     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2773)
02:08:06     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:879)
02:08:06     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:583)
02:08:06     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
02:08:06     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
02:08:06     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
02:08:06     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:985)
02:08:06     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:913)
02:08:06     at java.security.AccessController.doPrivileged(Native Method)
02:08:06     at javax.security.auth.Subject.doAs(Subject.java:422)
02:08:06     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
02:08:06     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2882)
02:08:06
02:08:06     at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1553)
02:08:06     at org.apache.hadoop.ipc.Client.call(Client.java:1499)
02:08:06     at org.apache.hadoop.ipc.Client.call(Client.java:1396)
02:08:06     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
02:08:06     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
02:08:06     at com.sun.proxy.$Proxy15.addBlock(Unknown Source)
02:08:06     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
02:08:06     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
02:08:06     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
02:08:06     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
02:08:06     at java.lang.reflect.Method.invoke(Method.java:498)
02:08:06     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
02:08:06     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
02:08:06     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
02:08:06     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
02:08:06     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
02:08:06     at com.sun.proxy.$Proxy16.addBlock(Unknown Source)
02:08:06     at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1085)
02:08:06     at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1866)
02:08:06     at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1668)
02:08:06     at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)
02:08:06 ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1597109753535_0037_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
02:08:06 ERROR in /home/ubuntu/Impala/testdata/bin/create-load-data.sh at line 48:
02:08:06 Generated: /home/ubuntu/Impala/logs/extra_junit_xml_logs/generate_junitxml.buildall.create-load-data.20200811_02_08_06.xml
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)