lichao liu created HDFS-9622: -------------------------------- Summary: get Block Locations is always unstable Key: HDFS-9622 URL: https://issues.apache.org/jira/browse/HDFS-9622 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Environment: CDH5.5.0 Reporter: lichao liu
query speed is slow in Impala,I am using CDH5.5.0 I monitor the backstage implala log, found time-consuming long query background are abnormal, as follows: Tuple(id=0 size=40 slots=[Slot(id=0 type=STRING col_path=[4] offset=24 null=(offset=0 mask=4) slot_idx=2 field_idx=-1), Slot(id=1 type=BIGINT col_path=[5] offset=8 null=(offset=0 mask=1) slot_idx=0 field_idx=-1), Slot(id=2 type=BIGINT col_path=[6] offset=16 null=(offset=0 mask=2) slot_idx=1 field_idx=-1), Slot(id=3 type=STRING col_path=[0] offset=-1 null=(offset=0 mask=1) slot_idx=0 field_idx=-1), Slot(id=4 type=STRING col_path=[1] offset=-1 null=(offset=0 mask=1) slot_idx=0 field_idx=-1), Slot(id=5 type=STRING col_path=[2] offset=-1 null=(offset=0 mask=1) slot_idx=0 field_idx=-1)] tuple_path=[]) Tuple(id=1 size=40 slots=[Slot(id=6 type=STRING col_path=[] offset=24 null=(offset=0 mask=4) slot_idx=2 field_idx=-1), Slot(id=7 type=BIGINT col_path=[] offset=8 null=(offset=0 mask=1) slot_idx=0 field_idx=-1), Slot(id=8 type=BIGINT col_path=[] offset=16 null=(offset=0 mask=2) slot_idx=1 field_idx=-1)] tuple_path=[]) Tuple(id=2 size=40 slots=[Slot(id=9 type=STRING col_path=[] offset=24 null=(offset=0 mask=4) slot_idx=2 field_idx=-1), Slot(id=10 type=BIGINT col_path=[] offset=8 null=(offset=0 mask=1) slot_idx=0 field_idx=-1), Slot(id=11 type=BIGINT col_path=[] offset=16 null=(offset=0 mask=2) slot_idx=1 field_idx=-1)] tuple_path=[]) I0106 09:46:59.656497 19278 plan-fragment-executor.cc:303] Open(): instance_id=794f58dadaa44cb8:1f24c33dda8d00a2 I0106 09:47:20.070286 6805 RetryInvocationHandler.java:144] Exception while invoking getBlockLocations of class ClientNamenodeProtocolTranslatorPB over namenode1:8020. Trying to fail over immediately. Java exception follows: org.apache.hadoop.net.ConnectTimeoutException: Call From datanode to namenode1:8020 failed on socket timeout exception: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending namenode1:8020]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout at sun.reflect.GeneratedConstructorAccessor7.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791) at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:750) at org.apache.hadoop.ipc.Client.call(Client.java:1476) at org.apache.hadoop.ipc.Client.call(Client.java:1403) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) at com.sun.proxy.$Proxy14.getBlockLocations(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLojavascript:;cations(ClientNamenodeProtocolTranslatorPB.java:254) at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) at com.sun.proxy.$Proxy15.getBlockLocations(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1258) at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1245) at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1233) at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:302) at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:268) at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:260) at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1564) at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:308) at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:304) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:304) Caused by: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=namenode2:8020] at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494) at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:708) at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525) at org.apache.hadoop.ipc.Client.call(Client.java:1442) ... 21 more I0106 09:47:20.077205 6805 RetryInvocationHandler.java:144] Exception while invoking getBlockLocations of class ClientNamenodeProtocolTranslatorPB over namenode2:8020 after 1 fail over attempts. Trying to fail over after sleeping for 1300ms. Java exception follows: The datanode can't stable block position information -- This message was sent by Atlassian JIRA (v6.3.4#6332)