[ https://issues.apache.org/jira/browse/IMPALA-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Ho resolved IMPALA-6316.
--------------------------------
    Resolution: Duplicate
 Fix Version/s: Not Applicable

> impalad crashes after hadoopZeroCopyRead failure
> ------------------------------------------------
>
>                 Key: IMPALA-6316
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6316
>             Project: IMPALA
>          Issue Type: Bug
>    Affects Versions: Impala 2.11.0
>            Reporter: Pranay Singh
>            Priority: Major
>             Fix For: Not Applicable
>
> End-to-end tests fail:
> ---------------------------
> 20:00:40 [gw0] PASSED query_test/test_join_queries.py::TestJoinQueries::test_single_node_joins_with_limits_exhaustive[batch_size: 1 | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]
> 20:04:17 query_test/test_join_queries.py::TestJoinQueries::test_single_node_joins_with_limits_exhaustive[batch_size: 1 | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]
> 20:04:17 [gw1] FAILED query_test/test_queries.py::TestQueries::test_union[exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: rc/snap/block]
> 20:04:17 query_test/test_queries.py::TestQueries::test_union[exec_option: {'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 100, 'batch_size': 0, 'num_nodes': 0} | table_format: rc/snap/block]
> 20:04:17 [gw2] FAILED query_test/test_queries.py::TestQueries::test_subquery[exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: seq/def/record]
> 20:04:17 [gw3] FAILED query_test/test_queries.py::TestQueries::test_analytic_fns[exec_option: {'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 100, 'batch_size': 0, 'num_nodes': 0} | table_format: seq/def/block]
> 20:04:17 query_test/test_queries.py::TestQueries::test_subquery[exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: seq/def/record]
> 20:04:17 [gw0] FAILED query_test/test_join_queries.py::TestJoinQueries::test_single_node_joins_with_limits_exhaustive[batch_size: 1 | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]
>
> Backtrace:
> #0  0x00000031bea328e5 in raise () from /lib64/libc.so.6
> #1  0x00000031bea340c5 in abort () from /lib64/libc.so.6
> #2  0x0000000003be91a4 in google::DumpStackTraceAndExit() ()
> #3  0x0000000003bdfc1d in google::LogMessage::Fail() ()
> #4  0x0000000003be14c2 in google::LogMessage::SendToLog() ()
> #5  0x0000000003bdf5f7 in google::LogMessage::Flush() ()
> #6  0x0000000003be2bbe in google::LogMessageFatal::~LogMessageFatal() ()
> #7  0x000000000189390a in impala::FragmentInstanceState::Close (this=0xc188ee0) at repos/Impala/be/src/runtime/fragment-instance-state.cc:315
> #8  0x0000000001890a12 in impala::FragmentInstanceState::Exec (this=0xc188ee0) at repos/Impala/be/src/runtime/fragment-instance-state.cc:95
> #9  0x00000000018797b8 in impala::QueryState::ExecFInstance (this=0x20584000, fis=0xc188ee0) at repos/Impala/be/src/runtime/query-state.cc:382
> #10 0x000000000187807a in impala::QueryState::<lambda()>::operator()(void) const (__closure=0x7fc1fafd9bc8) at repos/Impala/be/src/runtime/query-state.cc:325
> #11 0x000000000187a3f7 in boost::detail::function::void_function_obj_invoker0<impala::QueryState::StartFInstances()::<lambda()>, void>::invoke(boost::detail::function::function_buffer &) (function_obj_ptr=...) at Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153
> #12 0x00000000017c6ed4 in boost::function0<void>::operator() (this=0x7fc1fafd9bc0) at Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:767
> #13 0x0000000001abdbc9 in impala::Thread::SuperviseThread (name=..., category=..., functor=..., thread_started=0x7fc0cc476ab0) at repos/Impala/be/src/util/thread.cc:352
> #14 0x0000000001ac6754 in boost::_bi::list4<boost::_bi::value<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<boost::function<void()> >, boost::_bi::value<impala::Promise<long int>*> >::operator()<void (*)(const std::basic_string<char>&, const std::basic_string<char>&, boost::function<void()>, impala::Promise<long int>*), boost::_bi::list0>(boost::_bi::type<void>, void (*&)(const std::basic_string<char, std::char_traits<char>, std::allocator<char> > &, const std::basic_string<char, std::char_traits<char>, std::allocator<char> > &, boost::function<void()>, impala::Promise<long> *), boost::_bi::list0 &, int) (this=0x1eec8f7c0, f=@0x1eec8f7b8, a=...) at workspace/impala-cdh5-trunk-exhaustive/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:457
> #15 0x0000000001ac6697 in boost::_bi::bind_t<void, void (*)(const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, boost::function<void()>, impala::Promise<long int>*), boost::_bi::list4<boost::_bi::value<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<boost::function<void()> >, boost::_bi::value<impala::Promise<long int>*> > >::operator()(void) (this=0x1eec8f7b8) at workspace/impala-cdh5-trunk-exhaustive/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind_template.hpp:20
> #16 0x0000000001ac665a in boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, boost::function<void()>, impala::Promise<long int>*), boost::_bi::list4<boost::_bi::value<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<boost::function<void()> >, boost::_bi::value<impala::Promise<long int>*> > > >::run(void) (this=0x1eec8f600) at workspace/impala-cdh5-trunk-exhaustive/Impala-Toolchain/boost-1.57.0-p3/include/boost/thread/detail/thread.hpp:116
> #17 0x0000000002d6966a in thread_proxy ()
> #18 0x00000031bee07851 in start_thread () from /lib64/libpthread.so.0
> #19 0x00000031beae894d in clone () from /lib64/libc.so.6
>
> Log traces from impalad.INFO when this happened:
> --------------------------------------------------------------------
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> E1208 15:03:55.125169  2169 Analyzer.java:2375] Failed to load metadata for table: alltypes
> Failed to load metadata for table: functional.alltypes. Running 'invalidate metadata functional.alltypes' may resolve this problem.
> CAUSED BY: MetaException: Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
>     at org.apache.thrift.transport.TSocket.open(TSocket.java:226)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:472)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.reconnect(HiveMetaStoreClient.java:337)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:98)
>     at com.sun.proxy.$Proxy5.getTable(Unknown Source)
>     at org.apache.impala.catalog.TableLoader.load(TableLoader.java:65)
>     at org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:241)
>     at org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:238)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.net.ConnectException: Connection refused
>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>     at java.net.Socket.connect(Socket.java:579)
>     at org.apache.thrift.transport.TSocket.open(TSocket.java:221)
>     ... 11 more
> Picked up JAVA_TOOL_OPTIONS: -agentlib:jdwp=transport=dt_socket,address=30000,server=y,suspend=n
> hdfsOpenFile(hdfs://localhost:20500/test-warehouse/file_open_fail/564e4332cbb6e8de-c0c5101c00000000_2005391775_data.0.): FileSystem#open((Lorg/apache/hadoop/fs/Path;I)Lorg/apache/hadoop/fs/FSDataInputStream;) error:
> RemoteException: File does not exist: /test-warehouse/file_open_fail/564e4332cbb6e8de-c0c5101c00000000_2005391775_data.0.
>     at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
>     at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2100)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2070)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1983)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:579)
>     at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:92)
> .
> .
> .
> FSDataOutputStream#close error:
> RemoteException: No lease on /test-warehouse/tpch_parquet.db/ctas_cancel/_impala_insert_staging/a14b1ee198cd7327_a46f833a00000000/.a14b1ee198cd7327-a46f833a00000002_567821133_dir/a14b1ee198cd7327-a46f833a00000002_1272243926_data.0.parq (inode 37350): File does not exist. Holder DFSClient_NONMAPREDUCE_307426671_1 does not have any open files.
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3760)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3561)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3417)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:690)
>     at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:217)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:506)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2281)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2277)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2275)
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /test-warehouse/tpch_parquet.db/ctas_cancel/_impala_insert_staging/a14b1ee198cd7327_a46f833a00000000/.a14b1ee198cd7327-a46f833a00000002_567821133_dir/a14b1ee198cd7327-a46f833a00000002_1272243926_data.0.parq (inode 37350): File does not exist. Holder DFSClient_NONMAPREDUCE_307426671_1 does not have any open files.
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3760)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3561)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3417)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:690)
>     at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:217)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:506)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2281)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2277)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGr.
> .
> .
> FSDataOutputStream#close error:
> RemoteException: No lease on /test-warehouse/functional_parquet.db/alltypesinsert/_impala_insert_staging/3949d68930d0228e_c655177500000000/.3949d68930d0228e-c655177500000008_357133657_dir/year=2009/month=0/3949d68930d0228e-c655177500000008_24906809_data.0.parq (inode 88180): File does not exist. [Lease. Holder: DFSClient_NONMAPREDUCE_307426671_1, pending creates: 1]
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3760)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3561)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3417)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:690)
>     at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:217)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:506)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
> E1208 17:14:10.009800 15893 LiteralExpr.java:186] Failed to evaluate expr 'space(1073741830)'
> tcmalloc: large alloc 2147483648 bytes == 0x2708a6000 @ 0x3d039c6 0x7fc28438ac49
> tcmalloc: large alloc 4294967296 bytes == 0x7fc0dd294000 @ 0x3d039c6 0x7fc28438ac49
> E1208 17:17:29.387645 15893 LiteralExpr.java:186] Failed to evaluate expr 'space(1073741830)'
> E1208 17:17:30.425915 15893 LiteralExpr.java:186] Failed to evaluate expr 'space(1073741830)'
> E1208 17:18:41.971148 15893 LiteralExpr.java:186] Failed to evaluate expr 'space(1073741830)'
> tcmalloc: large alloc 4294967296 bytes == 0x7fc0dd294000 @ 0x3d039c6 0x7fc28438ac49
> E1208 17:21:30.161092 15893 LiteralExpr.java:186] Failed to evaluate expr 'space(1073741830)'
> E1208 17:21:30.913319 15893 LiteralExpr.java:186] Failed to evaluate expr 'space(1073741830)'
> E1208 17:25:00.198657 18963 LiteralExpr.java:186] Failed to evaluate expr 'test_mem_limits_978e0f35.memtest(10485760)'
> E1208 17:25:00.199533 18963 LiteralExpr.java:186] Failed to evaluate expr 'test_mem_limits_978e0f35.memtest(10485760)'
> E1208 17:25:00.200562 18963 LiteralExpr.java:186] Failed to evaluate expr 'test_mem_limits_978e0f35.memtest(10485760)'
> E1208 17:25:08.581363 18963 LiteralExpr.java:186] Failed to evaluate expr 'test_mem_limits_ae6bd38e.memtest(10485760)'
> .
> .
> E1208 19:13:11.192692  7603 LiteralExpr.java:186] Failed to evaluate expr 'TIMESTAMP '1400-01-01 21:00:00' - INTERVAL 1 DAYS'
> E1208 19:13:11.224931  7603 LiteralExpr.java:186] Failed to evaluate expr 'TIMESTAMP '1400-01-01 21:00:00' - INTERVAL 1 DAYS'
> .
> .
> hadoopZeroCopyRead: ZeroCopyCursor#read failed error:
> ReadOnlyBufferException: java.nio.ReadOnlyBufferException
>     at java.nio.DirectByteBufferR.put(DirectByteBufferR.java:344)
>     at org.apache.hadoop.crypto.CryptoInputStream.decrypt(CryptoInputStream.java:53
> F1208 20:00:45.213917 25256 fragment-instance-state.cc:315] Check failed: other_time <= total_time + 1 (481986958 vs. 481986956)
> *** Check failure stack trace: ***
>     @          0x3bdfc1d  google::LogMessage::Fail()
>     @          0x3be14c2  google::LogMessage::SendToLog()
>     @          0x3bdf5f7  google::LogMessage::Flush()
>     @          0x3be2bbe  google::LogMessageFatal::~LogMessageFatal()
>     @          0x189390a  impala::FragmentInstanceState::Close()
>     @          0x1890a12  impala::FragmentInstanceState::Exec()
>     @          0x18797b8  impala::QueryState::ExecFInstance()
>     @          0x187807a  _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
>     @          0x187a3f7  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
>     @          0x17c6ed4  boost::function0<>::operator()()
>     @          0x1abdbc9  impala::Thread::SuperviseThread()
>     @          0x1ac6754  boost::_bi::list4<>::operator()<>()
>     @          0x1ac6697  boost::_bi::bind_t<>::operator()()
>     @          0x1ac665a  boost::detail::thread_data<>::run()
>     @          0x2d6966a  thread_proxy
>     @       0x31bee07851  (unknown)
>     @       0x31beae894d  (unknown)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
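The fatal entry above is a glog CHECK: when the condition fails, glog logs the message and aborts the process, which is why the backtrace runs through google::LogMessageFatal into abort(). As a minimal sketch of the violated invariant at fragment-instance-state.cc:315 (not Impala's actual code, just the condition from the log, with hypothetical names): time attributed to "other" activity may exceed the total timer by at most one unit of rounding slack.

```cpp
#include <cstdint>

// Sketch of the invariant the failing CHECK enforces (hypothetical helper,
// not Impala source): other_time must not exceed total_time by more than
// the 1-unit rounding tolerance the check allows.
bool TimersConsistent(int64_t other_time, int64_t total_time) {
  return other_time <= total_time + 1;
}
```

With the values from the log, other_time = 481986958 and total_time = 481986956, so the condition 481986958 <= 481986957 is false and the daemon aborts; the failure is two units, one beyond the allowed slack.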