[jira] [Commented] (IMPALA-11653) Identify and time out connections that are not from a supported Impala client more eagerly
[ https://issues.apache.org/jira/browse/IMPALA-11653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17621358#comment-17621358 ] Qifan Chen commented on IMPALA-11653: -

It seems a viable solution would be to augment connection_setup_pool, which is a ThreadPool<shared_ptr<TAcceptQueueEntry>>, as follows (a sketch appears after the quoted stack trace below).
1. Allow more TAcceptQueueEntry items in it, say accepted_cnxn_setup_thread_pool_size + 10.
2. Record the wait time of each TAcceptQueueEntry in the queue.
3. When a new member is to be queued and the queue is full, kick out the entries with the longest waiting time, calling close() on each kicked-out entry.
4. Remove entries from the queue after SetupConnection() completes normally, so that only the "super long waiting" items remain in the queue.

> Identify and time out connections that are not from a supported Impala client > more eagerly > -- > > Key: IMPALA-11653 > URL: https://issues.apache.org/jira/browse/IMPALA-11653 > Project: IMPALA > Issue Type: Improvement >Affects Versions: Impala 4.1.0 >Reporter: Vincent Tran >Assignee: Qifan Chen >Priority: Major > Attachments: simple_tcp_client.py > > > When a tcp client opens a connection to an Impala client interface (hs2 or > beeswax), the connection is accepted immediately after the 3-way handshake > (SYN, SYN-ACK, ACK) and is queued for > *TAcceptQueueServer::SetupConnection()*. However, if the client sends > nothing else, the ImpalaServer will block in > *apache::thrift::transport::TSocket::read()* until the client sends a RST/FIN > or until *sasl_connect_tcp_timeout_ms* elapses (which is by default, 5 > minutes). > The connection setup thread stack trace can be observed below during this > period. > {noformat} > (gdb) bt > #0 0x7f3b972ee20d in poll () from ./lib64/libc.so.6 > #1 0x02dcd5bc in apache::thrift::transport::TSocket::read(unsigned > char*, unsigned int) () > #2 0x02dd1803 in unsigned int > apache::thrift::transport::readAll(apache::thrift::transport::TSocket&, > unsigned char*, unsigned int) () > #3 0x01330cc9 in readAll (len=5, buf=0x7f3277ea4f8b "", > this=) at > ../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TTransport.h:121 > #4 apache::thrift::transport::TSaslTransport::receiveSaslMessage > (this=this@entry=0x278a96b0, status=status@entry=0x7f3277ea500c, > length=length@entry=0x7f3277ea5008) at TSaslTransport.cpp:259 > #5 0x0132db14 in > apache::thrift::transport::TSaslServerTransport::handleSaslStartMessage > (this=0x278a96b0) at TSaslServerTransport.cpp:95 > #6 0x01330e33 in > apache::thrift::transport::TSaslTransport::doSaslNegotiation > (this=0x278a96b0) at TSaslTransport.cpp:81 > #7 0x0132e723 in open (this=0x12e29750) at > ../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TBufferTransports.h:218 > #8 apache::thrift::transport::TSaslServerTransport::Factory::getTransport > (this=0xf825a70, trans=...) at TSaslServerTransport.cpp:173 > #9 0x010cd49d in > apache::thrift::server::TAcceptQueueServer::SetupConnection (this=0x174270c0, > entry=...) at TAcceptQueueServer.cpp:233 > #10 0x010cef4d in operator() (tid=, item=..., > __closure=) at TAcceptQueueServer.cpp:323 > #11 > boost::detail::function::void_function_obj_invoker2 const boost::shared_ptr&)>, void, > int, const > boost::shared_ptr&>::invoke(boost::detail::function::function_buffer > &, int, const boost::shared_ptr > &) (function_obj_ptr=..., a0=, a1=...)
> at > ../../../toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:159 > #12 0x010d3e59 in operator() (a1=..., a0=1, this=0x7f3279ea9510) at > ../../../toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:770 > #13 > impala::ThreadPool > >::WorkerThread (this=0x7f3279ea94c0, thread_id=1) at > ../util/thread-pool.h:166 > #14 0x0144f8f2 in operator() (this=0x7f3277ea5b40) at > ../../../toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:770 > #15 impala::Thread::SuperviseThread(std::__cxx11::basic_string std::char_traits, std::allocator > const&, > std::__cxx11::basic_string, std::allocator > > const&, boost::function, impala::ThreadDebugInfo const*, > impala::Promise*) (name=..., category=..., > functor=..., parent_thread_info=, > thread_started=0x7f3279ea9110) at thread.cc:360 > #16 0x01450d6b in operator() std::__cxx11::basic_string&, const std::__cxx11::basic_string&, > boost::function, const impala::ThreadDebugInfo*, impala::Promise int>*), boost::_bi::list0> (a=, > f=@0x1417ccf8: 0x144f5f0 > std::char_traits, std::allocator > const&, >
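Below is a minimal, self-contained sketch of the eviction policy proposed above. It is not Impala's actual ThreadPool API: Entry, BoundedSetupQueue, and the capacity of 3 are hypothetical stand-ins for TAcceptQueueEntry, connection_setup_pool, and accepted_cnxn_setup_thread_pool_size + 10, and the locking a real multi-threaded pool would need is omitted.

{code:java}
#include <chrono>
#include <deque>
#include <iostream>
#include <memory>

using Clock = std::chrono::steady_clock;

struct Entry {
  int fd;                      // stand-in for the accepted client socket
  Clock::time_point enqueued;  // step 2: when the entry was queued
  void close() { std::cout << "closing evicted fd " << fd << "\n"; }
};

class BoundedSetupQueue {
 public:
  explicit BoundedSetupQueue(size_t capacity) : capacity_(capacity) {}

  // Steps 1 and 3: admit a new entry; when full, evict the longest-waiting
  // entry (the front, since arrivals are ordered) and close() its connection.
  void Offer(std::shared_ptr<Entry> e) {
    if (queue_.size() >= capacity_) {
      queue_.front()->close();
      queue_.pop_front();
    }
    e->enqueued = Clock::now();
    queue_.push_back(std::move(e));
  }

  // Step 4: an entry leaves the queue when SetupConnection() picks it up,
  // so only genuinely stuck connections accumulate waiting time.
  std::shared_ptr<Entry> Take() {
    auto e = queue_.front();
    queue_.pop_front();
    return e;
  }

  bool Empty() const { return queue_.empty(); }

 private:
  size_t capacity_;
  std::deque<std::shared_ptr<Entry>> queue_;
};

int main() {
  BoundedSetupQueue q(3);  // real capacity: accepted_cnxn_setup_thread_pool_size + 10
  for (int fd = 0; fd < 5; ++fd) q.Offer(std::make_shared<Entry>(Entry{fd, {}}));
  // fds 0 and 1 were evicted and closed; 2, 3, 4 remain for setup.
  while (!q.Empty()) std::cout << "setting up fd " << q.Take()->fd << "\n";
}
{code}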
[jira] [Comment Edited] (IMPALA-11653) Identify and time out connections that are not from a supported Impala client more eagerly
[ https://issues.apache.org/jira/browse/IMPALA-11653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17621353#comment-17621353 ] Qifan Chen edited comment on IMPALA-11653 at 10/20/22 8:17 PM: ---

Before the very first communication with the client, impalad needs to set up the connection by calling SetupConnection(). It is in this setup method that the above "super long wait" can happen.

{code:java}
(gdb) bt
#0 0x7f3b972ee20d in poll () from ./lib64/libc.so.6
#1 0x02dcd5bc in apache::thrift::transport::TSocket::read(unsigned char*, unsigned int) ()
#2 0x02dd1803 in unsigned int apache::thrift::transport::readAll(apache::thrift::transport::TSocket&, unsigned char*, unsigned int) ()
#3 0x01330cc9 in readAll (len=5, buf=0x7f3277ea4f8b "", this=) at ../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TTransport.h:121
#4 apache::thrift::transport::TSaslTransport::receiveSaslMessage (this=this@entry=0x278a96b0, status=status@entry=0x7f3277ea500c, length=length@entry=0x7f3277ea5008) at TSaslTransport.cpp:259
#5 0x0132db14 in apache::thrift::transport::TSaslServerTransport::handleSaslStartMessage (this=0x278a96b0) at TSaslServerTransport.cpp:95
#6 0x01330e33 in apache::thrift::transport::TSaslTransport::doSaslNegotiation (this=0x278a96b0) at TSaslTransport.cpp:81
#7 0x0132e723 in open (this=0x12e29750) at ../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TBufferTransports.h:218

    void open() { transport_->open(); }

    132 void TSaslTransport::open() {
    133   // Only client should open the underlying transport.
    134   if (isClient_ && !transport_->isOpen()) {
    135     transport_->open();
    136   }
    137
    138   // Start the SASL negotiation protocol.
    139   doSaslNegotiation();
    140 }

#8 apache::thrift::transport::TSaslServerTransport::Factory::getTransport (this=0xf825a70, trans=...) at TSaslServerTransport.cpp:173
#9 0x010cd49d in apache::thrift::server::TAcceptQueueServer::SetupConnection (this=0x174270c0, entry=...) at TAcceptQueueServer.cpp:233
#10 0x010cef4d in operator()
{code}

{code:java}
130 std::shared_ptr<TTransport> TSaslServerTransport::Factory::getTransport(
131     std::shared_ptr<TTransport> trans) {
132   // Thrift servers use both an input and an output transport to communicate with
133   // clients. In principal, these can be different, but for SASL clients we require them
134   // to be the same so that the authentication state is identical for communication in
135   // both directions. In order to do this, we share the same TTransport object for both
136   // input and output set in TAcceptQueueServer::SetupConnection.
137   std::shared_ptr<TTransport> ret_transport;
138   std::shared_ptr<TTransport> wrapped(
139       new TSaslServerTransport(serverDefinitionMap_, trans));
140   // Set socket timeouts to prevent TSaslServerTransport->open from blocking the server
141   // from accepting new connections if a read/write blocks during the handshake
142   TSocket* socket = static_cast<TSocket*>(trans.get());
143   socket->setRecvTimeout(FLAGS_sasl_connect_tcp_timeout_ms);  <== 5min timeout for read() calls invoked indirectly at line 147: open()
144   socket->setSendTimeout(FLAGS_sasl_connect_tcp_timeout_ms);
145   ret_transport.reset(new TBufferedTransport(wrapped,
146       impala::ThriftServer::BufferedTransportFactory::DEFAULT_BUFFER_SIZE_BYTES));
147   ret_transport.get()->open();
148   // Reset socket timeout back to zero, so idle clients do not timeout
149   socket->setRecvTimeout(0);
150   socket->setSendTimeout(0);
151   return ret_transport;
152 }
{code}
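A connection that triggers this wait is trivial to create: complete the TCP handshake and then send nothing. The attached simple_tcp_client.py presumably does exactly this; the standalone C++ analogue below illustrates it (host and port are placeholders for an impalad HS2 endpoint).

{code:java}
// A "silent client": connects, sends no bytes, and thereby pins one
// connection-setup thread in TSocket::read() until
// sasl_connect_tcp_timeout_ms (5 minutes by default) expires.
#include <arpa/inet.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>

int main() {
  int fd = socket(AF_INET, SOCK_STREAM, 0);
  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_port = htons(21050);                     // placeholder: HS2 port
  inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);  // placeholder: impalad host
  if (connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0) {
    perror("connect");
    return 1;
  }
  std::puts("connected; sending nothing and holding a setup thread...");
  sleep(600);  // stay silent well past the 5-minute SASL timeout
  close(fd);
  return 0;
}
{code}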
[jira] [Commented] (IMPALA-11653) Identify and time out connections that are not from a supported Impala client more eagerly
[ https://issues.apache.org/jira/browse/IMPALA-11653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17621349 ] Qifan Chen commented on IMPALA-11653: -

Here is the client-side code in shell/impala_shell.py where the presence of a valid Kerberos ticket is detected.

{code:java}
967       self.close_connection()
968       self.imp_client = self._new_impala_client()
969       self._connect()
970       # If the connection fails and the Kerberos has not been enabled,
971       # check for a valid kerberos ticket and retry the connection
972       # with kerberos enabled.
973       if not self.imp_client.connected and not self.use_kerberos:
974         try:
975           if call(["klist", "-s"]) == 0:
976             print("Kerberos ticket found in the credentials cache, retrying "
977                   "the connection with a secure transport.", file=sys.stderr)
978             self.use_kerberos = True
979             self.use_ldap = False
980             self.ldap_password = None
981             self.imp_client = self._new_impala_client()
982             self._connect()
983         except OSError:
984           pass
{code}

> Identify and time out connections that are not from a supported Impala client > more eagerly > -- > > Key: IMPALA-11653 > URL: https://issues.apache.org/jira/browse/IMPALA-11653 > Project: IMPALA > Issue Type: Improvement >Affects Versions: Impala 4.1.0 >Reporter: Vincent Tran >Assignee: Qifan Chen >Priority: Major > Attachments: simple_tcp_client.py > > > When a tcp client opens a connection to an Impala client interface (hs2 or > beeswax), the connection is accepted immediately after the 3-way handshake > (SYN, SYN-ACK, ACK) and is queued for > *TAcceptQueueServer::SetupConnection()*. However, if the client sends > nothing else, the ImpalaServer will block in > *apache::thrift::transport::TSocket::read()* until the client sends a RST/FIN > or until *sasl_connect_tcp_timeout_ms* elapses (which is by default, 5 > minutes). > The connection setup thread stack trace can be observed below during this > period. > {noformat} > (gdb) bt > #0 0x7f3b972ee20d in poll () from ./lib64/libc.so.6 > #1 0x02dcd5bc in apache::thrift::transport::TSocket::read(unsigned > char*, unsigned int) () > #2 0x02dd1803 in unsigned int > apache::thrift::transport::readAll(apache::thrift::transport::TSocket&, > unsigned char*, unsigned int) () > #3 0x01330cc9 in readAll (len=5, buf=0x7f3277ea4f8b "", > this=) at > ../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TTransport.h:121 > #4 apache::thrift::transport::TSaslTransport::receiveSaslMessage > (this=this@entry=0x278a96b0, status=status@entry=0x7f3277ea500c, > length=length@entry=0x7f3277ea5008) at TSaslTransport.cpp:259 > #5 0x0132db14 in > apache::thrift::transport::TSaslServerTransport::handleSaslStartMessage > (this=0x278a96b0) at TSaslServerTransport.cpp:95 > #6 0x01330e33 in > apache::thrift::transport::TSaslTransport::doSaslNegotiation > (this=0x278a96b0) at TSaslTransport.cpp:81 > #7 0x0132e723 in open (this=0x12e29750) at > ../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TBufferTransports.h:218 > #8 apache::thrift::transport::TSaslServerTransport::Factory::getTransport > (this=0xf825a70, trans=...) at TSaslServerTransport.cpp:173 > #9 0x010cd49d in > apache::thrift::server::TAcceptQueueServer::SetupConnection (this=0x174270c0, > entry=...) at TAcceptQueueServer.cpp:233 > #10 0x010cef4d in operator() (tid=, item=..., > __closure=) at TAcceptQueueServer.cpp:323 > #11 > boost::detail::function::void_function_obj_invoker2 const boost::shared_ptr&)>, void, > int, const >
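The probe here is call(["klist", "-s"]): klist -s prints nothing and exits 0 exactly when the credentials cache holds a valid, non-expired ticket. The same check, sketched standalone in C++ (it assumes the krb5 client tools are on PATH):

{code:java}
#include <sys/wait.h>
#include <cstdlib>
#include <iostream>

int main() {
  // klist -s is silent; its exit status alone says whether a ticket exists.
  int rc = std::system("klist -s");
  bool has_ticket = rc != -1 && WIFEXITED(rc) && WEXITSTATUS(rc) == 0;
  std::cout << (has_ticket ? "ticket found: retry with a secure transport\n"
                           : "no ticket: keep the current transport\n");
  return 0;
}
{code}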
[jira] [Commented] (IMPALA-11653) Identify and time out connections that are not from a supported Impala client more eagerly
[ https://issues.apache.org/jira/browse/IMPALA-11653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17621347 ] Qifan Chen commented on IMPALA-11653: -

It seems the bug in question takes place in the following method. The code at line 143 sets a 5-minute timeout for the open() call at line 147. If the timeout were set to, say, 1 minute instead, a legitimate client with a slow SASL handshake would be kicked out prematurely.

{code:java}
130 std::shared_ptr<TTransport> TSaslServerTransport::Factory::getTransport(
131     std::shared_ptr<TTransport> trans) {
132   // Thrift servers use both an input and an output transport to communicate with
133   // clients. In principal, these can be different, but for SASL clients we require them
134   // to be the same so that the authentication state is identical for communication in
135   // both directions. In order to do this, we share the same TTransport object for both
136   // input and output set in TAcceptQueueServer::SetupConnection.
137   std::shared_ptr<TTransport> ret_transport;
138   std::shared_ptr<TTransport> wrapped(
139       new TSaslServerTransport(serverDefinitionMap_, trans));
140   // Set socket timeouts to prevent TSaslServerTransport->open from blocking the server
141   // from accepting new connections if a read/write blocks during the handshake
142   TSocket* socket = static_cast<TSocket*>(trans.get());
143   socket->setRecvTimeout(FLAGS_sasl_connect_tcp_timeout_ms);
144   socket->setSendTimeout(FLAGS_sasl_connect_tcp_timeout_ms);
145   ret_transport.reset(new TBufferedTransport(wrapped,
146       impala::ThriftServer::BufferedTransportFactory::DEFAULT_BUFFER_SIZE_BYTES));
147   ret_transport.get()->open();
148   // Reset socket timeout back to zero, so idle clients do not timeout
149   socket->setRecvTimeout(0);
150   socket->setSendTimeout(0);
151   return ret_transport;
152 }
{code}

> Identify and time out connections that are not from a supported Impala client > more eagerly > -- > > Key: IMPALA-11653 > URL: https://issues.apache.org/jira/browse/IMPALA-11653 > Project: IMPALA > Issue Type: Improvement >Affects Versions: Impala 4.1.0 >Reporter: Vincent Tran >Assignee: Qifan Chen >Priority: Major > Attachments: simple_tcp_client.py > > > When a tcp client opens a connection to an Impala client interface (hs2 or > beeswax), the connection is accepted immediately after the 3-way handshake > (SYN, SYN-ACK, ACK) and is queued for > *TAcceptQueueServer::SetupConnection()*. However, if the client sends > nothing else, the ImpalaServer will block in > *apache::thrift::transport::TSocket::read()* until the client sends a RST/FIN > or until *sasl_connect_tcp_timeout_ms* elapses (which is by default, 5 > minutes). > The connection setup thread stack trace can be observed below during this > period.
> {noformat} > (gdb) bt > #0 0x7f3b972ee20d in poll () from ./lib64/libc.so.6 > #1 0x02dcd5bc in apache::thrift::transport::TSocket::read(unsigned > char*, unsigned int) () > #2 0x02dd1803 in unsigned int > apache::thrift::transport::readAll(apache::thrift::transport::TSocket&, > unsigned char*, unsigned int) () > #3 0x01330cc9 in readAll (len=5, buf=0x7f3277ea4f8b "", > this=) at > ../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TTransport.h:121 > #4 apache::thrift::transport::TSaslTransport::receiveSaslMessage > (this=this@entry=0x278a96b0, status=status@entry=0x7f3277ea500c, > length=length@entry=0x7f3277ea5008) at TSaslTransport.cpp:259 > #5 0x0132db14 in > apache::thrift::transport::TSaslServerTransport::handleSaslStartMessage > (this=0x278a96b0) at TSaslServerTransport.cpp:95 > #6 0x01330e33 in > apache::thrift::transport::TSaslTransport::doSaslNegotiation > (this=0x278a96b0) at TSaslTransport.cpp:81 > #7 0x0132e723 in open (this=0x12e29750) at > ../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TBufferTransports.h:218 > #8
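The bracketing pattern in getTransport() (set short socket timeouts, run the handshake, reset to blocking) can be shown in isolation. The sketch below is not the thrift code itself: Handshake() is a stub standing in for the SASL negotiation, error handling is omitted, and the 60-second value is merely an example of a shorter setting than the 5-minute default.

{code:java}
#include <sys/socket.h>
#include <sys/time.h>

static bool Handshake(int /*fd*/) { return true; }  // hypothetical SASL stand-in

static void SetTimeouts(int fd, int timeout_ms) {
  timeval tv{timeout_ms / 1000, (timeout_ms % 1000) * 1000};
  setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
  setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
}

bool SetupWithTimeout(int fd, int handshake_timeout_ms) {
  SetTimeouts(fd, handshake_timeout_ms);  // reads now fail fast during setup
  bool ok = Handshake(fd);
  SetTimeouts(fd, 0);  // 0 == block indefinitely, so idle sessions survive
  return ok;
}

int main() { return SetupWithTimeout(/*fd=*/-1, 60000) ? 0 : 1; }
{code}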
[jira] [Comment Edited] (IMPALA-11665) Min/Max filter could crash in fast code path for string data type
[ https://issues.apache.org/jira/browse/IMPALA-11665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17620584 ] Qifan Chen edited comment on IMPALA-11665 at 10/19/22 8:39 PM: ---

Set up a table with nulls and empty strings in the STRING column null_str. When loading the table, configured it (via PARQUET_PAGE_ROW_COUNT_LIMIT) to produce either 1 page or 3 pages per column chunk. Ran the statements in the DDL and data loading sections below and observed the following when the fast code path was taken.
1. Nulls are not part of the page min/max stats or the min/max filter stats at all, which is good.
2. The runtime filtering works as designed.

DDL
{code:java}
create table null_pq (
  id string,
  null_str string,
  null_int int
)
sort by (null_str)
stored as parquet;
{code}

data loading:
{code:java}
set PARQUET_PAGE_ROW_COUNT_LIMIT=12;
insert into null_pq values
  ('a', null, 1), ('b', null, 2), ('c', null, 3),
  ('aa', 'a', 1), ('ab', 'b', 2), ('ac', 'c', 3),
  ('ad', '', 4), ('ae', '', 5), ('ac', '', 6);
{code}

1 page case (set PARQUET_PAGE_ROW_COUNT_LIMIT=12)
{code:java}
[14:11:06 qchen@qifan-10229: src] pqtools dump hdfs://localhost:20500/test-warehouse/null_pq/9341bc3df646c530-9701c2fc_162963959_data.0.parq
22/10/17 14:23:15 INFO compress.CodecPool: Got brand-new decompressor [.snappy]
row group 0
id:       BINARY SNAPPY DO:4 FPO:56 SZ:85/89/1.05 VC:9 ENC:RLE,PLAIN_DICTIONARY
null_str: BINARY SNAPPY DO:146 FPO:180 SZ:64/60/0.94 VC:9 ENC:RLE,PLA [more]...
null_int: INT32 SNAPPY DO:273 FPO:312 SZ:72/68/0.94 VC:9 ENC:RLE,PLAI [more]...

id TV=9 RL=0 DL=1 DS: 8 DE:PLAIN_DICTIONARY
  page 0: DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY [more]... VC:9
null_str TV=9 RL=0 DL=1 DS: 4 DE:PLAIN_DICTIONARY
  page 0: DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY [more]... VC:9
null_int TV=9 RL=0 DL=1 DS: 6 DE:PLAIN_DICTIONARY
  page 0: DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY [more]... VC:9

BINARY id *** row group 1 of 1, values 1 to 9 ***
value 1: R:0 D:1 V:ad
value 2: R:0 D:1 V:ae
value 3: R:0 D:1 V:ac
value 4: R:0 D:1 V:aa
value 5: R:0 D:1 V:ab
value 6: R:0 D:1 V:ac
value 7: R:0 D:1 V:a
value 8: R:0 D:1 V:b
value 9: R:0 D:1 V:c

BINARY null_str *** row group 1 of 1, values 1 to 9 ***
value 1: R:0 D:1 V:
value 2: R:0 D:1 V:
value 3: R:0 D:1 V:
value 4: R:0 D:1 V:a
value 5: R:0 D:1 V:b
value 6: R:0 D:1 V:c
value 7: R:0 D:0 V:
value 8: R:0 D:0 V:
value 9: R:0 D:0 V:

INT32 null_int *** row group 1 of 1, values 1 to 9 ***
value 1: R:0 D:1 V:4
value 2: R:0 D:1 V:5
value 3: R:0 D:1 V:6
value 4: R:0 D:1 V:1
value 5: R:0 D:1 V:2
value 6: R:0 D:1 V:3
value 7: R:0 D:1 V:1
value 8: R:0 D:1 V:2
value 9: R:0 D:1 V:3
[14:23:16 qchen@qifan-10229: src]
{code}

3 pages case (set PARQUET_PAGE_ROW_COUNT_LIMIT=4)
{code:java}
[13:50:22 qchen@qifan-10229: cluster] pqtools dump hdfs://localhost:20500/test-warehouse/null_pq/aa449f944bb9d005-7df200e3_811956887_data.0.parq
22/10/17 13:51:02 INFO compress.CodecPool: Got brand-new decompressor [.snappy]
row group 0
id:       BINARY SNAPPY DO:4 FPO:56 SZ:139/139/1.00 VC:9 ENC:RLE,PLAI [more]...
null_str: BINARY SNAPPY DO:200 FPO:234 SZ:116/108/0.93 VC:9 ENC:RLE,P [more]...
null_int: INT32 SNAPPY DO:388 FPO:427 SZ:126/118/0.94 VC:9 ENC:RLE,PL [more]...

id TV=9 RL=0 DL=1 DS: 8 DE:PLAIN_DICTIONARY
  page 0: DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY [more]... VC:4
  page 1: DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY [more]... VC:4
  page 2: DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY [more]... VC:1
null_str TV=9 RL=0 DL=1 DS: 4 DE:PLAIN_DICTIONARY
  page 0: DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY [more]... VC:4
  page 1: DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY [more]... VC:4
  page 2: DLE:RLE RLE:RLE VLE:PLAIN ST:[no stat [more]... VC:1
null_int TV=9 RL=0 DL=1 DS: 6 DE:PLAIN_DICTIONARY
{code}
[jira] [Commented] (IMPALA-11665) Min/Max filter could crash in fast code path for string data type
[ https://issues.apache.org/jira/browse/IMPALA-11665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17620587#comment-17620587 ] Qifan Chen commented on IMPALA-11665: - It may be helpful to obtain the parquet data file(s) involved in the crash and to try the offending query afterwards. > Min/Max filter could crash in fast code path for string data type > - > > Key: IMPALA-11665 > URL: https://issues.apache.org/jira/browse/IMPALA-11665 > Project: IMPALA > Issue Type: Bug >Reporter: Abhishek Rawat >Assignee: Qifan Chen >Priority: Critical > > The impalad logs show that memcmp failed due to a segfault: > {code:java} > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGSEGV (0xb) at pc=0x7f0396c3ff22, pid=1, tid=0x7f023f365700 > # > # JRE version: OpenJDK Runtime Environment (8.0_332-b09) (build 1.8.0_332-b09) > # Java VM: OpenJDK 64-Bit Server VM (25.332-b09 mixed mode linux-amd64 > compressed oops) > # Problematic frame: > # C [libc.so.6+0x16af22] __memcmp_sse4_1+0xd42 {code} > Resolved Stack Trace for the crashed thread: > {code:java} > Thread 530 (crashed) > 0 libc-2.17.so + 0x16af22 > rax = 0x7f61567715f0 rdx = 0x000a > rcx = 0x7f62ae04cf22 rbx = 0x > rsi = 0x5d1e900a rdi = 0x000a > rbp = 0x7f6156771560 rsp = 0x7f6156771548 > r8 = 0x034d40f0 r9 = 0x7f62ae022e90 > r10 = 0x0498ff6c r11 = 0x7f62ae06f590 > r12 = 0x000a r13 = 0x1a9678e8 > r14 = 0x7f6156771730 r15 = 0x01b1f380 > rip = 0x7f62ae04cf22 > Found by: given as instruction pointer in context > 1 > impalad!impala::HdfsParquetScanner::CollectSkippedPageRangesForSortedColumn(impala::MinMaxFilter > const*, impala::ColumnType const&, > std::vector, > std::allocator >, std::allocator std::char_traits, std::allocator > > > const&, > std::vector, > std::allocator >, std::allocator std::char_traits, std::allocator > > > const&, int, int, > std::vector >*) > [hdfs-parquet-scanner.cc : 1388 + 0x3] > rbp = 0x7f6156771650 rsp = 0x7f6156771570 > rip = 0x01b10305 > Found by: previous frame's frame pointer > 2 impalad!impala::HdfsParquetScanner::SkipPagesBatch(parquet::RowGroup&, > impala::ColumnStatsReader const&, parquet::ColumnIndex const&, int, int, > impala::ColumnType const&, int, parquet::ColumnChunk const&, > impala::MinMaxFilter const*, std::vector std::allocator >*, int*) [hdfs-parquet-scanner.cc : 1230 + > 0x34] > rbx = 0x7f61567716f0 rbp = 0x7f61567717e0 > rsp = 0x7f6156771660 r12 = 0x7f6156771710 > r13 = 0x7f6156771950 r14 = 0x1a9678e8 > r15 = 0x7f6156771920 rip = 0x01b14838 > Found by: call frame info > 3 > impalad!impala::HdfsParquetScanner::FindSkipRangesForPagesWithMinMaxFilters(std::vector std::allocator >*) [hdfs-parquet-scanner.cc : 1528 + 0x57] > rbx = 0x004a rbp = 0x7f6156771b10 > rsp = 0x7f61567717f0 r12 = 0x2c195800 > r13 = 0x2aa115d0 r14 = 0x0001 > r15 = 0x0049 rip = 0x01b1cf1a > Found by: call frame info > 4 impalad!impala::HdfsParquetScanner::EvaluatePageIndex() > [hdfs-parquet-scanner.cc : 1600 + 0x19] > rbx = 0x7f6156771c30 rbp = 0x7f6156771cf0 > rsp = 0x7f6156771b20 r12 = 0x2c195800 > r13 = 0x7f6156771de8 r14 = 0x104528a0 > r15 = 0x7f6156771df0 rip = 0x01b1d9dd > Found by: call frame info > 5 impalad!impala::HdfsParquetScanner::ProcessPageIndex() > [hdfs-parquet-scanner.cc : 1318 + 0xb] > rbx = 0x2c195800 rbp = 0x7f6156771d70 > rsp = 0x7f6156771d00 r12 = 0x7f6156771d10 > r13 = 0x7f6156771de8 r14 = 0x104528a0 > r15 = 0x7f6156771df0 rip = 0x01b1dd0b > Found by: call frame info > 6 impalad!impala::HdfsParquetScanner::NextRowGroup() > [hdfs-parquet-scanner.cc : 934 + 0xf] > 
rbx = 0x318ce040 rbp = 0x7f6156771e40 > rsp = 0x7f6156771d80 r12 = 0x2c195800 > r13 = 0x7f6156771de8 r14 = 0x104528a0 > r15 = 0x7f6156771df0 rip = 0x01b1e1b4 > Found by: call frame info > 7 impalad!impala::HdfsParquetScanner::GetNextInternal(impala::RowBatch*) > [hdfs-parquet-scanner.cc : 504 + 0xb] > rbx = 0x2c195800 rbp = 0x7f6156771ec0 > rsp = 0x7f6156771e50 r12 = 0xc1ca4d00 > r13 = 0x7f6156771e78 r14 = 0x7f6156771e80 > r15 = 0xaaab rip = 0x01b1ed5b > Found by: call frame info > 8
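The crashing frame is a raw memcmp on string min/max values in CollectSkippedPageRangesForSortedColumn(). Purely as an illustration (this is a hypothesis, not a confirmed root cause), one way such a call can segfault is a string value whose data pointer is null, e.g. from a page carrying no stats like the "ST:[no stat" page in the dump above, being passed to memcmp with a nonzero length. A guarded comparison sketch, with StringVal as a simplified stand-in for Impala's StringValue:

{code:java}
#include <algorithm>
#include <cstring>
#include <iostream>

struct StringVal { const char* ptr; size_t len; };  // simplified StringValue

int SafeCompare(const StringVal& a, const StringVal& b) {
  size_t n = std::min(a.len, b.len);
  // memcmp with a null pointer and n > 0 is undefined behavior -- the exact
  // pattern that can crash inside __memcmp_sse4_1.
  bool comparable = n == 0 || (a.ptr != nullptr && b.ptr != nullptr);
  int c = (comparable && n > 0) ? std::memcmp(a.ptr, b.ptr, n) : 0;
  if (c != 0) return c;
  return a.len < b.len ? -1 : (a.len > b.len ? 1 : 0);
}

int main() {
  StringVal missing{nullptr, 0}, abc{"abc", 3};
  std::cout << SafeCompare(missing, abc) << "\n";  // -1: no data sorts first
  return 0;
}
{code}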
[jira] [Closed] (IMPALA-10758) S3PlannerTest.testNestedCollections fails because of mismatch plan
[ https://issues.apache.org/jira/browse/IMPALA-10758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen closed IMPALA-10758. --- Resolution: Not A Bug Verified that the plan difference does not show up in recent core s3 tests. The test passes. > S3PlannerTest.testNestedCollections fails because of mismatch plan > -- > > Key: IMPALA-10758 > URL: https://issues.apache.org/jira/browse/IMPALA-10758 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Reporter: Yongzhi Chen >Assignee: Qifan Chen >Priority: Critical > > S3PlannerTest.testNestedCollections fails in impala-asf-master-core-s3 with > following error: > {noformat} > Error Message > Section PLAN of query: > select 1 > from tpch_nested_parquet.region.r_nations t1 > inner join tpch_nested_parquet.customer t2 on t2.c_nationkey = t1.pos > inner join tpch_nested_parquet.region t3 on t3.r_comment = t2.c_address > left join t2.c_orders t4 > inner join tpch_nested_parquet.region t5 on t5.r_regionkey = t2.c_custkey > left join t4.item.o_lineitems t6 on t6.item.l_returnflag = > t4.item.o_orderpriority > Actual does not match expected result: > PLAN-ROOT SINK > | > 14:SUBPLAN > | row-size=183B cardinality=1 > | > |--12:SUBPLAN > | | row-size=183B cardinality=1 > | | > | |--10:NESTED LOOP JOIN [RIGHT OUTER JOIN] > | | | join predicates: t6.item.l_returnflag = t4.item.o_orderpriority > | | | row-size=183B cardinality=10 > | | | > | | |--08:SINGULAR ROW SRC > | | | row-size=171B cardinality=1 > | | | > | | 09:UNNEST [t4.item.o_lineitems t6] > | | row-size=0B cardinality=10 > | | > | 11:NESTED LOOP JOIN [RIGHT OUTER JOIN] > | | row-size=171B cardinality=1 > | | > | |--06:SINGULAR ROW SRC > | | row-size=147B cardinality=1 > | | > | 07:UNNEST [t2.c_orders t4] > | row-size=0B cardinality=10 > | > 13:HASH JOIN [INNER JOIN] > | hash predicates: t1.pos = t2.c_nationkey > | runtime filters: RF000 <- t2.c_nationkey, RF001 <- t2.c_nationkey > > | row-size=147B cardinality=1 > | > |--05:HASH JOIN [INNER JOIN] > | | hash predicates: t3.r_comment = t2.c_address > | | runtime filters: RF002 <- t2.c_address > | | row-size=139B cardinality=1 > | | > | |--04:HASH JOIN [INNER JOIN] > | | | hash predicates: t2.c_custkey = t5.r_regionkey > | | | runtime filters: RF004 <- t5.r_regionkey > | | | row-size=61B cardinality=5 > | | | > | | |--03:SCAN S3 [tpch_nested_parquet.region t5] > | | | S3 partitions=1/1 files=1 size=3.59KB > | | | row-size=2B cardinality=5 > | | | > | | 01:SCAN S3 [tpch_nested_parquet.customer t2] > | | S3 partitions=1/1 files=4 size=289.06MB > | | runtime filters: RF004 -> t2.c_custkey > | | row-size=59B cardinality=150.00K > | | > | 02:SCAN S3 [tpch_nested_parquet.region t3] > | S3 partitions=1/1 files=1 size=3.59KB > | runtime filters: RF002 -> t3.r_comment > | row-size=78B cardinality=5 > | > 00:SCAN S3 [tpch_nested_parquet.region.r_nations t1] >S3 partitions=1/1 files=1 size=3.59KB >runtime filters: RF001 -> t1.pos, RF000 -> t1.pos >row-size=8B cardinality=50 > Expected: > PLAN-ROOT SINK > | > 14:SUBPLAN > | row-size=183B cardinality=1 > | > |--12:SUBPLAN > | | row-size=183B cardinality=1 > | | > | |--10:NESTED LOOP JOIN [RIGHT OUTER JOIN] > | | | join predicates: t6.item.l_returnflag = t4.item.o_orderpriority > | | | row-size=183B cardinality=10 > | | | > | | |--08:SINGULAR ROW SRC > | | | row-size=171B cardinality=1 > | | | > | | 09:UNNEST [t4.item.o_lineitems t6] > | | row-size=0B cardinality=10 > | | > | 11:NESTED LOOP JOIN [RIGHT OUTER JOIN] > | | row-size=171B cardinality=1 > | | > | |--06:SINGULAR ROW 
SRC > | | row-size=147B cardinality=1 > | | > | 07:UNNEST [t2.c_orders t4] > | row-size=0B cardinality=10 > | > 13:HASH JOIN [INNER JOIN] > | hash predicates: t1.pos = t2.c_nationkey > | runtime filters: RF000 <- t2.c_nationkey > | row-size=147B cardinality=1 > | > |--05:HASH JOIN [INNER JOIN] > | | hash predicates: t3.r_comment = t2.c_address > | | runtime filters: RF002 <- t2.c_address > | | row-size=139B cardinality=1 > | | > | |--04:HASH JOIN [INNER JOIN] > | | | hash predicates: t2.c_custkey = t5.r_regionkey > | | | runtime filters: RF004 <- t5.r_regionkey > | | | row-size=61B cardinality=5 > | | | > | | |--03:SCAN HDFS [tpch_nested_parquet.region t5] > | | | HDFS partitions=1/1 files=1 size=3.59KB > | | | row-size=2B cardinality=5 > | | | > | | 01:SCAN HDFS [tpch_nested_parquet.customer t2] > | | HDFS partitions=1/1 files=4 size=289.02MB > | | runtime filters: RF004 -> t2.c_custkey > | | row-size=59B
[jira] [Commented] (IMPALA-10758) S3PlannerTest.testNestedCollections fails because of mismatch plan
[ https://issues.apache.org/jira/browse/IMPALA-10758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617191#comment-17617191 ] Qifan Chen commented on IMPALA-10758: - Verified that the bug does not exist in recent core s3 tests. https://master-03.jenkins.cloudera.com/job/impala-asf-master-core-s3/226/testReport/org.apache.impala.planner/S3PlannerTest/testNestedCollections/ https://master-03.jenkins.cloudera.com/job/impala-cdwh-2022.0.10.1-core-s3/4/testReport/org.apache.impala.planner/S3PlannerTest/ > S3PlannerTest.testNestedCollections fails because of mismatch plan > -- > > Key: IMPALA-10758 > URL: https://issues.apache.org/jira/browse/IMPALA-10758 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Reporter: Yongzhi Chen >Assignee: Qifan Chen >Priority: Critical > > S3PlannerTest.testNestedCollections fails in impala-asf-master-core-s3 with > following error: > {noformat} > Error Message > Section PLAN of query: > select 1 > from tpch_nested_parquet.region.r_nations t1 > inner join tpch_nested_parquet.customer t2 on t2.c_nationkey = t1.pos > inner join tpch_nested_parquet.region t3 on t3.r_comment = t2.c_address > left join t2.c_orders t4 > inner join tpch_nested_parquet.region t5 on t5.r_regionkey = t2.c_custkey > left join t4.item.o_lineitems t6 on t6.item.l_returnflag = > t4.item.o_orderpriority > Actual does not match expected result: > PLAN-ROOT SINK > | > 14:SUBPLAN > | row-size=183B cardinality=1 > | > |--12:SUBPLAN > | | row-size=183B cardinality=1 > | | > | |--10:NESTED LOOP JOIN [RIGHT OUTER JOIN] > | | | join predicates: t6.item.l_returnflag = t4.item.o_orderpriority > | | | row-size=183B cardinality=10 > | | | > | | |--08:SINGULAR ROW SRC > | | | row-size=171B cardinality=1 > | | | > | | 09:UNNEST [t4.item.o_lineitems t6] > | | row-size=0B cardinality=10 > | | > | 11:NESTED LOOP JOIN [RIGHT OUTER JOIN] > | | row-size=171B cardinality=1 > | | > | |--06:SINGULAR ROW SRC > | | row-size=147B cardinality=1 > | | > | 07:UNNEST [t2.c_orders t4] > | row-size=0B cardinality=10 > | > 13:HASH JOIN [INNER JOIN] > | hash predicates: t1.pos = t2.c_nationkey > | runtime filters: RF000 <- t2.c_nationkey, RF001 <- t2.c_nationkey > > | row-size=147B cardinality=1 > | > |--05:HASH JOIN [INNER JOIN] > | | hash predicates: t3.r_comment = t2.c_address > | | runtime filters: RF002 <- t2.c_address > | | row-size=139B cardinality=1 > | | > | |--04:HASH JOIN [INNER JOIN] > | | | hash predicates: t2.c_custkey = t5.r_regionkey > | | | runtime filters: RF004 <- t5.r_regionkey > | | | row-size=61B cardinality=5 > | | | > | | |--03:SCAN S3 [tpch_nested_parquet.region t5] > | | | S3 partitions=1/1 files=1 size=3.59KB > | | | row-size=2B cardinality=5 > | | | > | | 01:SCAN S3 [tpch_nested_parquet.customer t2] > | | S3 partitions=1/1 files=4 size=289.06MB > | | runtime filters: RF004 -> t2.c_custkey > | | row-size=59B cardinality=150.00K > | | > | 02:SCAN S3 [tpch_nested_parquet.region t3] > | S3 partitions=1/1 files=1 size=3.59KB > | runtime filters: RF002 -> t3.r_comment > | row-size=78B cardinality=5 > | > 00:SCAN S3 [tpch_nested_parquet.region.r_nations t1] >S3 partitions=1/1 files=1 size=3.59KB >runtime filters: RF001 -> t1.pos, RF000 -> t1.pos >row-size=8B cardinality=50 > Expected: > PLAN-ROOT SINK > | > 14:SUBPLAN > | row-size=183B cardinality=1 > | > |--12:SUBPLAN > | | row-size=183B cardinality=1 > | | > | |--10:NESTED LOOP JOIN [RIGHT OUTER JOIN] > | | | join predicates: t6.item.l_returnflag = t4.item.o_orderpriority > | | | row-size=183B cardinality=10 > | | | > | 
| |--08:SINGULAR ROW SRC > | | | row-size=171B cardinality=1 > | | | > | | 09:UNNEST [t4.item.o_lineitems t6] > | | row-size=0B cardinality=10 > | | > | 11:NESTED LOOP JOIN [RIGHT OUTER JOIN] > | | row-size=171B cardinality=1 > | | > | |--06:SINGULAR ROW SRC > | | row-size=147B cardinality=1 > | | > | 07:UNNEST [t2.c_orders t4] > | row-size=0B cardinality=10 > | > 13:HASH JOIN [INNER JOIN] > | hash predicates: t1.pos = t2.c_nationkey > | runtime filters: RF000 <- t2.c_nationkey > | row-size=147B cardinality=1 > | > |--05:HASH JOIN [INNER JOIN] > | | hash predicates: t3.r_comment = t2.c_address > | | runtime filters: RF002 <- t2.c_address > | | row-size=139B cardinality=1 > | | > | |--04:HASH JOIN [INNER JOIN] > | | | hash predicates: t2.c_custkey = t5.r_regionkey > | | | runtime filters: RF004 <- t5.r_regionkey > | | | row-size=61B cardinality=5 > | | | > | | |--03:SCAN HDFS [tpch_nested_parquet.region t5] > | | | HDFS partitions=1/1 files=1
[jira] [Closed] (IMPALA-10292) Improvement to test_misaligned_parquet_row_groups section in query_test/test_scanners.py
[ https://issues.apache.org/jira/browse/IMPALA-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen closed IMPALA-10292. --- Resolution: Won't Fix The "Won't Fix" resolution is based on Joe's comment. > Improvement to test_misaligned_parquet_row_groups section in > query_test/test_scanners.py > > > Key: IMPALA-10292 > URL: https://issues.apache.org/jira/browse/IMPALA-10292 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Major > Labels: broken-build, flaky > > In /impala-asf-master-exhaustive build, the following error is seen. > Error Details > {code:java} > query_test/test_scanners.py:603: in test_misaligned_parquet_row_groups > self._misaligned_parquet_row_groups_helper(table_name, 7300) > query_test/test_scanners.py:636: in _misaligned_parquet_row_groups_helper > assert len(num_scanners_with_no_reads_list) == 4 E assert 3 == 4 E+ > where 3 = len(['0', '0', '0']) > {code} > Stack Trace > {code:java} > query_test/test_scanners.py:603: in test_misaligned_parquet_row_groups > self._misaligned_parquet_row_groups_helper(table_name, 7300) > query_test/test_scanners.py:636: in _misaligned_parquet_row_groups_helper > assert len(num_scanners_with_no_reads_list) == 4 > E assert 3 == 4 > E+ where 3 = len(['0', '0', '0']) > {code}
[jira] [Work started] (IMPALA-11604) Planner changes for CPU usage
[ https://issues.apache.org/jira/browse/IMPALA-11604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-11604 started by Qifan Chen. --- > Planner changes for CPU usage > - > > Key: IMPALA-11604 > URL: https://issues.apache.org/jira/browse/IMPALA-11604 > Project: IMPALA > Issue Type: Improvement >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Major > > Plan scaling based on estimated peak memory has been enabled in > IMPALA-10992. However, it is sometimes desirable to consider CPU usage (such > as the amount of data processed) as a scaling factor.
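To make the idea concrete, here is a hedged arithmetic sketch of scaling by whichever resource dominates. Every name and constant below is an illustrative assumption, not the planner's actual model: the CPU side is approximated by bytes processed against a hypothetical per-node processing budget.

{code:java}
#include <algorithm>
#include <cmath>
#include <cstdio>

int main() {
  double est_peak_mem_bytes  = 96e9;   // planner's per-query memory estimate
  double est_bytes_processed = 2e12;   // CPU proxy: data scanned/processed
  double mem_per_node        = 32e9;   // hypothetical executor memory
  double bytes_per_node      = 4e11;   // hypothetical per-node CPU budget
  int by_mem = static_cast<int>(std::ceil(est_peak_mem_bytes / mem_per_node));
  int by_cpu = static_cast<int>(std::ceil(est_bytes_processed / bytes_per_node));
  // Scale to whichever resource needs more nodes.
  std::printf("nodes: %d (memory wants %d, cpu wants %d)\n",
              std::max(by_mem, by_cpu), by_mem, by_cpu);
  return 0;
}
{code}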
[jira] [Resolved] (IMPALA-10999) Flakiness in TestAsyncLoadData.test_async_load
[ https://issues.apache.org/jira/browse/IMPALA-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen resolved IMPALA-10999. - Resolution: Fixed > Flakiness in TestAsyncLoadData.test_async_load > -- > > Key: IMPALA-10999 > URL: https://issues.apache.org/jira/browse/IMPALA-10999 > Project: IMPALA > Issue Type: Bug >Reporter: Bikramjeet Vig >Assignee: Qifan Chen >Priority: Major > Labels: broken-build, flaky-test > > This test failed in one of the GVO's recently. > [Link|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/15097/testReport/junit/metadata.test_load/TestAsyncLoadData/test_async_load_enable_async_load_data_execution__False___protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/] > > {noformat} > Error Message > metadata/test_load.py:197: in test_async_load assert(exec_end_state == > finished_state) E assert 3 == 4 > Stacktrace > metadata/test_load.py:197: in test_async_load > assert(exec_end_state == finished_state) > E assert 3 == 4 > Standard Error > SET > client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node; > -- connecting to: localhost:21000 > -- connecting to localhost:21050 with impyla > -- 2021-10-30 01:38:55,203 INFO MainThread: Closing active operation > -- connecting to localhost:28000 with impyla > -- 2021-10-30 01:38:55,237 INFO MainThread: Closing active operation > SET > client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node; > SET sync_ddl=False; > -- executing against localhost:21000 > DROP DATABASE IF EXISTS `test_async_load_ff1c20a7` CASCADE; > -- 2021-10-30 01:38:55,281 INFO MainThread: Started query > df43a0ff6165a9eb:33b0d69f > SET > client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node; > SET sync_ddl=False; > -- executing against localhost:21000 > CREATE DATABASE `test_async_load_ff1c20a7`; > -- 2021-10-30 01:39:01,148 INFO MainThread: Started query > e64bd28a97339b44:e76523a8 > -- 2021-10-30 01:39:01,253 INFO MainThread: Created database > "test_async_load_ff1c20a7" for test ID > "metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution: > False | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > text/none]" > SET > client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node; > -- connecting to: localhost:21000 > -- executing against localhost:21000 > create table 
test_async_load_ff1c20a7.test_load_nopart_beeswax_False like > functional.alltypesnopart location > '/test-warehouse/test_load_staging_beeswax_False'; > -- 2021-10-30 01:39:09,435 INFO MainThread: Started query > e543635533874c9e:fe238ca9 > -- executing against localhost:21000 > select count(*) from test_async_load_ff1c20a7.test_load_nopart_beeswax_False; > -- 2021-10-30 01:39:13,178 INFO MainThread: Started query > 5c4969e81b1b614b:26754a22 > SET > client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node; > -- executing against localhost:21000 > use functional; > -- 2021-10-30 01:39:13,413 INFO MainThread: Started query > d340e3650cba2d6f:a35a14bb > SET > client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node; > SET batch_size=0; >
[jira] [Resolved] (IMPALA-11573) Certain methods used by the replan feature can be improved
[ https://issues.apache.org/jira/browse/IMPALA-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen resolved IMPALA-11573. - Fix Version/s: Impala 4.1.1 Resolution: Fixed > Certain methods used by the replan feature can be improved > --- > > Key: IMPALA-11573 > URL: https://issues.apache.org/jira/browse/IMPALA-11573 > Project: IMPALA > Issue Type: Improvement >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Major > Fix For: Impala 4.1.1 > > > Certain methods for replanning (IMPALA-10992) are not suitable to be called > from Hive. For example setupThresholdsForExecutorGroupSets() and > canStmtBeAutoScaled() in Frontend.java are not static. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
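A minimal sketch of the kind of refactor being described, assuming the two helpers can take their inputs as parameters instead of reading Frontend instance state; the signatures, parameter types, and bodies below are invented for illustration and are not the actual Frontend.java code.
{code:java}
// Hypothetical sketch only. Making the replan helpers static lets an external
// caller such as Hive invoke them without constructing a Frontend instance;
// all types and signatures here are placeholders, not Impala's real code.
import java.util.List;

public class Frontend {
  // Stand-in for the real executor group set metadata.
  public static class ExecutorGroupSet {
    public long memThresholdBytes;
  }

  // Static, with the needed state passed in explicitly rather than read
  // from Frontend fields.
  public static void setupThresholdsForExecutorGroupSets(
      List<ExecutorGroupSet> groupSets, long defaultThresholdBytes) {
    for (ExecutorGroupSet set : groupSets) {
      if (set.memThresholdBytes <= 0) set.memThresholdBytes = defaultThresholdBytes;
    }
  }

  public static boolean canStmtBeAutoScaled(boolean isQueryOrDml, boolean isExplain) {
    // Only query-like statements benefit from replanning across group sets.
    return isQueryOrDml && !isExplain;
  }
}
{code}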
[jira] [Resolved] (IMPALA-10715) test_decimal_min_max_filters failed in exhaustive run
[ https://issues.apache.org/jira/browse/IMPALA-10715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen resolved IMPALA-10715. - Resolution: Fixed Disabled the bloom filters for the entire set of decimal min/max filter tests. The tests are looking for the impact of the min/max filters, which should not be interfered with by the bloom filters. > test_decimal_min_max_filters failed in exhaustive run > - > > Key: IMPALA-10715 > URL: https://issues.apache.org/jira/browse/IMPALA-10715 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Zoltán Borók-Nagy >Assignee: Qifan Chen >Priority: Major > Labels: broken-build > > test_decimal_min_max_filters failed in exhaustive run > *Stack Trace* > {noformat} > query_test/test_runtime_filters.py:223: in test_decimal_min_max_filters > test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS': str(WAIT_TIME_MS)}) > common/impala_test_suite.py:775: in run_test_case > update_section=pytest.config.option.update_results) > common/test_result_verifier.py:653: in verify_runtime_profile > % (function, field, expected_value, actual_value, op, actual)) > E AssertionError: Aggregation of SUM over ProbeRows did not match expected > results. > E EXPECTED VALUE: > E 102 > E > E > E ACTUAL VALUE: > E 38 > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
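As background on the intent of the fix: with bloom filters disabled, only the min/max filters can reduce ProbeRows, so the aggregation check above measures exactly the feature under test. Below is a hedged sketch of pinning a client session to min/max filters; the ENABLED_RUNTIME_FILTER_TYPES option name and the JDBC URL are my assumptions about the knobs involved, not details taken from this ticket.
{code:java}
// Illustrative sketch only: restrict a session to min/max runtime filters so
// bloom filters cannot mask their effect on ProbeRows. The query option name
// and the connection URL are assumptions, not taken from this ticket.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class MinMaxOnlySession {
  public static void main(String[] args) throws Exception {
    // Assumes the Hive JDBC driver talking to impalad's HS2 port.
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://localhost:21050/default;auth=noSasl");
         Statement stmt = conn.createStatement()) {
      stmt.execute("SET ENABLED_RUNTIME_FILTER_TYPES=MIN_MAX");
      try (ResultSet rs = stmt.executeQuery(
               "select count(*) from functional.alltypes a"
                   + " join functional.alltypestiny b on a.id = b.id")) {
        while (rs.next()) System.out.println(rs.getLong(1));
      }
    }
  }
}
{code}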
[jira] [Work started] (IMPALA-10715) test_decimal_min_max_filters failed in exhaustive run
[ https://issues.apache.org/jira/browse/IMPALA-10715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-10715 started by Qifan Chen. --- > test_decimal_min_max_filters failed in exhaustive run > - > > Key: IMPALA-10715 > URL: https://issues.apache.org/jira/browse/IMPALA-10715 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Zoltán Borók-Nagy >Assignee: Qifan Chen >Priority: Major > Labels: broken-build > > test_decimal_min_max_filters failed in exhaustive run > *Stack Trace* > {noformat} > query_test/test_runtime_filters.py:223: in test_decimal_min_max_filters > test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS': str(WAIT_TIME_MS)}) > common/impala_test_suite.py:775: in run_test_case > update_section=pytest.config.option.update_results) > common/test_result_verifier.py:653: in verify_runtime_profile > % (function, field, expected_value, actual_value, op, actual)) > E AssertionError: Aggregation of SUM over ProbeRows did not match expected > results. > E EXPECTED VALUE: > E 102 > E > E > E ACTUAL VALUE: > E 38 > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-11652) In org.apache.impala.planner.PlannerTest.testHbase, the selected range does not match the expected
[ https://issues.apache.org/jira/browse/IMPALA-11652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen updated IMPALA-11652: Summary: In org.apache.impala.planner.PlannerTest.testHbase, the selected range does not match the expected (was: In org.apache.impala.planner.PlannerTest.testHbase, the selected range does not match with expected) > In org.apache.impala.planner.PlannerTest.testHbase, the selected range does > not match the expected > -- > > Key: IMPALA-11652 > URL: https://issues.apache.org/jira/browse/IMPALA-11652 > Project: IMPALA > Issue Type: Bug > Components: Distributed Exec >Reporter: Qifan Chen >Priority: Major > > org.apache.impala.planner.PlannerTest.testHbase > Error Message > {code:java} > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where id < '5' > and tinyint_col = 5 > Actual does not match expected result: > HBASE KEYRANGE 1:3 > > HBASE KEYRANGE 3:5 > HBASE KEYRANGE :1 > NODE 0: > Expected: > HBASE KEYRANGE 3:5 > HBASE KEYRANGE :3 > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.alltypesagg > where bigint_col is not null and bool_col = true > Actual does not match expected result: > HBASE KEYRANGE 1:3 > > HBASE KEYRANGE 3:7 > HBASE KEYRANGE 7: > HBASE KEYRANGE :1 > NODE 0: > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-11652) In org.apache.impala.planner.PlannerTest.testHbase, the selected range does not match with expected
Qifan Chen created IMPALA-11652: --- Summary: In org.apache.impala.planner.PlannerTest.testHbase, the selected range does not match with expected Key: IMPALA-11652 URL: https://issues.apache.org/jira/browse/IMPALA-11652 Project: IMPALA Issue Type: Bug Components: Distributed Exec Reporter: Qifan Chen org.apache.impala.planner.PlannerTest.testHbase Error Message {code:java} section SCANRANGELOCATIONS of query: select * from functional_hbase.stringids where id < '5' and tinyint_col = 5 Actual does not match expected result: HBASE KEYRANGE 1:3 HBASE KEYRANGE 3:5 HBASE KEYRANGE :1 NODE 0: Expected: HBASE KEYRANGE 3:5 HBASE KEYRANGE :3 NODE 0: section SCANRANGELOCATIONS of query: select * from functional_hbase.alltypesagg where bigint_col is not null and bool_col = true Actual does not match expected result: HBASE KEYRANGE 1:3 HBASE KEYRANGE 3:7 HBASE KEYRANGE 7: HBASE KEYRANGE :1 NODE 0: {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IMPALA-11651) Number of files mismatch assertion in PartitionMetadataUncompressedTextOnly.test_unsupported_text_compression
Qifan Chen created IMPALA-11651: --- Summary: Number of files mismatch assertion in PartitionMetadataUncompressedTextOnly.test_unsupported_text_compression Key: IMPALA-11651 URL: https://issues.apache.org/jira/browse/IMPALA-11651 Project: IMPALA Issue Type: Bug Components: Distributed Exec Reporter: Qifan Chen In impala-asf-master-core-s3-data-cache : #217 : metadata.test_partition_metadata.TestPartitionMetadataUncompressedTextOnly.test_unsupported_text_compression[protocol: beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: text/none] (from pytest) {code:java} metadata/test_partition_metadata.py:214: in test_unsupported_text_compression assert len(show_files_result.data) == 5, "Expected one file per partition dir" E AssertionError: Expected one file per partition dir E assert 2 == 5 E+ where 2 = len(['s3a://impala-test-uswest2-2/test-warehouse/alltypes/year=2009/month=1/090101.txt\t19.95KB\tyear=2009/month=1', 's3a://impala-test-uswest2-2/test-warehouse/alltypes_text_gzip/year=2009/month=2/00_0.gz\t3.00KB\tyear=2009/month=2']) E+where ['s3a://impala-test-uswest2-2/test-warehouse/alltypes/year=2009/month=1/090101.txt\t19.95KB\tyear=2009/month=1', 's3a://impala-test-uswest2-2/test-warehouse/alltypes_text_gzip/year=2009/month=2/00_0.gz\t3.00KB\tyear=2009/month=2'] = .data {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-11650) Missing Blacklisted Executors list in custom_cluster.test_query_retries.TestQueryRetries.test_retry_exec_rpc_failure
Qifan Chen created IMPALA-11650: --- Summary: Missing Blacklisted Executors list in custom_cluster.test_query_retries.TestQueryRetries.test_retry_exec_rpc_failure Key: IMPALA-11650 URL: https://issues.apache.org/jira/browse/IMPALA-11650 Project: IMPALA Issue Type: Bug Components: Distributed Exec Reporter: Qifan Chen In impala-asf-master-core-s3-data-cache #217: custom_cluster.test_query_retries.TestQueryRetries.test_retry_exec_rpc_failure (from pytest) {code:java} custom_cluster/test_query_retries.py:276: in test_retry_exec_rpc_failure self.__assert_executors_blacklisted(killed_impalad, retried_runtime_profile) custom_cluster/test_query_retries.py:1091: in __assert_executors_blacklisted assert "Blacklisted Executors: {0}:{1}".format(blacklisted_impalad.hostname, 1088 def __assert_executors_blacklisted(self, blacklisted_impalad, profile): 1089 """Validate that the given profile indicates that the given impalad was blacklisted 1090 during query execution.""" 1091 assert "Blacklisted Executors: {0}:{1}".format(blacklisted_impalad.hostname, 1092 blacklisted_impalad.service.krpc_port) in profile, profile {code} This is the link to the test: https://master-03.jenkins.cloudera.com/job/impala-asf-master-core-s3-data-cache/217/testReport/junit/custom_cluster.test_query_retries/TestQueryRetries/test_retry_exec_rpc_failure/ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-11649) Null pointer exception seen in org.apache.impala.catalog.ParallelFileMetadataLoader in impala-asf-master-core-s3 test
Qifan Chen created IMPALA-11649: --- Summary: Null pointer exception seen in org.apache.impala.catalog.ParallelFileMetadataLoader in impala-asf-master-core-s3 test Key: IMPALA-11649 URL: https://issues.apache.org/jira/browse/IMPALA-11649 Project: IMPALA Issue Type: Bug Components: Catalog Reporter: Qifan Chen https://master-03.jenkins.cloudera.com/job/impala-asf-master-core-s3/225/ Failed generate_junitxml.buildall.create-load-data (from generate_junitxml.buildall.create-load-data) Failing for the past 1 build (Since #225 ) Took 0 ms. Error Message Error in /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/testdata/bin/create-load-data.sh at line 95: -timeout) SQL {code:java} 06:06:05 ERROR: INSERT into TABLE functional_kudu.alltypes 06:06:05 SELECT id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, float_col, double_col, date_string_col, string_col, 06:06:05timestamp_col, year, month 06:06:05 FROM functional.alltypes {code} SQL error: {code:java} 06:06:05 ImpalaBeeswaxException: ImpalaBeeswaxException: 06:06:05 INNER EXCEPTION: 06:06:05 MESSAGE: AnalysisException: Failed to load metadata for table: 'functional.alltypes' 06:06:05 CAUSED BY: TableLoadingException: Loading file and block metadata for 24 paths for table functional.alltypes: failed to load 2 paths. Check the catalog server log for more details. {code} Catalog server log: {code:java} Log file created at: 2022/10/09 04:03:47 Running on machine: impala-ec2-centos79-m6i-4xlarge-ondemand-005a.vpc.cloudera.com Log line format: [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg E1009 04:03:47.665833 14343 logging.cc:248] stderr will be logged to this file. 22/10/09 04:03:48 WARN impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties 22/10/09 04:03:48 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s). 22/10/09 04:03:48 INFO impl.MetricsSystemImpl: s3a-file-system metrics system started 22/10/09 04:03:49 INFO Configuration.deprecation: No unit for fs.s3a.connection.request.timeout(0) assuming SECONDS 22/10/09 04:03:49 INFO impl.MetricsSystemImpl: Stopping s3a-file-system metrics system... 22/10/09 04:03:49 INFO impl.MetricsSystemImpl: s3a-file-system metrics system stopped. 22/10/09 04:03:49 INFO impl.MetricsSystemImpl: s3a-file-system metrics system shutdown complete. 
22/10/09 04:03:49 INFO util.JvmPauseMonitor: Starting JVM pause monitor E1009 04:06:05.058591 19943 ParallelFileMetadataLoader.java:171] Loading file and block metadata for 24 paths for table functional.alltypes encountered an error loading data for path s3a://impala-test-uswest2-2/test-warehouse/alltypes/year=2010/month=4 Java exception follows: {code} {code:java} java.util.concurrent.ExecutionException: java.lang.NullPointerException at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at org.apache.impala.catalog.ParallelFileMetadataLoader.loadInternal(ParallelFileMetadataLoader.java:168) at org.apache.impala.catalog.ParallelFileMetadataLoader.load(ParallelFileMetadataLoader.java:120) at org.apache.impala.catalog.HdfsTable.loadFileMetadataForPartitions(HdfsTable.java:781) at org.apache.impala.catalog.HdfsTable.loadFileMetadataForPartitions(HdfsTable.java:744) at org.apache.impala.catalog.HdfsTable.loadAllPartitions(HdfsTable.java:719) at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:1268) at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:1162) at org.apache.impala.catalog.TableLoader.load(TableLoader.java:144) at org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:245) at org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:242) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.NullPointerException at org.apache.hadoop.fs.s3a.Listing$ObjectListingIterator.<init>(Listing.java:621) at org.apache.hadoop.fs.s3a.Listing.createObjectListingIterator(Listing.java:163) at org.apache.hadoop.fs.s3a.Listing.createFileStatusListingIterator(Listing.java:144) at org.apache.hadoop.fs.s3a.Listing.getListFilesAssumingDir(Listing.java:212) at org.apache.hadoop.fs.s3a.S3AFileSystem.innerListFiles(S3AFileSystem.java:4790) at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listFiles$37(S3AFileSystem.java:4732) at
[jira] [Commented] (IMPALA-11647) Row size for source tables in a cross join query is set to 0 in query plan
[ https://issues.apache.org/jira/browse/IMPALA-11647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615283#comment-17615283 ] Qifan Chen commented on IMPALA-11647: - The output width from the scan being 0B instead of 8B is due to this line of code: https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/planner/ScanNode.java#L160. Once the restriction is relaxed, we can get a better plan, where the row size is 8B and the # of rows is the # of files in the table. > Row size for source tables in a cross join query is set to 0 in query plan > -- > > Key: IMPALA-11647 > URL: https://issues.apache.org/jira/browse/IMPALA-11647 > Project: IMPALA > Issue Type: Improvement > Components: Frontend >Reporter: Qifan Chen >Priority: Major > > The row-size in the following explain output for both source tables is set to > 0B. On paper, it is possible to apply the count star optimization for such > queries and therefore set the row-size correctly. > {code:java} > explain select count(*) from store_sales a, store_sales b limit 500 > +--+ > | Explain String | > +--+ > | Max Per-Host Resource Reservation: Memory=256.00KB Threads=5 | > | Per-Host Resource Estimates: Memory=10MB | > | | > | PLAN-ROOT SINK | > | || > | 06:AGGREGATE [FINALIZE] | > | | output: count:merge(*)| > | | limit: 500| > | | row-size=8B cardinality=1 | > | || > | 05:EXCHANGE [UNPARTITIONED] | > | || > | 03:AGGREGATE | > | | output: count(*) | > | | row-size=8B cardinality=1 | > | || > | 02:NESTED LOOP JOIN [CROSS JOIN, BROADCAST] | > | | row-size=0B cardinality=8.30T | > | || > | |--04:EXCHANGE [BROADCAST] | > | | | | > | | 01:SCAN HDFS [tpcds_parquet.store_sales b]| > | | HDFS partitions=1824/1824 files=1824 size=199.83MB | > | | row-size=0B cardinality=2.88M | > | || > | 00:SCAN HDFS [tpcds_parquet.store_sales a] | > |HDFS partitions=1824/1824 files=1824 size=199.83MB| > |row-size=0B cardinality=2.88M | > +--+ > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
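To make the comment concrete, here is a hypothetical sketch of the relaxation being described; the actual condition at ScanNode.java#L160 is not reproduced here, and all names are invented.
{code:java}
// Hypothetical sketch, not the actual ScanNode.java code. An optimized
// count(*) scan materializes no column slots, but each returned row still
// carries one 8-byte BIGINT count, so the row size need not be clamped to 0.
final class RowSizeSketch {
  static long computeRowSizeBytes(boolean optimizedCountStar, long materializedSlotBytes) {
    if (optimizedCountStar) {
      return 8; // one BIGINT slot holding the per-row-group count
    }
    return materializedSlotBytes;
  }

  public static void main(String[] args) {
    System.out.println(computeRowSizeBytes(true, 0));   // 8, not 0
    System.out.println(computeRowSizeBytes(false, 24)); // 24
  }
}
{code}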
[jira] [Created] (IMPALA-11647) Row size for source tables in a cross join query is set to 0 in query plan
Qifan Chen created IMPALA-11647: --- Summary: Row size for source tables in a cross join query is set to 0 in query plan Key: IMPALA-11647 URL: https://issues.apache.org/jira/browse/IMPALA-11647 Project: IMPALA Issue Type: Improvement Components: Frontend Reporter: Qifan Chen The row-size in the following explain output for both source tables is set to 0B. On paper, it is possible to apply the count star optimization for such queries and therefore set the row-size correctly. {code:java} explain select count(*) from store_sales a, store_sales b limit 500 +--+ | Explain String | +--+ | Max Per-Host Resource Reservation: Memory=256.00KB Threads=5 | | Per-Host Resource Estimates: Memory=10MB | | | | PLAN-ROOT SINK | | || | 06:AGGREGATE [FINALIZE] | | | output: count:merge(*)| | | limit: 500| | | row-size=8B cardinality=1 | | || | 05:EXCHANGE [UNPARTITIONED] | | || | 03:AGGREGATE | | | output: count(*) | | | row-size=8B cardinality=1 | | || | 02:NESTED LOOP JOIN [CROSS JOIN, BROADCAST] | | | row-size=0B cardinality=8.30T | | || | |--04:EXCHANGE [BROADCAST] | | | | | | | 01:SCAN HDFS [tpcds_parquet.store_sales b]| | | HDFS partitions=1824/1824 files=1824 size=199.83MB | | | row-size=0B cardinality=2.88M | | || | 00:SCAN HDFS [tpcds_parquet.store_sales a] | |HDFS partitions=1824/1824 files=1824 size=199.83MB| |row-size=0B cardinality=2.88M | +--+ {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-11617) Pool service should be made aware of cpu-usage limit for each executor group set
Qifan Chen created IMPALA-11617: --- Summary: Pool service should be made aware of cpu-usage limit for each executor group set Key: IMPALA-11617 URL: https://issues.apache.org/jira/browse/IMPALA-11617 Project: IMPALA Issue Type: Improvement Reporter: Qifan Chen IMPALA-11604 enables the planner to compute CPU usage for certain queries and to select suitable executor groups to run them on. Here the CPU usage is expressed as the total amount of data to be processed per instance. The limit on the total amount of data that each executor group can handle should be provided by the pool service. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
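A sketch, under invented names, of the per-group-set limit the pool service could publish; none of these classes or fields are confirmed by the ticket.
{code:java}
// Illustrative data holder only; the class and field names are assumptions.
// The pool service would expose one of these per executor group set so the
// planner can compare a query's estimated per-instance data volume to it.
public final class ExecutorGroupSetLimits {
  public final String groupSetPrefix;             // e.g. "small", "large"
  public final long maxMemPerHostBytes;           // existing memory threshold
  public final long maxProcessedBytesPerInstance; // proposed cpu-usage limit

  public ExecutorGroupSetLimits(String prefix, long memBytes, long processedBytes) {
    this.groupSetPrefix = prefix;
    this.maxMemPerHostBytes = memBytes;
    this.maxProcessedBytesPerInstance = processedBytes;
  }
}
{code}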
[jira] [Assigned] (IMPALA-11604) Planner changes for CPU usage
[ https://issues.apache.org/jira/browse/IMPALA-11604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen reassigned IMPALA-11604: --- Assignee: Qifan Chen > Planner changes for CPU usage > - > > Key: IMPALA-11604 > URL: https://issues.apache.org/jira/browse/IMPALA-11604 > Project: IMPALA > Issue Type: Improvement >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Major > > Plan scaling based on estimated peak memory has been enabled in > IMPALA-10992. However, it is sometimes desirable to consider CPU usage (such > as the amount of data processed) as a scaling factor. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-11604) Planner changes for CPU usage
Qifan Chen created IMPALA-11604: --- Summary: Planner changes for CPU usage Key: IMPALA-11604 URL: https://issues.apache.org/jira/browse/IMPALA-11604 Project: IMPALA Issue Type: Improvement Reporter: Qifan Chen Plan scaling based on estimated peak memory has been enabled in IMPALA-10992. However, it is sometimes desirable to consider CPU usage (such as the amount of data processed) as a scaling factor. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
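To ground the idea, a minimal sketch of the selection step this enables, reusing the illustrative ExecutorGroupSetLimits holder sketched under IMPALA-11617 above and assuming group sets are ordered from smallest to largest; all names are invented.
{code:java}
// Illustrative only: pick the first executor group set whose cpu-usage limit
// can absorb the query's estimated per-instance data volume.
import java.util.List;

final class GroupSetChooser {
  static ExecutorGroupSetLimits choose(
      long estimatedBytesPerInstance, List<ExecutorGroupSetLimits> orderedSets) {
    for (ExecutorGroupSetLimits set : orderedSets) {
      if (estimatedBytesPerInstance <= set.maxProcessedBytesPerInstance) return set;
    }
    // Nothing fits: fall back to the largest group set rather than fail.
    return orderedSets.get(orderedSets.size() - 1);
  }
}
{code}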
[jira] [Updated] (IMPALA-11573) Certain methods used by the replan feature can be improved
[ https://issues.apache.org/jira/browse/IMPALA-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen updated IMPALA-11573: Description: Certain methods for replanning (IMPALA-10992) are not suitable to be called from Hive. For example setupThresholdsForExecutorGroupSets() and canStmtBeAutoScaled() in Frontend.java are not static. (was: Certain methods for auto-scaling are not suitable to be called from Hive. For example setupThresholdsForExecutorGroupSets() and canStmtBeAutoScaled() in Frontend.java are not static. ) Summary: Certain methods used by the replan feature can be improved (was: Certain methods used by the auto-scaling feature can be improved) > Certain methods used by the replan feature can be improved > --- > > Key: IMPALA-11573 > URL: https://issues.apache.org/jira/browse/IMPALA-11573 > Project: IMPALA > Issue Type: Improvement >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Major > > Certain methods for replanning (IMPALA-10992) are not suitable to be called > from Hive. For example setupThresholdsForExecutorGroupSets() and > canStmtBeAutoScaled() in Frontend.java are not static. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-11573) Certain methods used by the auto-scaling feature can be improved
Qifan Chen created IMPALA-11573: --- Summary: Certain methods used by the auto-scaling feature can be improved Key: IMPALA-11573 URL: https://issues.apache.org/jira/browse/IMPALA-11573 Project: IMPALA Issue Type: Improvement Reporter: Qifan Chen Certain methods for auto-scaling are not suitable to be called from Hive. For example setupThresholdsForExecutorGroupSets() and canStmtBeAutoScaled() in Frontend.java are not static. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-11573) Certain methods used by the auto-scaling feature can be improved
[ https://issues.apache.org/jira/browse/IMPALA-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen reassigned IMPALA-11573: --- Assignee: Qifan Chen > Certain methods used by the auto-scaling feature can be improved > - > > Key: IMPALA-11573 > URL: https://issues.apache.org/jira/browse/IMPALA-11573 > Project: IMPALA > Issue Type: Improvement >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Major > > Certain methods for auto-scaling are not suitable to be called from Hive. > For example setupThresholdsForExecutorGroupSets() and canStmtBeAutoScaled() > in Frontend.java are not static. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-11274) CNF Rewrite causes a regression in join node performance
[ https://issues.apache.org/jira/browse/IMPALA-11274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528959#comment-17528959 ] Qifan Chen commented on IMPALA-11274: - The DDL for the above query. {code:java} create table if not exists a1 ( c1 string ) STORED AS PARQUET; create table if not exists a4 ( customerkey string ) STORED AS PARQUET; create table if not exists a5 ( customerkey3024 string ) STORED AS PARQUET ; drop table if exists p; create table if not exists p ( client5171 string, clientsms5171 string, email1dc string, email2dc string, email5153 string, email5170 string, email5171 string, global5170 string, sms3dc string, sms5171 string, system5171 string, systemsms5171 string, systemsms string ) STORED AS PARQUET; {code} > CNF Rewrite causes a regression in join node performance > - > > Key: IMPALA-11274 > URL: https://issues.apache.org/jira/browse/IMPALA-11274 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Major > > It appears that the CNF rewrite can generate more predicates, which presumably causes > the same query to execute more slowly. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Comment Edited] (IMPALA-11274) CNF Rewrite causes a regression in join node performance
[ https://issues.apache.org/jira/browse/IMPALA-11274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528949#comment-17528949 ] Qifan Chen edited comment on IMPALA-11274 at 4/27/22 6:22 PM: -- For the following query {code:java} set explain_level = 3; explain select * from p, a1, a4, a5 where ( ( coalesce(CAST(a1.c1 AS string), '') != '' ) OR ( ( ( upper(p.email5153) = '1' ) OR ( upper(p.email5171) = 'wjn...@yahoo.com ' ) OR ( ( upper(p.email5171) LIKE '%GMAI.COM' ) AND ( coalesce(CAST(a4.customerkey AS string), '') = '' ) ) OR ( upper(p.email5171) = 'CLARIANT.COM' ) OR ( upper(p.email5171) = 'YAHOO.COM' ) OR ( upper(p.email5171) LIKE '%ELECTROMAILS.COM' ) ) AND ( ( upper(p.global5170) != 'Y' ) OR ( coalesce(CAST(p.global5170 AS string), '') = '' ) ) AND ( ( upper(p.email5170) != 'Y' ) OR ( coalesce(CAST(p.email5170 AS string), '') = '' ) ) AND ( ( upper(p.sms5171) != 'Y' ) OR ( coalesce(CAST(p.sms5171 AS string), '') = '' ) ) AND ( upper(coalesce(p.client5171, 'G')) = 'G' ) AND ( coalesce(CAST(p.email2dc AS string), '') = '' ) AND ( coalesce(CAST(p.email1dc AS string), '') = '' ) AND ( upper(coalesce(p.system5171, 'G')) = 'G' ) AND ( upper(coalesce(p.clientsms5171, 'G')) = 'G' ) AND ( coalesce(CAST(p.sms3dc AS string), '') = '' ) AND ( upper(coalesce(p.systemsms5171, 'G')) = 'G' ) ) OR ( upper(p.email5153) = '4' ) OR ( ( upper(p.email5153) = '3' ) AND ( ( upper(p.global5170) != 'Y' ) OR ( coalesce(CAST(p.global5170 AS string), '') = '' ) ) AND ( ( upper(p.email5170) != 'Y' ) OR ( coalesce(CAST(p.email5170 AS string), '') = '' ) ) AND ( ( upper(p.sms5171) != 'Y' )
[jira] [Assigned] (IMPALA-11274) CNF Rewrite causes a regression in join node performance
[ https://issues.apache.org/jira/browse/IMPALA-11274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen reassigned IMPALA-11274: --- Assignee: Qifan Chen > CNF Rewrite causes a regression in join node performance > - > > Key: IMPALA-11274 > URL: https://issues.apache.org/jira/browse/IMPALA-11274 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Major > > It appears that the CNF rewrite can generate more predicates, which presumably causes > the same query to execute more slowly. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-11274) CNF Rewrite causes a regression in join node performance
Qifan Chen created IMPALA-11274: --- Summary: CNF Rewrite causes a regression in join node performance Key: IMPALA-11274 URL: https://issues.apache.org/jira/browse/IMPALA-11274 Project: IMPALA Issue Type: Bug Components: Frontend Reporter: Qifan Chen It appears that the CNF rewrite can generate more predicates, which presumably causes the same query to execute more slowly. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
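To make the predicate growth concrete: converting a disjunction of conjunctions to CNF distributes OR over AND, so conjunct counts multiply. The toy program below is not Impala's rewrite rule, just an illustration of the blow-up the ticket describes.
{code:java}
// Toy illustration of CNF expansion. With n conjuncts on one side of the OR
// and m on the other, CNF yields n*m conjuncts, each of which a join node
// may then have to evaluate per row.
import java.util.ArrayList;
import java.util.List;

public class CnfBlowup {
  public static void main(String[] args) {
    List<String> left = List.of("A", "B");   // (A AND B)
    List<String> right = List.of("C", "D");  // (C AND D)
    List<String> cnf = new ArrayList<>();
    for (String l : left) {
      for (String r : right) cnf.add("(" + l + " OR " + r + ")");
    }
    // Prints: (A OR C) AND (A OR D) AND (B OR C) AND (B OR D)
    System.out.println(String.join(" AND ", cnf));
  }
}
{code}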
[jira] [Resolved] (IMPALA-10992) Planner changes for estimated peak memory
[ https://issues.apache.org/jira/browse/IMPALA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen resolved IMPALA-10992. - Fix Version/s: Impala 4.1.0 Resolution: Fixed > Planner changes for estimated peak memory > - > > Key: IMPALA-10992 > URL: https://issues.apache.org/jira/browse/IMPALA-10992 > Project: IMPALA > Issue Type: Task >Reporter: Amogh Margoor >Assignee: Qifan Chen >Priority: Critical > Fix For: Impala 4.1.0 > > > To be able to run large queries on a larger executor group that maps to a > different resource group, we need to identify the large queries at > compile time. In the first phase, this identification can use peak memory > estimation to classify large queries. This Jira tracks that > support. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
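A compressed sketch of the compile-time classification idea; the thresholds and names below are invented for illustration, not the shipped defaults.
{code:java}
// Illustrative classification only; the thresholds are made up.
final class QuerySizeClassifier {
  static String classify(long estimatedPeakMemPerHostBytes) {
    final long SMALL = 2L << 30;  // 2 GiB
    final long MEDIUM = 8L << 30; // 8 GiB
    if (estimatedPeakMemPerHostBytes <= SMALL) return "small";
    if (estimatedPeakMemPerHostBytes <= MEDIUM) return "medium";
    return "large"; // route to the largest executor group set
  }

  public static void main(String[] args) {
    System.out.println(classify(512L << 20)); // small
    System.out.println(classify(4L << 30));   // medium
    System.out.println(classify(32L << 30));  // large
  }
}
{code}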
[jira] [Updated] (IMPALA-11189) Concurrent insert ACID tests are broken in local catalog mode
[ https://issues.apache.org/jira/browse/IMPALA-11189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen updated IMPALA-11189: Description: Stress test test_concurrent_inserts (in tests/stress/test_acid_stress.py) can fail repeatedly in local catalog mode. In this case, the concurrent checker query (select * from ) returns duplicated rows such as the one reported below, where row [0,2] is duplicated. The failure can be reproduced quite easily by running the test (i.e., TestConcurrentAcidInserts) first, after commenting out all the tests prior to it in the test file tests/stress/test_acid_stress.py. Setup: 1. Build Impala and clear HMS in case it is in a bad state: $IMPALA_HOME/buildall.sh -format_metastore -notests 2. Start the cluster in local catalog mode: $IMPALA_HOME/bin/start-impala-cluster.py --impalad_args --use_local_catalog=true --catalogd_args --catalog_topic_mode=minimal --catalogd_args --hms_event_polling_interval_s=1 3. Run the modified stress test: $IMPALA_HOME/bin/impala-py.test $IMPALA_TESTS/stress/test_acid_stress.py Error reported: {code:java} 09:11:00 qchen@qifan-10229: Impala.03112022] test_acid_stress rootLoggerLevel = INFO == test session starts === platform linux2 -- Python 2.7.16, pytest-2.9.2, py-1.4.32, pluggy-0.3.1 -- /home/qchen/Impala.03112022/infra/python/env-gcc7.5.0/bin/python cachedir: tests/.cache rootdir: /home/qchen/Impala.03112022/tests, inifile: pytest.ini plugins: xdist-1.17.1, timeout-1.2.1, random-0.2, forked-0.2 timeout: 7200s method: signal collected 2 items tests/stress/test_acid_stress.py::TestConcurrentAcidInserts::test_concurrent_inserts[unique_database0] FAILED tests/stress/test_acid_stress.py::TestFailingAcidInserts::test_failing_inserts[unique_database0] PASSED short test summary info = FAIL tests/stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0] FAILURES __ TestConcurrentAcidInserts.test_concurrent_inserts[unique_database0] ___ tests/stress/test_acid_stress.py:307: in test_concurrent_inserts run_tasks(writers + checkers) tests/stress/stress_util.py:45: in run_tasks pool.map_async(Task.run, tasks).get(timeout_seconds) ../Impala.03082022/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:572: in get raise self._value E AssertionError: wid: 2 E assert [0, 1, 2, 2, 3, 4] == [0, 1, 2, 3, 4] E At index 3 diff: 2 != 3 E Left contains more items, first extra item: 4 E Full diff: E - [0, 1, 2, 2, 3, 4] E ? 
--- E + [0, 1, 2, 3, 4] - Captured stderr setup -- SET client_identifier=stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0]; -- connecting to: localhost:21000 -- connecting to localhost:21050 with impyla -- 2022-03-16 09:20:54,762 INFO MainThread: Closing active operation -- connecting to localhost:28000 with impyla -- 2022-03-16 09:20:54,774 INFO MainThread: Closing active operation SET client_identifier=stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0]; SET sync_ddl=True; -- executing against localhost:21000 DROP DATABASE IF EXISTS `test_concurrent_inserts_8933345c` CASCADE; -- 2022-03-16 09:20:54,808 INFO MainThread: Started query 28457f4c7e77cdec:c6d37319 SET client_identifier=stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0]; SET sync_ddl=True; -- executing against localhost:21000 CREATE DATABASE `test_concurrent_inserts_8933345c`; -- 2022-03-16 09:20:54,877 INFO MainThread: Started query 374bf99aea680523:48d24054 -- 2022-03-16 09:21:01,164 INFO MainThread: Created database "test_concurrent_inserts_8933345c" for test ID "stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0]" -- Captured stderr call -- SET SYNC_DDL=true; -- executing against localhost:21000 drop table if exists test_concurrent_inserts_8933345c.test_concurrent_inserts; -- 2022-03-16 09:21:01,173 INFO MainThread: Started query 20480c2a1d336d35:c2d84edd -- executing against localhost:21000 create table test_concurrent_inserts_8933345c.test_concurrent_inserts (wid int, i int) TBLPROPERTIES ( 'transactional_properties' = 'insert_only', 'transactional' = 'true')
[jira] [Closed] (IMPALA-11190) Test failing insert tests are broken in local catalog mode
[ https://issues.apache.org/jira/browse/IMPALA-11190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen closed IMPALA-11190. --- Resolution: Duplicate This is a duplicate of IMPALA-11191. > Test failing insert tests are broken in local catalog mode > -- > > Key: IMPALA-11190 > URL: https://issues.apache.org/jira/browse/IMPALA-11190 > Project: IMPALA > Issue Type: Bug > Components: Catalog >Reporter: Qifan Chen >Priority: Major > > Stress test test_failing_inserts (in tests/stress/test_acid_stress.py) fails > repeatedly in local catalog mode. The concurrent checker query (select * from > ) can return rows of value -1, which should not happen. > This is reproducible by commenting out test TestFailingAcidInserts in > tests/stress/test_acid_stress.py. > Setup: > 1. Build Impala and clear the HMS in case it is in a bad state: > $IMPALA_HOME/buildall.sh -format_metastore -notests > 2. Start the cluster in local catalog mode: > $IMPALA_HOME/bin/start-impala-cluster.py --impalad_args > --use_local_catalog=true --catalogd_args --catalog_topic_mode=minimal > --catalogd_args --hms_event_polling_interval_s=1 > 3. Run the modified stress test: $IMPALA_HOME/bin/impala-py.test > $IMPALA_TESTS/stress/test_acid_stress.py > Error reported: > Main branch failed on test test_failing_inserts > [11:48:38 qchen@qifan-10229: Impala.03112022] test_acid_stress > rootLoggerLevel = INFO > == test session starts > === > platform linux2 -- Python 2.7.16, pytest-2.9.2, py-1.4.32, pluggy-0.3.1 -- > /home/qchen/Impala.03112022/infra/python/env-gcc7.5.0/bin/python > cachedir: tests/.cache > rootdir: /home/qchen/Impala.03112022/tests, inifile: pytest.ini > plugins: xdist-1.17.1, timeout-1.2.1, random-0.2, forked-0.2 > timeout: 7200s method: signal > collected 4 items > tests/stress/test_acid_stress.py::TestAcidInsertsBasic::test_read_hive_inserts > PASSED > tests/stress/test_acid_stress.py::TestAcidInsertsBasic::test_read_impala_inserts > PASSED > tests/stress/test_acid_stress.py::TestAcidInsertsBasic::test_partitioned_inserts[unique_database0] > PASSED > tests/stress/test_acid_stress.py::TestFailingAcidInserts::test_failing_inserts[unique_database0] > FAILED > short test summary info > = > FAIL > tests/stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0] > FAILURES > > _ > TestFailingAcidInserts.test_failing_inserts[unique_database0] > __ > tests/stress/test_acid_stress.py:387: in test_failing_inserts > self._run_test_failing_inserts(unique_database, is_partitioned) > tests/stress/test_acid_stress.py:376: in _run_test_failing_inserts > run_tasks(writers + checkers) > tests/stress/stress_util.py:45: in run_tasks > pool.map_async(Task.run, tasks).get(timeout_seconds) > ../Impala.03082022/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:572: > in get > raise self._value > E assert 1 == 0 > E+ where 1 = len(['-1']) > E+where ['-1'] = object at 0x7f62144b6890>.data > - Captured stderr setup > -- > SET > client_identifier=stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0]; > -- connecting to: localhost:21000 > -- connecting to localhost:21050 with impyla > -- 2022-03-16 12:03:05,065 INFO MainThread: Closing active operation > -- connecting to localhost:28000 with impyla > -- 2022-03-16 12:03:05,077 INFO MainThread: Closing active operation > SET > client_identifier=stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0]; > SET sync_ddl=True; > -- 
executing against localhost:21000 > DROP DATABASE IF EXISTS `test_failing_inserts_b980fc6` CASCADE; > -- 2022-03-16 12:03:05,084 INFO MainThread: Started query > ee4cd7bba1374e44:5133f3bb > SET > client_identifier=stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0]; > SET sync_ddl=True; > -- executing against localhost:21000 > CREATE DATABASE `test_failing_inserts_b980fc6`; > -- 2022-03-16 12:03:05,179 INFO MainThread: Started query > d949fdad1d1e9d19:430dd1ca > -- 2022-03-16 12:03:11,071 INFO MainThread: Created database > "test_failing_inserts_b980fc6" for test ID >
[jira] [Created] (IMPALA-11191) Test insert failing ACID tests are broken in local catalog mode
Qifan Chen created IMPALA-11191: --- Summary: Test insert failing ACID tests are broken in local catalog mode Key: IMPALA-11191 URL: https://issues.apache.org/jira/browse/IMPALA-11191 Project: IMPALA Issue Type: Bug Reporter: Qifan Chen This test can fail in local catalog mode with rows of value -1 being returned, which should not happen. To reproduce, comment out the test test_concurrent_inserts first, as it can fail as reported in IMPALA-11189.
Setup:
1. Build Impala and clear the HMS in case it is in a bad state: $IMPALA_HOME/buildall.sh -format_metastore -notests
2. Start the cluster in local catalog mode: $IMPALA_HOME/bin/start-impala-cluster.py --impalad_args --use_local_catalog=true --catalogd_args --catalog_topic_mode=minimal --catalogd_args --hms_event_polling_interval_s=1
3. Run the modified stress test: $IMPALA_HOME/bin/impala-py.test $IMPALA_TESTS/stress/test_acid_stress.py
Error reported:
{code:java} Main branch failed on test test_failing_inserts [11:48:38 qchen@qifan-10229: Impala.03112022] test_acid_stress rootLoggerLevel = INFO == test session starts === platform linux2 -- Python 2.7.16, pytest-2.9.2, py-1.4.32, pluggy-0.3.1 -- /home/qchen/Impala.03112022/infra/python/env-gcc7.5.0/bin/python cachedir: tests/.cache rootdir: /home/qchen/Impala.03112022/tests, inifile: pytest.ini plugins: xdist-1.17.1, timeout-1.2.1, random-0.2, forked-0.2 timeout: 7200s method: signal collected 4 items tests/stress/test_acid_stress.py::TestAcidInsertsBasic::test_read_hive_inserts PASSED tests/stress/test_acid_stress.py::TestAcidInsertsBasic::test_read_impala_inserts PASSED tests/stress/test_acid_stress.py::TestAcidInsertsBasic::test_partitioned_inserts[unique_database0] PASSED tests/stress/test_acid_stress.py::TestFailingAcidInserts::test_failing_inserts[unique_database0] FAILED short test summary info = FAIL tests/stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0] FAILURES _ TestFailingAcidInserts.test_failing_inserts[unique_database0] __ tests/stress/test_acid_stress.py:387: in test_failing_inserts self._run_test_failing_inserts(unique_database, is_partitioned) tests/stress/test_acid_stress.py:376: in _run_test_failing_inserts run_tasks(writers + checkers) tests/stress/stress_util.py:45: in run_tasks pool.map_async(Task.run, tasks).get(timeout_seconds) ../Impala.03082022/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:572: in get raise self._value E assert 1 == 0 E+ where 1 = len(['-1']) E+where ['-1'] = .data - Captured stderr setup -- SET client_identifier=stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0]; -- connecting to: localhost:21000 -- connecting to localhost:21050 with impyla -- 2022-03-16 12:03:05,065 INFO MainThread: Closing active operation -- connecting to localhost:28000 with impyla -- 2022-03-16 12:03:05,077 INFO MainThread: Closing active operation SET client_identifier=stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0]; SET sync_ddl=True; -- executing against localhost:21000 DROP DATABASE IF EXISTS `test_failing_inserts_b980fc6` CASCADE; -- 2022-03-16 12:03:05,084 INFO MainThread: Started query ee4cd7bba1374e44:5133f3bb SET client_identifier=stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0]; SET sync_ddl=True; -- executing against localhost:21000 CREATE DATABASE `test_failing_inserts_b980fc6`; -- 2022-03-16 12:03:05,179 INFO MainThread: Started query 
d949fdad1d1e9d19:430dd1ca -- 2022-03-16 12:03:11,071 INFO MainThread: Created database "test_failing_inserts_b980fc6" for test ID "stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0]" -- Captured stderr call -- SET SYNC_DDL=true; -- executing against localhost:21000 drop table if exists test_failing_inserts_b980fc6.test_inserts_fail; -- 2022-03-16 12:03:11,073 INFO MainThread: Started query 1742bdbc8e07861b:6a973ebe -- executing against localhost:21000 create table test_failing_inserts_b980fc6.test_inserts_fail (i int) TBLPROPERTIES ( 'transactional_properties' = 'insert_only',
[jira] [Created] (IMPALA-11190) Test failing insert tests are broken in local catalog mode
Qifan Chen created IMPALA-11190: --- Summary: Test failing insert tests are broken in local catalog mode Key: IMPALA-11190 URL: https://issues.apache.org/jira/browse/IMPALA-11190 Project: IMPALA Issue Type: Bug Components: Catalog Reporter: Qifan Chen Stress test test_failing_inserts (in tests/stress/test_acid_stress.py) fails repeatedly in local catalog mode. The concurrent checker query (select * from ) can return rows of value -1, which should not happen. This is reproducible by commenting out test TestFailingAcidInserts in tests/stress/test_acid_stress.py.
Setup:
1. Build Impala and clear the HMS in case it is in a bad state: $IMPALA_HOME/buildall.sh -format_metastore -notests
2. Start the cluster in local catalog mode: $IMPALA_HOME/bin/start-impala-cluster.py --impalad_args --use_local_catalog=true --catalogd_args --catalog_topic_mode=minimal --catalogd_args --hms_event_polling_interval_s=1
3. Run the modified stress test: $IMPALA_HOME/bin/impala-py.test $IMPALA_TESTS/stress/test_acid_stress.py
Error reported:
Main branch failed on test test_failing_inserts [11:48:38 qchen@qifan-10229: Impala.03112022] test_acid_stress rootLoggerLevel = INFO == test session starts === platform linux2 -- Python 2.7.16, pytest-2.9.2, py-1.4.32, pluggy-0.3.1 -- /home/qchen/Impala.03112022/infra/python/env-gcc7.5.0/bin/python cachedir: tests/.cache rootdir: /home/qchen/Impala.03112022/tests, inifile: pytest.ini plugins: xdist-1.17.1, timeout-1.2.1, random-0.2, forked-0.2 timeout: 7200s method: signal collected 4 items tests/stress/test_acid_stress.py::TestAcidInsertsBasic::test_read_hive_inserts PASSED tests/stress/test_acid_stress.py::TestAcidInsertsBasic::test_read_impala_inserts PASSED tests/stress/test_acid_stress.py::TestAcidInsertsBasic::test_partitioned_inserts[unique_database0] PASSED tests/stress/test_acid_stress.py::TestFailingAcidInserts::test_failing_inserts[unique_database0] FAILED short test summary info = FAIL tests/stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0] FAILURES _ TestFailingAcidInserts.test_failing_inserts[unique_database0] __ tests/stress/test_acid_stress.py:387: in test_failing_inserts self._run_test_failing_inserts(unique_database, is_partitioned) tests/stress/test_acid_stress.py:376: in _run_test_failing_inserts run_tasks(writers + checkers) tests/stress/stress_util.py:45: in run_tasks pool.map_async(Task.run, tasks).get(timeout_seconds) ../Impala.03082022/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:572: in get raise self._value E assert 1 == 0 E+ where 1 = len(['-1']) E+where ['-1'] = .data - Captured stderr setup -- SET client_identifier=stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0]; -- connecting to: localhost:21000 -- connecting to localhost:21050 with impyla -- 2022-03-16 12:03:05,065 INFO MainThread: Closing active operation -- connecting to localhost:28000 with impyla -- 2022-03-16 12:03:05,077 INFO MainThread: Closing active operation SET client_identifier=stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0]; SET sync_ddl=True; -- executing against localhost:21000 DROP DATABASE IF EXISTS `test_failing_inserts_b980fc6` CASCADE; -- 2022-03-16 12:03:05,084 INFO MainThread: Started query ee4cd7bba1374e44:5133f3bb SET client_identifier=stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0]; SET sync_ddl=True; -- executing against localhost:21000 CREATE DATABASE 
`test_failing_inserts_b980fc6`; -- 2022-03-16 12:03:05,179 INFO MainThread: Started query d949fdad1d1e9d19:430dd1ca -- 2022-03-16 12:03:11,071 INFO MainThread: Created database "test_failing_inserts_b980fc6" for test ID "stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0]" -- Captured stderr call -- SET SYNC_DDL=true; -- executing against localhost:21000 drop table if exists test_failing_inserts_b980fc6.test_inserts_fail; -- 2022-03-16 12:03:11,073 INFO MainThread: Started query 1742bdbc8e07861b:6a973ebe -- executing against localhost:21000 create table
[jira] [Created] (IMPALA-11189) Concurrent insert ACID tests are broken in local catalog mode
Qifan Chen created IMPALA-11189: --- Summary: Concurrent insert ACID tests are broken in local catalog mode Key: IMPALA-11189 URL: https://issues.apache.org/jira/browse/IMPALA-11189 Project: IMPALA Issue Type: Bug Components: Catalog Reporter: Qifan Chen Stress test test_concurrent_inserts (in tests/stress/test_acid_stress.py) fails repeatedly in local catalog mode. The concurrent checker query (select * from ) can return duplicated rows, as reported below, where row [0,2] is duplicated. This can be reproduced quite easily by running the test (i.e., TestConcurrentAcidInserts) first, after commenting out all the tests prior to it in the test file tests/stress/test_acid_stress.py.
Setup:
1. Build Impala and clear the HMS in case it is in a bad state: $IMPALA_HOME/buildall.sh -format_metastore -notests
2. Start the cluster in local catalog mode: $IMPALA_HOME/bin/start-impala-cluster.py --impalad_args --use_local_catalog=true --catalogd_args --catalog_topic_mode=minimal --catalogd_args --hms_event_polling_interval_s=1
3. Run the modified stress test: $IMPALA_HOME/bin/impala-py.test $IMPALA_TESTS/stress/test_acid_stress.py
Error reported:
{code:java} 09:11:00 qchen@qifan-10229: Impala.03112022] test_acid_stress rootLoggerLevel = INFO == test session starts === platform linux2 -- Python 2.7.16, pytest-2.9.2, py-1.4.32, pluggy-0.3.1 -- /home/qchen/Impala.03112022/infra/python/env-gcc7.5.0/bin/python cachedir: tests/.cache rootdir: /home/qchen/Impala.03112022/tests, inifile: pytest.ini plugins: xdist-1.17.1, timeout-1.2.1, random-0.2, forked-0.2 timeout: 7200s method: signal collected 2 items tests/stress/test_acid_stress.py::TestConcurrentAcidInserts::test_concurrent_inserts[unique_database0] FAILED tests/stress/test_acid_stress.py::TestFailingAcidInserts::test_failing_inserts[unique_database0] PASSED short test summary info = FAIL tests/stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0] FAILURES __ TestConcurrentAcidInserts.test_concurrent_inserts[unique_database0] ___ tests/stress/test_acid_stress.py:307: in test_concurrent_inserts run_tasks(writers + checkers) tests/stress/stress_util.py:45: in run_tasks pool.map_async(Task.run, tasks).get(timeout_seconds) ../Impala.03082022/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:572: in get raise self._value E AssertionError: wid: 2 E assert [0, 1, 2, 2, 3, 4] == [0, 1, 2, 3, 4] E At index 3 diff: 2 != 3 E Left contains more items, first extra item: 4 E Full diff: E - [0, 1, 2, 2, 3, 4] E ? 
--- E + [0, 1, 2, 3, 4] - Captured stderr setup -- SET client_identifier=stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0]; -- connecting to: localhost:21000 -- connecting to localhost:21050 with impyla -- 2022-03-16 09:20:54,762 INFO MainThread: Closing active operation -- connecting to localhost:28000 with impyla -- 2022-03-16 09:20:54,774 INFO MainThread: Closing active operation SET client_identifier=stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0]; SET sync_ddl=True; -- executing against localhost:21000 DROP DATABASE IF EXISTS `test_concurrent_inserts_8933345c` CASCADE; -- 2022-03-16 09:20:54,808 INFO MainThread: Started query 28457f4c7e77cdec:c6d37319 SET client_identifier=stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0]; SET sync_ddl=True; -- executing against localhost:21000 CREATE DATABASE `test_concurrent_inserts_8933345c`; -- 2022-03-16 09:20:54,877 INFO MainThread: Started query 374bf99aea680523:48d24054 -- 2022-03-16 09:21:01,164 INFO MainThread: Created database "test_concurrent_inserts_8933345c" for test ID "stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0]" -- Captured stderr call -- SET SYNC_DDL=true; -- executing against localhost:21000 drop table if exists test_concurrent_inserts_8933345c.test_concurrent_inserts; -- 2022-03-16 09:21:01,173 INFO MainThread: Started query 20480c2a1d336d35:c2d84edd -- executing against localhost:21000 create table
[jira] [Created] (IMPALA-11163) To scan small dimensional tables, the number of nodes selected by FE can be less
Qifan Chen created IMPALA-11163: --- Summary: To scan small dimensional tables, the number of nodes selected by FE can be less Key: IMPALA-11163 URL: https://issues.apache.org/jira/browse/IMPALA-11163 Project: IMPALA Issue Type: Improvement Reporter: Qifan Chen In Impala, the FE determines the number of exec nodes to use for a scan based on the number of local/remote nodes hosting the table's data blocks. For example, for a dimension table, assume 3 local nodes and 17 remote nodes; the number of exec nodes for the scan is then 20, and the final value is min(20, number of exec nodes in the cluster). In the case of a partitioned join(f, d), where f is the fact table and d is the dimension table, the number of network connections opened from the join to table d could be reduced (say, 2 instead of 20), so the system could handle more concurrent queries.
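A worked sketch of the node-count arithmetic described above (the function and parameter names are illustrative, not the actual FE identifiers):
{code:python}
def scan_exec_node_count(num_local_hosts, num_remote_hosts, cluster_exec_nodes):
    # The candidate count is the number of local plus remote hosts holding
    # the table's data blocks, capped by the executors in the cluster.
    return min(num_local_hosts + num_remote_hosts, cluster_exec_nodes)


# Example from the description: 3 local + 17 remote hosts -> 20 scan nodes,
# assuming the cluster has at least 20 executors.
assert scan_exec_node_count(3, 17, 20) == 20
{code}
Lowering this count for small dimension tables would shrink the fan-in of the join's exchanges, which is the improvement being proposed.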
[jira] [Resolved] (IMPALA-10754) test_overlap_min_max_filters_on_sorted_columns failed during GVO
[ https://issues.apache.org/jira/browse/IMPALA-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen resolved IMPALA-10754. - Resolution: Fixed > test_overlap_min_max_filters_on_sorted_columns failed during GVO > > > Key: IMPALA-10754 > URL: https://issues.apache.org/jira/browse/IMPALA-10754 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Zoltán Borók-Nagy >Assignee: Qifan Chen >Priority: Major > Labels: broken-build > Fix For: Impala 4.1.0 > > > test_overlap_min_max_filters_on_sorted_columns failed in the following build: > https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/4338/testReport/ > *Stack trace:* > {noformat} > query_test/test_runtime_filters.py:296: in > test_overlap_min_max_filters_on_sorted_columns > test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS': str(WAIT_TIME_MS)}) > common/impala_test_suite.py:734: in run_test_case > update_section=pytest.config.option.update_results) > common/test_result_verifier.py:653: in verify_runtime_profile > % (function, field, expected_value, actual_value, op, actual)) > E AssertionError: Aggregation of SUM over NumRuntimeFilteredPages did not > match expected results. > E EXPECTED VALUE: > E 58 > E > E > E ACTUAL VALUE: > E 59 > {noformat}
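The failure is an exact-match check on a counter summed across fragment instances; a minimal sketch of that style of verification (names are illustrative, not the actual test_result_verifier.py code):
{code:python}
def verify_sum_counter(per_instance_values, expected):
    # Sum a runtime-profile counter (here NumRuntimeFilteredPages) across all
    # fragment instances and compare against a hard-coded expectation. An
    # exact comparison is brittle when filter arrival timing shifts by a page.
    actual = sum(per_instance_values)
    assert actual == expected, (
        "Aggregation of SUM over NumRuntimeFilteredPages did not match "
        "expected results. EXPECTED: %d ACTUAL: %d" % (expected, actual))
{code}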
[jira] [Resolved] (IMPALA-11047) Preconditions.checkNotNull(statsTuple_) fail in HdfsScanNode.java if PARQUET_READ_STATISTICS=0
[ https://issues.apache.org/jira/browse/IMPALA-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen resolved IMPALA-11047. - Target Version: Impala 4.0.0 Resolution: Fixed > Preconditions.checkNotNull(statsTuple_) fail in HdfsScanNode.java if > PARQUET_READ_STATISTICS=0 > -- > > Key: IMPALA-11047 > URL: https://issues.apache.org/jira/browse/IMPALA-11047 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 4.0.0 >Reporter: Riza Suminto >Assignee: Qifan Chen >Priority: Major > > There is a conflict between HdfsScanNode.java and > RuntimeFilterGenerator.java when initializing the overlap predicate. > In HdfsScanNode.java, computeStatsTupleAndConjuncts, which initializes statsTuple_, > will not be called because PARQUET_READ_STATISTICS=0. > [https://github.com/apache/impala/blob/9d61bc4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java#L409] > > On the other hand, in RuntimeFilterGenerator.java, disable_overlap_filter is > set to false without considering the value of PARQUET_READ_STATISTICS. > [https://github.com/apache/impala/blob/9d61bc4/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java#L915]
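A minimal sketch of the described mismatch, in Python pseudologic (the real code is in the Java FE; every name below is illustrative):
{code:python}
def plan_overlap_predicate(parquet_read_statistics, compute_stats_tuple):
    # HdfsScanNode side: the stats tuple is only built when Parquet
    # statistics reading is enabled.
    stats_tuple = compute_stats_tuple() if parquet_read_statistics != 0 else None

    # RuntimeFilterGenerator side: the overlap filter is enabled without
    # consulting PARQUET_READ_STATISTICS ...
    disable_overlap_filter = False  # should also be True when stats are off

    # ... so the later not-null check trips when PARQUET_READ_STATISTICS=0:
    if not disable_overlap_filter:
        assert stats_tuple is not None, "Preconditions.checkNotNull(statsTuple_)"


# plan_overlap_predicate(0, lambda: object()) raises the assertion, mirroring
# the Preconditions failure in the summary.
{code}
Gating disable_overlap_filter on the query option is the obvious direction for a fix.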
[jira] [Resolved] (IMPALA-11132) Front-end test PlannerTest.testResourceRequirements can fail
[ https://issues.apache.org/jira/browse/IMPALA-11132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen resolved IMPALA-11132. - Target Version: Impala 4.0.0 Resolution: Fixed > Front-end test PlannerTest.testResourceRequirements can fail > > > Key: IMPALA-11132 > URL: https://issues.apache.org/jira/browse/IMPALA-11132 > Project: IMPALA > Issue Type: Test >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Major > > The test miscalculates per-host memory requirements, apparently due to an > incorrect HBase cardinality estimate: > {code:java} > Section DISTRIBUTEDPLAN of query: > select * from functional_hbase.alltypessmall > Actual does not match expected result: > Max Per-Host Resource Reservation: Memory=4.00MB Threads=2 > Per-Host Resource Estimates: Memory=10MB > Codegen disabled by planner > Analyzed query: SELECT * FROM functional_hbase.alltypessmall > F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 > | Per-Host Resources: mem-estimate=5.08MB mem-reservation=4.00MB > thread-reservation=1 > ^^ > PLAN-ROOT SINK > | output exprs: functional_hbase.alltypessmall.id, > functional_hbase.alltypessmall.bigint_col, > functional_hbase.alltypessmall.bool_col, > functional_hbase.alltypessmall.date_string_col, > functional_hbase.alltypessmall.double_col, > functional_hbase.alltypessmall.float_col, > functional_hbase.alltypessmall.int_col, functional_hbase.alltypessmall.month, > functional_hbase.alltypessmall.smallint_col, > functional_hbase.alltypessmall.string_col, > functional_hbase.alltypessmall.timestamp_col, > functional_hbase.alltypessmall.tinyint_col, > functional_hbase.alltypessmall.year > | mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB > thread-reservation=0 > | > 01:EXCHANGE [UNPARTITIONED] > | mem-estimate=1.08MB mem-reservation=0B thread-reservation=0 > | tuple-ids=0 row-size=89B cardinality=28.57K > | in pipelines: 00(GETNEXT) > | > F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3 > Per-Host Resources: mem-estimate=4.00KB mem-reservation=0B > thread-reservation=1 > 00:SCAN HBASE [functional_hbase.alltypessmall] >stored statistics: > table: rows=100 > columns: all >mem-estimate=4.00KB mem-reservation=0B thread-reservation=0 >tuple-ids=0 row-size=89B cardinality=28.57K >in pipelines: 00(GETNEXT) > Expected: > Max Per-Host Resource Reservation: Memory=4.00MB Threads=2 > Per-Host Resource Estimates: Memory=10MB > Codegen disabled by planner > Analyzed query: SELECT * FROM functional_hbase.alltypessmall > F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 > | Per-Host Resources: mem-estimate=4.02MB mem-reservation=4.00MB > thread-reservation=1 > PLAN-ROOT SINK > | output exprs: functional_hbase.alltypessmall.id, > functional_hbase.alltypessmall.bigint_col, > functional_hbase.alltypessmall.bool_col, > functional_hbase.alltypessmall.date_string_col, > functional_hbase.alltypessmall.double_col, > functional_hbase.alltypessmall.float_col, > functional_hbase.alltypessmall.int_col, functional_hbase.alltypessmall.month, > functional_hbase.alltypessmall.smallint_col, > functional_hbase.alltypessmall.string_col, > functional_hbase.alltypessmall.timestamp_col, > functional_hbase.alltypessmall.tinyint_col, > functional_hbase.alltypessmall.year > | mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB > thread-reservation=0 > | > 01:EXCHANGE [UNPARTITIONED] > | mem-estimate=16.00KB mem-reservation=0B thread-reservation=0 > | tuple-ids=0 row-size=89B cardinality=50 > | in pipelines: 00(GETNEXT) > | > F00:PLAN FRAGMENT [RANDOM] hosts=3 
instances=3 > Per-Host Resources: mem-estimate=4.00KB mem-reservation=0B > thread-reservation=1 > 00:SCAN HBASE [functional_hbase.alltypessmall] >stored statistics: > table: rows=100 > columns: all >mem-estimate=4.00KB mem-reservation=0B thread-reservation=0 >tuple-ids=0 row-size=89B cardinality=50 >in pipelines: 00(GETNEXT) > {code}
[jira] [Commented] (IMPALA-11132) Front-end test PlannerTest.testResourceRequirements can fail
[ https://issues.apache.org/jira/browse/IMPALA-11132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17496958#comment-17496958 ] Qifan Chen commented on IMPALA-11132: - The estimated # of rows for an HBase table scan is not capped by the # of rows from HMS when that statistic is available. > Front-end test PlannerTest.testResourceRequirements can fail > > > Key: IMPALA-11132 > URL: https://issues.apache.org/jira/browse/IMPALA-11132 > Project: IMPALA > Issue Type: Test >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Major > > The test miscalculates per-host memory requirements, apparently due to an > incorrect HBase cardinality estimate: > {code:java} > Section DISTRIBUTEDPLAN of query: > select * from functional_hbase.alltypessmall > Actual does not match expected result: > Max Per-Host Resource Reservation: Memory=4.00MB Threads=2 > Per-Host Resource Estimates: Memory=10MB > Codegen disabled by planner > Analyzed query: SELECT * FROM functional_hbase.alltypessmall > F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 > | Per-Host Resources: mem-estimate=5.08MB mem-reservation=4.00MB > thread-reservation=1 > ^^ > PLAN-ROOT SINK > | output exprs: functional_hbase.alltypessmall.id, > functional_hbase.alltypessmall.bigint_col, > functional_hbase.alltypessmall.bool_col, > functional_hbase.alltypessmall.date_string_col, > functional_hbase.alltypessmall.double_col, > functional_hbase.alltypessmall.float_col, > functional_hbase.alltypessmall.int_col, functional_hbase.alltypessmall.month, > functional_hbase.alltypessmall.smallint_col, > functional_hbase.alltypessmall.string_col, > functional_hbase.alltypessmall.timestamp_col, > functional_hbase.alltypessmall.tinyint_col, > functional_hbase.alltypessmall.year > | mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB > thread-reservation=0 > | > 01:EXCHANGE [UNPARTITIONED] > | mem-estimate=1.08MB mem-reservation=0B thread-reservation=0 > | tuple-ids=0 row-size=89B cardinality=28.57K > | in pipelines: 00(GETNEXT) > | > F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3 > Per-Host Resources: mem-estimate=4.00KB mem-reservation=0B > thread-reservation=1 > 00:SCAN HBASE [functional_hbase.alltypessmall] >stored statistics: > table: rows=100 > columns: all >mem-estimate=4.00KB mem-reservation=0B thread-reservation=0 >tuple-ids=0 row-size=89B cardinality=28.57K >in pipelines: 00(GETNEXT) > Expected: > Max Per-Host Resource Reservation: Memory=4.00MB Threads=2 > Per-Host Resource Estimates: Memory=10MB > Codegen disabled by planner > Analyzed query: SELECT * FROM functional_hbase.alltypessmall > F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 > | Per-Host Resources: mem-estimate=4.02MB mem-reservation=4.00MB > thread-reservation=1 > PLAN-ROOT SINK > | output exprs: functional_hbase.alltypessmall.id, > functional_hbase.alltypessmall.bigint_col, > functional_hbase.alltypessmall.bool_col, > functional_hbase.alltypessmall.date_string_col, > functional_hbase.alltypessmall.double_col, > functional_hbase.alltypessmall.float_col, > functional_hbase.alltypessmall.int_col, functional_hbase.alltypessmall.month, > functional_hbase.alltypessmall.smallint_col, > functional_hbase.alltypessmall.string_col, > functional_hbase.alltypessmall.timestamp_col, > functional_hbase.alltypessmall.tinyint_col, > functional_hbase.alltypessmall.year > | mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB > thread-reservation=0 > | > 01:EXCHANGE [UNPARTITIONED] > | mem-estimate=16.00KB mem-reservation=0B thread-reservation=0 > | tuple-ids=0 
row-size=89B cardinality=50 > | in pipelines: 00(GETNEXT) > | > F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3 > Per-Host Resources: mem-estimate=4.00KB mem-reservation=0B > thread-reservation=1 > 00:SCAN HBASE [functional_hbase.alltypessmall] >stored statistics: > table: rows=100 > columns: all >mem-estimate=4.00KB mem-reservation=0B thread-reservation=0 >tuple-ids=0 row-size=89B cardinality=50 >in pipelines: 00(GETNEXT) > {code}
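A worked sketch of the cap the comment calls for (hypothetical names; the real logic lives in the FE's HBase scan planning):
{code:python}
def hbase_scan_cardinality(sampled_estimate, hms_row_count=None):
    # Cap the HBase region-sampling estimate by the HMS row count whenever
    # table statistics are available.
    if hms_row_count is not None:
        return min(sampled_estimate, hms_row_count)
    return sampled_estimate


# From the plans above: sampling estimated ~28.57K rows while HMS statistics
# report rows=100, so capping keeps the estimate consistent with the stored
# statistics (and with the expected plan's much smaller cardinality).
assert hbase_scan_cardinality(28570, 100) == 100
{code}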
[jira] [Created] (IMPALA-11146) Specific string: "Query aborted:Debug Action: FIS_FAIL_HDFS_TABLE_SINK_FLUSH_FINAL:FAIL@1.0" is missing
Qifan Chen created IMPALA-11146: --- Summary: Specific string: "Query aborted:Debug Action: FIS_FAIL_HDFS_TABLE_SINK_FLUSH_FINAL:FAIL@1.0" is missing Key: IMPALA-11146 URL: https://issues.apache.org/jira/browse/IMPALA-11146 Project: IMPALA Issue Type: Test Reporter: Qifan Chen In some of the tests, the following string is missing: {code:java} Query aborted:Debug Action: FIS_FAIL_HDFS_TABLE_SINK_FLUSH_FINAL:FAIL@1.0 {code} This is seen in quite a number of test flavors: exhaustive-release, core-asan, core-ubsan and core-s3. {code:java} Stacktrace query_test/test_insert.py:168: in test_acid_insert_fail multiple_impalad=vector.get_value('exec_option')['sync_ddl'] == 1) common/impala_test_suite.py:732: in run_test_case self.__verify_exceptions(test_section['CATCH'], str(e), use_db) common/impala_test_suite.py:537: in __verify_exceptions (expected_str, actual_str) E AssertionError: Unexpected exception string. Expected: Query aborted:Debug Action: FIS_FAIL_HDFS_TABLE_SINK_FLUSH_FINAL:FAIL@1.0 E Not found in actual: ImpalaBeeswaxException: INNER EXCEPTION: MESSAGE: ParseException: Syntax error in line 1:...DFS_TABLE_SINK_FLUSH_FINAL:FAIL@1.0 ^Encountered: :ExpectedCAUSED BY: Exception: Syntax error Standard Error SET client_identifier=query_test/test_insert.py::TestInsertQueries::()::test_acid_insert_fail[compression_codec:none|protocol:beeswax|exec_option:{'sync_ddl':0;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_singl; SET sync_ddl=True; -- executing against localhost:21000 DROP DATABASE IF EXISTS `test_acid_insert_fail_5388d22e` CASCADE; -- 2022-02-15 08:57:34,223 INFO MainThread: Started query c7422e292e8aeaf8:e18e9fcb SET client_identifier=query_test/test_insert.py::TestInsertQueries::()::test_acid_insert_fail[compression_codec:none|protocol:beeswax|exec_option:{'sync_ddl':0;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_singl; SET sync_ddl=True; -- executing against localhost:21000 CREATE DATABASE `test_acid_insert_fail_5388d22e`; -- 2022-02-15 08:57:40,220 INFO MainThread: Started query 6f4315ed0ab15d9e:a0d14b2d -- 2022-02-15 08:57:46,231 INFO MainThread: Created database "test_acid_insert_fail_5388d22e" for test ID "query_test/test_insert.py::TestInsertQueries::()::test_acid_insert_fail[compression_codec: none | protocol: beeswax | exec_option: {'sync_ddl': 0, 'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: text/none-unique_database0]" SET client_identifier=query_test/test_insert.py::TestInsertQueries::()::test_acid_insert_fail[compression_codec:none|protocol:beeswax|exec_option:{'sync_ddl':0;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_singl; -- executing against localhost:21000 use test_acid_insert_fail_5388d22e; -- 2022-02-15 08:57:46,235 INFO MainThread: Started query 7f4f6a493e47671e:9aacd50e SET client_identifier=query_test/test_insert.py::TestInsertQueries::()::test_acid_insert_fail[compression_codec:none|protocol:beeswax|exec_option:{'sync_ddl':0;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_singl; SET sync_ddl=0; SET batch_size=0; SET num_nodes=0; SET disable_codegen_rows_threshold=0; SET disable_codegen=True; SET abort_on_error=1; SET exec_single_node_rows_threshold=0; -- 2022-02-15 08:57:46,237 INFO 
MainThread: Loading query test file: /data/jenkins/workspace/impala-cdw-master-exhaustive-release/repos/Impala/testdata/workloads/functional-query/queries/QueryTest/acid-insert-fail.test -- executing against localhost:21000 create table insertonly_acid (i int) tblproperties('transactional'='true', 'transactional_properties'='insert_only'); -- 2022-02-15 08:57:48,587 INFO MainThread: Started query 6445bb56ec7a5801:df8df55a -- executing against localhost:21000 insert into insertonly_acid values (1), (2); -- 2022-02-15 08:57:54,252 INFO MainThread: Started query 51451860e01e6ff3:c29b68c2 -- executing against localhost:21000 select * from insertonly_acid; -- 2022-02-15 08:57:54,357 INFO MainThread: Started query 5b42253f96ee1ce7:71267b47 -- executing against localhost:21000 set DEBUG_ACTION=FIS_FAIL_HDFS_TABLE_SINK_FLUSH_FINAL:FAIL@1.0; -- executing against localhost:21000 SET DEBUG_ACTION=""; -- 2022-02-15 08:57:54,420 INFO MainThread: Started query 044c2e00732016c8:5cf949a3 {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To
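The "Not found in actual" text suggests what went wrong: the harness issued set DEBUG_ACTION=FIS_FAIL_HDFS_TABLE_SINK_FLUSH_FINAL:FAIL@1.0; unquoted, and the SQL parser appears to have stopped at the ':' inside the value ("Encountered: :"), so the debug action was never installed and the expected abort string never appeared. A minimal sketch of the difference, assuming a local impalad reachable over HS2 via the impyla client (the quoting, not the client, is the point, and this is not a confirmed fix):
{code:python}
# Hypothetical sketch: an unquoted DEBUG_ACTION value containing ':' and '@'
# can fail to parse as a server-side SET, while the quoted form is a plain
# string literal and parses.
from impala.dbapi import connect

conn = connect(host='localhost', port=21050)  # assumed local dev cluster
cur = conn.cursor()

action = 'FIS_FAIL_HDFS_TABLE_SINK_FLUSH_FINAL:FAIL@1.0'
try:
    # Unquoted, as in the log above: the ':' breaks SQL parsing.
    cur.execute('SET DEBUG_ACTION=%s' % action)
except Exception as e:
    print('unquoted form rejected:', e)

# Quoted: the whole value is one string literal, so the statement parses.
cur.execute("SET DEBUG_ACTION='%s'" % action)
{code}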
[jira] [Assigned] (IMPALA-11132) Front-end test PlannerTest.testResourceRequirements can fail
[ https://issues.apache.org/jira/browse/IMPALA-11132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen reassigned IMPALA-11132: --- Assignee: Qifan Chen > Front-end test PlannerTest.testResourceRequirements can fail > > > Key: IMPALA-11132 > URL: https://issues.apache.org/jira/browse/IMPALA-11132 > Project: IMPALA > Issue Type: Test >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Major > > The test miscalculates per-host memory requirements, apparently due to an > incorrect HBase cardinality estimate: > {code:java} > Section DISTRIBUTEDPLAN of query: > select * from functional_hbase.alltypessmall > Actual does not match expected result: > Max Per-Host Resource Reservation: Memory=4.00MB Threads=2 > Per-Host Resource Estimates: Memory=10MB > Codegen disabled by planner > Analyzed query: SELECT * FROM functional_hbase.alltypessmall > F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 > | Per-Host Resources: mem-estimate=5.08MB mem-reservation=4.00MB > thread-reservation=1 > ^^ > PLAN-ROOT SINK > | output exprs: functional_hbase.alltypessmall.id, > functional_hbase.alltypessmall.bigint_col, > functional_hbase.alltypessmall.bool_col, > functional_hbase.alltypessmall.date_string_col, > functional_hbase.alltypessmall.double_col, > functional_hbase.alltypessmall.float_col, > functional_hbase.alltypessmall.int_col, functional_hbase.alltypessmall.month, > functional_hbase.alltypessmall.smallint_col, > functional_hbase.alltypessmall.string_col, > functional_hbase.alltypessmall.timestamp_col, > functional_hbase.alltypessmall.tinyint_col, > functional_hbase.alltypessmall.year > | mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB > thread-reservation=0 > | > 01:EXCHANGE [UNPARTITIONED] > | mem-estimate=1.08MB mem-reservation=0B thread-reservation=0 > | tuple-ids=0 row-size=89B cardinality=28.57K > | in pipelines: 00(GETNEXT) > | > F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3 > Per-Host Resources: mem-estimate=4.00KB mem-reservation=0B > thread-reservation=1 > 00:SCAN HBASE [functional_hbase.alltypessmall] >stored statistics: > table: rows=100 > columns: all >mem-estimate=4.00KB mem-reservation=0B thread-reservation=0 >tuple-ids=0 row-size=89B cardinality=28.57K >in pipelines: 00(GETNEXT) > Expected: > Max Per-Host Resource Reservation: Memory=4.00MB Threads=2 > Per-Host Resource Estimates: Memory=10MB > Codegen disabled by planner > Analyzed query: SELECT * FROM functional_hbase.alltypessmall > F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 > | Per-Host Resources: mem-estimate=4.02MB mem-reservation=4.00MB > thread-reservation=1 > PLAN-ROOT SINK > | output exprs: functional_hbase.alltypessmall.id, > functional_hbase.alltypessmall.bigint_col, > functional_hbase.alltypessmall.bool_col, > functional_hbase.alltypessmall.date_string_col, > functional_hbase.alltypessmall.double_col, > functional_hbase.alltypessmall.float_col, > functional_hbase.alltypessmall.int_col, functional_hbase.alltypessmall.month, > functional_hbase.alltypessmall.smallint_col, > functional_hbase.alltypessmall.string_col, > functional_hbase.alltypessmall.timestamp_col, > functional_hbase.alltypessmall.tinyint_col, > functional_hbase.alltypessmall.year > | mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB > thread-reservation=0 > | > 01:EXCHANGE [UNPARTITIONED] > | mem-estimate=16.00KB mem-reservation=0B thread-reservation=0 > | tuple-ids=0 row-size=89B cardinality=50 > | in pipelines: 00(GETNEXT) > | > F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3 > Per-Host 
Resources: mem-estimate=4.00KB mem-reservation=0B > thread-reservation=1 > 00:SCAN HBASE [functional_hbase.alltypessmall] >stored statistics: > table: rows=100 > columns: all >mem-estimate=4.00KB mem-reservation=0B thread-reservation=0 >tuple-ids=0 row-size=89B cardinality=50 >in pipelines: 00(GETNEXT) > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-11132) Front-end test PlannerTest.testResourceRequirements can fail
Qifan Chen created IMPALA-11132:
---
Summary: Front-end test PlannerTest.testResourceRequirements can fail
Key: IMPALA-11132
URL: https://issues.apache.org/jira/browse/IMPALA-11132
Project: IMPALA
Issue Type: Test
Reporter: Qifan Chen

The test miscalculates per-host memory requirements, apparently due to an incorrect HBase cardinality estimate:
{code:java}
Section DISTRIBUTEDPLAN of query:
select * from functional_hbase.alltypessmall

Actual does not match expected result:
Max Per-Host Resource Reservation: Memory=4.00MB Threads=2
Per-Host Resource Estimates: Memory=10MB
Codegen disabled by planner
Analyzed query: SELECT * FROM functional_hbase.alltypessmall

F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=5.08MB mem-reservation=4.00MB thread-reservation=1
   ^^
PLAN-ROOT SINK
|  output exprs: functional_hbase.alltypessmall.id, functional_hbase.alltypessmall.bigint_col, functional_hbase.alltypessmall.bool_col, functional_hbase.alltypessmall.date_string_col, functional_hbase.alltypessmall.double_col, functional_hbase.alltypessmall.float_col, functional_hbase.alltypessmall.int_col, functional_hbase.alltypessmall.month, functional_hbase.alltypessmall.smallint_col, functional_hbase.alltypessmall.string_col, functional_hbase.alltypessmall.timestamp_col, functional_hbase.alltypessmall.tinyint_col, functional_hbase.alltypessmall.year
|  mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
01:EXCHANGE [UNPARTITIONED]
|  mem-estimate=1.08MB mem-reservation=0B thread-reservation=0
|  tuple-ids=0 row-size=89B cardinality=28.57K
|  in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
Per-Host Resources: mem-estimate=4.00KB mem-reservation=0B thread-reservation=1
00:SCAN HBASE [functional_hbase.alltypessmall]
   stored statistics:
     table: rows=100
     columns: all
   mem-estimate=4.00KB mem-reservation=0B thread-reservation=0
   tuple-ids=0 row-size=89B cardinality=28.57K
   in pipelines: 00(GETNEXT)

Expected:
Max Per-Host Resource Reservation: Memory=4.00MB Threads=2
Per-Host Resource Estimates: Memory=10MB
Codegen disabled by planner
Analyzed query: SELECT * FROM functional_hbase.alltypessmall

F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=4.02MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
|  output exprs: functional_hbase.alltypessmall.id, functional_hbase.alltypessmall.bigint_col, functional_hbase.alltypessmall.bool_col, functional_hbase.alltypessmall.date_string_col, functional_hbase.alltypessmall.double_col, functional_hbase.alltypessmall.float_col, functional_hbase.alltypessmall.int_col, functional_hbase.alltypessmall.month, functional_hbase.alltypessmall.smallint_col, functional_hbase.alltypessmall.string_col, functional_hbase.alltypessmall.timestamp_col, functional_hbase.alltypessmall.tinyint_col, functional_hbase.alltypessmall.year
|  mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
01:EXCHANGE [UNPARTITIONED]
|  mem-estimate=16.00KB mem-reservation=0B thread-reservation=0
|  tuple-ids=0 row-size=89B cardinality=50
|  in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
Per-Host Resources: mem-estimate=4.00KB mem-reservation=0B thread-reservation=1
00:SCAN HBASE [functional_hbase.alltypessmall]
   stored statistics:
     table: rows=100
     columns: all
   mem-estimate=4.00KB mem-reservation=0B thread-reservation=0
   tuple-ids=0 row-size=89B cardinality=50
   in pipelines: 00(GETNEXT)
{code}
-- This message was sent by Atlassian Jira (v8.20.1#820001)
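The numbers in the two plans line up once the exchange estimate is isolated: in both versions, fragment F01's per-host mem-estimate is the 4.00MB sink estimate plus the exchange estimate, so the inflated scan cardinality (28.57K instead of 50, against stats of rows=100) accounts for the whole 5.08MB-vs-4.02MB mismatch. A quick check of that reading of the plan output, assuming the per-host figure here is just sink plus exchange:
{code:python}
# Sanity check: per-host mem-estimate of F01 = sink estimate + exchange estimate.
MB = 1024.0 * 1024.0
KB = 1024.0

sink = 4.00 * MB                 # PLAN-ROOT SINK mem-estimate, same in both plans
actual_exchange = 1.08 * MB      # exchange estimate with cardinality=28.57K
expected_exchange = 16.00 * KB   # exchange estimate with cardinality=50

print('actual   F01: %.2f MB' % ((sink + actual_exchange) / MB))    # 5.08 MB
print('expected F01: %.2f MB' % ((sink + expected_exchange) / MB))  # 4.02 MB
{code}
Since the stored statistics say the table has 100 rows, the 28.57K scan cardinality, and with it the 1.08MB exchange estimate, is the suspect value.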
[jira] [Created] (IMPALA-11122) Handle false "Corrupted stats" warnings
Qifan Chen created IMPALA-11122:
---
Summary: Handle false "Corrupted stats" warnings
Key: IMPALA-11122
URL: https://issues.apache.org/jira/browse/IMPALA-11122
Project: IMPALA
Issue Type: Improvement
Reporter: Qifan Chen

In at least one Parquet implementation, the writer can produce Parquet files in which a row group contains 0 rows. However, such a file still contains footer metadata, so it is not 0 bytes. As a result, the SHOW TABLE STATS report can show 0 in the #Rows column for a table while the Size column shows a positive number of bytes.

-- This message was sent by Atlassian Jira (v8.20.1#820001)
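A quick way to see how such files look from the outside is to read only the footer metadata: the reported row count can be 0 (overall, or for individual row groups) while the file size stays positive. A small sketch using pyarrow, which is an assumption here (Impala's own stats-validation code is not shown), that flags files matching that shape:
{code:python}
# Flag Parquet files whose footer reports 0 rows even though the file itself
# is non-empty; this is the shape that can trigger a false "Corrupted stats"
# warning.
import os
import pyarrow.parquet as pq

def zero_rows_but_nonempty(path):
    md = pq.ParquetFile(path).metadata
    empty_groups = [i for i in range(md.num_row_groups)
                    if md.row_group(i).num_rows == 0]
    flagged = md.num_rows == 0 and os.path.getsize(path) > 0
    return flagged, empty_groups

# 'part-00000.parquet' is a hypothetical file name for illustration.
flagged, groups = zero_rows_but_nonempty('part-00000.parquet')
if flagged:
    print('0 rows but non-empty file; empty row groups:', groups)
{code}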
[jira] [Created] (IMPALA-11090) Need a method to specify the current number of executors for an executor group
Qifan Chen created IMPALA-11090:
---
Summary: Need a method to specify the current number of executors for an executor group
Key: IMPALA-11090
URL: https://issues.apache.org/jira/browse/IMPALA-11090
Project: IMPALA
Issue Type: Improvement
Reporter: Qifan Chen

Currently, impalad accepts -num_expected_executors for the expected number of executors in a group. This flag is very useful for testing the auto-scaling feature (IMPALA-10992). It would be very useful to also accept the current number of executors as a new impalad parameter: both the current and the expected number of executors are important parameters for an executor group.

-- This message was sent by Atlassian Jira (v8.20.1#820001)
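For illustration only, the existing startup flag and the proposed one side by side, as a test harness might pass them. -num_expected_executors exists today; -num_current_executors is purely a hypothetical name for the parameter this Jira asks for:
{code:python}
# Sketch of impalad startup args in a test harness.
# -num_expected_executors exists today; -num_current_executors is the
# *proposed* (hypothetical) flag, named here only for illustration.
def impalad_args(expected_executors, current_executors=None):
    args = ['-num_expected_executors=%d' % expected_executors]
    if current_executors is not None:
        args.append('-num_current_executors=%d' % current_executors)  # proposed
    return args

print(impalad_args(expected_executors=10, current_executors=3))
{code}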
[jira] [Resolved] (IMPALA-11006) Impalad crashes during query cancel tests
[ https://issues.apache.org/jira/browse/IMPALA-11006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen resolved IMPALA-11006. - Fix Version/s: Impala 4.0.1 Resolution: Fixed > Impalad crashes during query cancel tests > - > > Key: IMPALA-11006 > URL: https://issues.apache.org/jira/browse/IMPALA-11006 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Major > Fix For: Impala 4.0.1 > > > The following stack trace was observed in a core generated during S3 build. > {quote}Thread 485 (crashed) > 0 libc-2.17.so + 0x351f7 > rax = 0x rdx = 0x0006 > rcx = 0x rbx = 0x0004 > rsi = 0x6698 rdi = 0x144f > rbp = 0x7f89466b5220 rsp = 0x7f89466b4ea8 > r8 = 0xr9 = 0x7f89466b4d20 > r10 = 0x0008 r11 = 0x0202 > r12 = 0x07ea0100 r13 = 0x005a > r14 = 0x07ea0104 r15 = 0x07e98720 > rip = 0x7f8a65f091f7 > Found by: given as instruction pointer in context > 1 impalad!google::LogMessage::Flush() + 0x1eb > rbp = 0x7f89466b5350 rsp = 0x7f89466b5230 > rip = 0x056aa19b > Found by: previous frame's frame pointer > 2 impalad!google::LogMessageFatal::~LogMessageFatal() + 0x9 > rbx = 0x0001 rbp = 0x7f89466b53f0 > rsp = 0x7f89466b52d0 r12 = 0x07ea7638 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x056add99 > Found by: call frame info > 3 impalad!impala::ClientRequestState::SetCreateTableAsSelectResultSet() > [client-request-state.cc : 1540 + 0xf] > rbx = 0x0001 rbp = 0x7f89466b53f0 > rsp = 0x7f89466b52e0 r12 = 0x07ea7638 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0286fef1 > Found by: call frame info > 4 impalad!impala::ClientRequestState::WaitInternal() > [client-request-state.cc : 1083 + 0xf] > rbx = 0x0001 rbp = 0x7f89466b5550 > rsp = 0x7f89466b5400 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0286ab3b > Found by: call frame info > 5 impalad!impala::ClientRequestState::Wait() [client-request-state.cc : > 1010 + 0x19] > rbx = 0x7f89466b5b28 rbp = 0x7f89466b5640 > rsp = 0x7f89466b5560 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0286a030 > Found by: call frame info > 6 impalad!boost::_mfi::mf0 impala::ClientRequestState>::operator()(impala::ClientRequestState*) const > [mem_fn_template.hpp : 49 + 0x5] > rbx = 0x7f89466b5b28 rbp = 0x7f89466b5660 > rsp = 0x7f89466b5650 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0287faed > Found by: call frame info > 7 impalad!void > boost::_bi::list1 > >::operator(), > boost::_bi::list0>(boost::_bi::type, boost::_mfi::mf0 impala::ClientRequestState>&, boost::_bi::list0&, int) [bind.hpp : 259 + 0x35] > rbx = 0x7f89466b5b28 rbp = 0x7f89466b56a0 > rsp = 0x7f89466b5670 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0287f259 > Found by: call frame info > 8 impalad!boost::_bi::bind_t impala::ClientRequestState>, > boost::_bi::list1 > > >::operator()() [bind.hpp : 1222 + 0x22] > rbx = 0x6698 rbp = 0x7f89466b56f0 > rsp = 0x7f89466b56b0 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0287e2f3 > Found by: call frame info > 9 > impalad!boost::detail::function::void_function_obj_invoker0 boost::_mfi::mf0, > boost::_bi::list1 > >, > void>::invoke(boost::detail::function::function_buffer&) > [function_template.hpp : 159 + 0xc] > rbx = 0x6698 rbp = 0x7f89466b5720 > rsp = 0x7f89466b5700 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0287d07e > Found by: call frame info > 10 
impalad!boost::function0::operator()() const [function_template.hpp > : 770 + 0x1d] > rbx = 0x6698 rbp = 0x7f89466b5760 > rsp = 0x7f89466b5730 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 =
[jira] [Commented] (IMPALA-11006) Impalad crashes during query cancel tests
[ https://issues.apache.org/jira/browse/IMPALA-11006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17445375#comment-17445375 ] Qifan Chen commented on IMPALA-11006: - To reproduce, run the following script and hit control-C after the message showing the query monitoring URL is displayed. drop table if exists ctas_cancel; set debug_action=CRS_DELAY_BEFORE_CATALOG_OP_EXEC:SLEEP@1; create table ctas_cancel primary key (l_orderkey, l_partkey, l_suppkey, l_linenumber) partition by hash partitions 3 stored as kudu as select * from tpch_kudu.lineitem order by l_orderkey; > Impalad crashes during query cancel tests > - > > Key: IMPALA-11006 > URL: https://issues.apache.org/jira/browse/IMPALA-11006 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Major > > The following stack trace was observed in a core generated during S3 build. > {quote}Thread 485 (crashed) > 0 libc-2.17.so + 0x351f7 > rax = 0x rdx = 0x0006 > rcx = 0x rbx = 0x0004 > rsi = 0x6698 rdi = 0x144f > rbp = 0x7f89466b5220 rsp = 0x7f89466b4ea8 > r8 = 0xr9 = 0x7f89466b4d20 > r10 = 0x0008 r11 = 0x0202 > r12 = 0x07ea0100 r13 = 0x005a > r14 = 0x07ea0104 r15 = 0x07e98720 > rip = 0x7f8a65f091f7 > Found by: given as instruction pointer in context > 1 impalad!google::LogMessage::Flush() + 0x1eb > rbp = 0x7f89466b5350 rsp = 0x7f89466b5230 > rip = 0x056aa19b > Found by: previous frame's frame pointer > 2 impalad!google::LogMessageFatal::~LogMessageFatal() + 0x9 > rbx = 0x0001 rbp = 0x7f89466b53f0 > rsp = 0x7f89466b52d0 r12 = 0x07ea7638 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x056add99 > Found by: call frame info > 3 impalad!impala::ClientRequestState::SetCreateTableAsSelectResultSet() > [client-request-state.cc : 1540 + 0xf] > rbx = 0x0001 rbp = 0x7f89466b53f0 > rsp = 0x7f89466b52e0 r12 = 0x07ea7638 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0286fef1 > Found by: call frame info > 4 impalad!impala::ClientRequestState::WaitInternal() > [client-request-state.cc : 1083 + 0xf] > rbx = 0x0001 rbp = 0x7f89466b5550 > rsp = 0x7f89466b5400 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0286ab3b > Found by: call frame info > 5 impalad!impala::ClientRequestState::Wait() [client-request-state.cc : > 1010 + 0x19] > rbx = 0x7f89466b5b28 rbp = 0x7f89466b5640 > rsp = 0x7f89466b5560 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0286a030 > Found by: call frame info > 6 impalad!boost::_mfi::mf0 impala::ClientRequestState>::operator()(impala::ClientRequestState*) const > [mem_fn_template.hpp : 49 + 0x5] > rbx = 0x7f89466b5b28 rbp = 0x7f89466b5660 > rsp = 0x7f89466b5650 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0287faed > Found by: call frame info > 7 impalad!void > boost::_bi::list1 > >::operator(), > boost::_bi::list0>(boost::_bi::type, boost::_mfi::mf0 impala::ClientRequestState>&, boost::_bi::list0&, int) [bind.hpp : 259 + 0x35] > rbx = 0x7f89466b5b28 rbp = 0x7f89466b56a0 > rsp = 0x7f89466b5670 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0287f259 > Found by: call frame info > 8 impalad!boost::_bi::bind_t impala::ClientRequestState>, > boost::_bi::list1 > > >::operator()() [bind.hpp : 1222 + 0x22] > rbx = 0x6698 rbp = 0x7f89466b56f0 > rsp = 0x7f89466b56b0 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0287e2f3 > Found by: call frame info > 9 > 
impalad!boost::detail::function::void_function_obj_invoker0 boost::_mfi::mf0, > boost::_bi::list1 > >, > void>::invoke(boost::detail::function::function_buffer&) > [function_template.hpp : 159 + 0xc] > rbx = 0x6698 rbp = 0x7f89466b5720 > rsp = 0x7f89466b5700 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 >
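The Ctrl-C in the recipe above just needs to land while the CTAS is still running; a driver can do the same thing programmatically by cancelling the in-flight operation. A sketch using impyla's async execute and cancel, assuming a local dev cluster with the tpch_kudu data loaded (the timing is best-effort, not a guaranteed reproduction):
{code:python}
# Programmatic stand-in for the manual Ctrl-C reproduction: start the slow
# CTAS asynchronously, then cancel it mid-flight.
import time
from impala.dbapi import connect

conn = connect(host='localhost', port=21050)  # assumed local dev cluster
cur = conn.cursor()

cur.execute('drop table if exists ctas_cancel')
# The recipe's unquoted "set debug_action=..." works in impala-shell, which
# handles SET client-side; over HS2 the value is quoted here so it parses.
cur.execute("SET DEBUG_ACTION='CRS_DELAY_BEFORE_CATALOG_OP_EXEC:SLEEP@1'")
cur.execute_async(
    'create table ctas_cancel primary key (l_orderkey, l_partkey, l_suppkey, '
    'l_linenumber) partition by hash partitions 3 stored as kudu as '
    'select * from tpch_kudu.lineitem order by l_orderkey')

time.sleep(2)           # let the query get past registration
cur.cancel_operation()  # plays the role of Ctrl-C in the recipe above
{code}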
[jira] [Created] (IMPALA-11006) Impalad crashes during query cancel tests
Qifan Chen created IMPALA-11006: --- Summary: Impalad crashes during query cancel tests Key: IMPALA-11006 URL: https://issues.apache.org/jira/browse/IMPALA-11006 Project: IMPALA Issue Type: Bug Components: Backend Reporter: Qifan Chen The following stack trace was observed in a core generated during S3 build. {quote}Thread 485 (crashed) 0 libc-2.17.so + 0x351f7 rax = 0x rdx = 0x0006 rcx = 0x rbx = 0x0004 rsi = 0x6698 rdi = 0x144f rbp = 0x7f89466b5220 rsp = 0x7f89466b4ea8 r8 = 0xr9 = 0x7f89466b4d20 r10 = 0x0008 r11 = 0x0202 r12 = 0x07ea0100 r13 = 0x005a r14 = 0x07ea0104 r15 = 0x07e98720 rip = 0x7f8a65f091f7 Found by: given as instruction pointer in context 1 impalad!google::LogMessage::Flush() + 0x1eb rbp = 0x7f89466b5350 rsp = 0x7f89466b5230 rip = 0x056aa19b Found by: previous frame's frame pointer 2 impalad!google::LogMessageFatal::~LogMessageFatal() + 0x9 rbx = 0x0001 rbp = 0x7f89466b53f0 rsp = 0x7f89466b52d0 r12 = 0x07ea7638 r13 = 0x7f89556db190 r14 = 0x16630d20 r15 = 0x002c rip = 0x056add99 Found by: call frame info 3 impalad!impala::ClientRequestState::SetCreateTableAsSelectResultSet() [client-request-state.cc : 1540 + 0xf] rbx = 0x0001 rbp = 0x7f89466b53f0 rsp = 0x7f89466b52e0 r12 = 0x07ea7638 r13 = 0x7f89556db190 r14 = 0x16630d20 r15 = 0x002c rip = 0x0286fef1 Found by: call frame info 4 impalad!impala::ClientRequestState::WaitInternal() [client-request-state.cc : 1083 + 0xf] rbx = 0x0001 rbp = 0x7f89466b5550 rsp = 0x7f89466b5400 r12 = 0x091cb7a0 r13 = 0x7f89556db190 r14 = 0x16630d20 r15 = 0x002c rip = 0x0286ab3b Found by: call frame info 5 impalad!impala::ClientRequestState::Wait() [client-request-state.cc : 1010 + 0x19] rbx = 0x7f89466b5b28 rbp = 0x7f89466b5640 rsp = 0x7f89466b5560 r12 = 0x091cb7a0 r13 = 0x7f89556db190 r14 = 0x16630d20 r15 = 0x002c rip = 0x0286a030 Found by: call frame info 6 impalad!boost::_mfi::mf0::operator()(impala::ClientRequestState*) const [mem_fn_template.hpp : 49 + 0x5] rbx = 0x7f89466b5b28 rbp = 0x7f89466b5660 rsp = 0x7f89466b5650 r12 = 0x091cb7a0 r13 = 0x7f89556db190 r14 = 0x16630d20 r15 = 0x002c rip = 0x0287faed Found by: call frame info 7 impalad!void boost::_bi::list1 >::operator(), boost::_bi::list0>(boost::_bi::type, boost::_mfi::mf0&, boost::_bi::list0&, int) [bind.hpp : 259 + 0x35] rbx = 0x7f89466b5b28 rbp = 0x7f89466b56a0 rsp = 0x7f89466b5670 r12 = 0x091cb7a0 r13 = 0x7f89556db190 r14 = 0x16630d20 r15 = 0x002c rip = 0x0287f259 Found by: call frame info 8 impalad!boost::_bi::bind_t, boost::_bi::list1 > >::operator()() [bind.hpp : 1222 + 0x22] rbx = 0x6698 rbp = 0x7f89466b56f0 rsp = 0x7f89466b56b0 r12 = 0x091cb7a0 r13 = 0x7f89556db190 r14 = 0x16630d20 r15 = 0x002c rip = 0x0287e2f3 Found by: call frame info 9 impalad!boost::detail::function::void_function_obj_invoker0, boost::_bi::list1 > >, void>::invoke(boost::detail::function::function_buffer&) [function_template.hpp : 159 + 0xc] rbx = 0x6698 rbp = 0x7f89466b5720 rsp = 0x7f89466b5700 r12 = 0x091cb7a0 r13 = 0x7f89556db190 r14 = 0x16630d20 r15 = 0x002c rip = 0x0287d07e Found by: call frame info 10 impalad!boost::function0::operator()() const [function_template.hpp : 770 + 0x1d] rbx = 0x6698 rbp = 0x7f89466b5760 rsp = 0x7f89466b5730 r12 = 0x091cb7a0 r13 = 0x7f89556db190 r14 = 0x16630d20 r15 = 0x002c rip = 0x022e55c0 Found by: call frame info 11 impalad!impala::Thread::SuperviseThread(std::__cxx11::basic_string, std::allocator > const&, std::__cxx11::basic_string, std::allocator > const&, boost::function, impala::ThreadDebugInfo const*, impala::Promise*) [thread.cc : 360 + 0xf] rbx = 0x6698 rbp = 
0x7f89466b5af0 rsp = 0x7f89466b5770 r12 = 0x091cb7a0 r13 = 0x7f89556db190 r14 = 0x16630d20 r15 = 0x002c rip = 0x02aab96a Found by: call frame info 12 impalad!void
[jira] [Assigned] (IMPALA-11006) Impalad crashes during query cancel tests
[ https://issues.apache.org/jira/browse/IMPALA-11006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen reassigned IMPALA-11006: --- Assignee: Qifan Chen > Impalad crashes during query cancel tests > - > > Key: IMPALA-11006 > URL: https://issues.apache.org/jira/browse/IMPALA-11006 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Major > > The following stack trace was observed in a core generated during S3 build. > {quote}Thread 485 (crashed) > 0 libc-2.17.so + 0x351f7 > rax = 0x rdx = 0x0006 > rcx = 0x rbx = 0x0004 > rsi = 0x6698 rdi = 0x144f > rbp = 0x7f89466b5220 rsp = 0x7f89466b4ea8 > r8 = 0xr9 = 0x7f89466b4d20 > r10 = 0x0008 r11 = 0x0202 > r12 = 0x07ea0100 r13 = 0x005a > r14 = 0x07ea0104 r15 = 0x07e98720 > rip = 0x7f8a65f091f7 > Found by: given as instruction pointer in context > 1 impalad!google::LogMessage::Flush() + 0x1eb > rbp = 0x7f89466b5350 rsp = 0x7f89466b5230 > rip = 0x056aa19b > Found by: previous frame's frame pointer > 2 impalad!google::LogMessageFatal::~LogMessageFatal() + 0x9 > rbx = 0x0001 rbp = 0x7f89466b53f0 > rsp = 0x7f89466b52d0 r12 = 0x07ea7638 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x056add99 > Found by: call frame info > 3 impalad!impala::ClientRequestState::SetCreateTableAsSelectResultSet() > [client-request-state.cc : 1540 + 0xf] > rbx = 0x0001 rbp = 0x7f89466b53f0 > rsp = 0x7f89466b52e0 r12 = 0x07ea7638 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0286fef1 > Found by: call frame info > 4 impalad!impala::ClientRequestState::WaitInternal() > [client-request-state.cc : 1083 + 0xf] > rbx = 0x0001 rbp = 0x7f89466b5550 > rsp = 0x7f89466b5400 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0286ab3b > Found by: call frame info > 5 impalad!impala::ClientRequestState::Wait() [client-request-state.cc : > 1010 + 0x19] > rbx = 0x7f89466b5b28 rbp = 0x7f89466b5640 > rsp = 0x7f89466b5560 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0286a030 > Found by: call frame info > 6 impalad!boost::_mfi::mf0 impala::ClientRequestState>::operator()(impala::ClientRequestState*) const > [mem_fn_template.hpp : 49 + 0x5] > rbx = 0x7f89466b5b28 rbp = 0x7f89466b5660 > rsp = 0x7f89466b5650 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0287faed > Found by: call frame info > 7 impalad!void > boost::_bi::list1 > >::operator(), > boost::_bi::list0>(boost::_bi::type, boost::_mfi::mf0 impala::ClientRequestState>&, boost::_bi::list0&, int) [bind.hpp : 259 + 0x35] > rbx = 0x7f89466b5b28 rbp = 0x7f89466b56a0 > rsp = 0x7f89466b5670 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0287f259 > Found by: call frame info > 8 impalad!boost::_bi::bind_t impala::ClientRequestState>, > boost::_bi::list1 > > >::operator()() [bind.hpp : 1222 + 0x22] > rbx = 0x6698 rbp = 0x7f89466b56f0 > rsp = 0x7f89466b56b0 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0287e2f3 > Found by: call frame info > 9 > impalad!boost::detail::function::void_function_obj_invoker0 boost::_mfi::mf0, > boost::_bi::list1 > >, > void>::invoke(boost::detail::function::function_buffer&) > [function_template.hpp : 159 + 0xc] > rbx = 0x6698 rbp = 0x7f89466b5720 > rsp = 0x7f89466b5700 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x0287d07e > Found by: call frame info > 10 impalad!boost::function0::operator()() const 
[function_template.hpp > : 770 + 0x1d] > rbx = 0x6698 rbp = 0x7f89466b5760 > rsp = 0x7f89466b5730 r12 = 0x091cb7a0 > r13 = 0x7f89556db190 r14 = 0x16630d20 > r15 = 0x002c rip = 0x022e55c0 > Found by: call
[jira] [Resolved] (IMPALA-10967) Load data should handle AWS NLB-type timeout
[ https://issues.apache.org/jira/browse/IMPALA-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Qifan Chen resolved IMPALA-10967.
-
Fix Version/s: Impala 4.1.0
Resolution: Fixed

> Load data should handle AWS NLB-type timeout
>
>
> Key: IMPALA-10967
> URL: https://issues.apache.org/jira/browse/IMPALA-10967
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Qifan Chen
> Assignee: Qifan Chen
> Priority: Major
> Fix For: Impala 4.1.0
>
>
> Currently, since Impala handles the load data statement request in a single thread, the client can experience an AWS NLB-type timeout (see IMPALA-10811) if the data loading takes more than 350s to complete.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (IMPALA-10992) Planner changes for estimate peak memory.
[ https://issues.apache.org/jira/browse/IMPALA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Qifan Chen reassigned IMPALA-10992:
---
Assignee: Qifan Chen

> Planner changes for estimate peak memory.
> -
>
> Key: IMPALA-10992
> URL: https://issues.apache.org/jira/browse/IMPALA-10992
> Project: IMPALA
> Issue Type: Task
> Reporter: Amogh Margoor
> Assignee: Qifan Chen
> Priority: Major
>
> To run large queries on a larger executor group that maps to a different resource group, we need to identify the large queries at compile time. In the first phase, this identification can use peak memory estimation to classify large queries. This Jira tracks that support.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IMPALA-10811) RPC to submit query getting stuck for AWS NLB forever.
[ https://issues.apache.org/jira/browse/IMPALA-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Qifan Chen resolved IMPALA-10811.
-
Target Version: Impala 4.1.0
Resolution: Fixed

> RPC to submit query getting stuck for AWS NLB forever.
> --
>
> Key: IMPALA-10811
> URL: https://issues.apache.org/jira/browse/IMPALA-10811
> Project: IMPALA
> Issue Type: Bug
> Reporter: Amogh Margoor
> Assignee: Qifan Chen
> Priority: Major
> Attachments: profile+(13).txt
>
> The initial RPC to submit a query and fetch the query handle can take quite a long time to return, as it can perform various planning and submission operations that involve executing catalog operations like Rename or Alter Table Recover Partition, which can take time on tables with many partitions ([https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92]). Attached is the profile of one such DDL query (with a few fields hidden).
> These RPCs are:
> 1. Beeswax: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57]
> 2. HS2: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462]
>
> One of the side effects of such an RPC taking a long time is that clients such as impala-shell using an AWS NLB can get stuck forever. The reason is that the NLB tracks and closes connections after 350s, and this cannot be configured. But after closing the connection it doesn't send a TCP RST to the client; only when the client tries to send data or packets does the NLB issue a TCP RST to indicate the connection is not alive. Documentation is here: [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout]. Hence an impala-shell waiting for the RPC to return gets stuck indefinitely.
> Hence, we may need to evaluate techniques for these RPCs to return the query handle after
> # Creating Driver: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1150]
> # Register Query: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168]
> and then execute the later parts of the RPC asynchronously in a different thread without blocking the RPC. That way clients can get the query handle and poll it for state and results.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
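The proposal in the last paragraph is a standard pattern: do only registration synchronously, hand back a handle, and push the slow planning and catalog work onto another thread that the client then polls. A language-agnostic sketch of that shape in plain Python threading, not Impala's actual implementation:
{code:python}
# Shape of the proposed fix: return a query handle right after registration
# and run the slow part of submission on a worker thread the client can poll.
import threading
import uuid

class Server(object):
    def __init__(self):
        self.state = {}  # handle -> 'REGISTERED' | 'RUNNING' | 'DONE'

    def submit(self, slow_submission_work):
        handle = str(uuid.uuid4())      # "register" the query
        self.state[handle] = 'REGISTERED'
        def run():
            self.state[handle] = 'RUNNING'
            slow_submission_work()      # e.g. catalog ops, planning
            self.state[handle] = 'DONE'
        threading.Thread(target=run).start()
        return handle                   # the RPC itself returns quickly

    def poll(self, handle):
        return self.state[handle]

server = Server()
h = server.submit(lambda: None)  # stand-in for a slow DDL submission
print(server.poll(h))
{code}
The point is that the initial RPC stays well under the NLB's fixed 350s idle limit, while the client keeps the connection non-idle by polling for state and results.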
[jira] [Commented] (IMPALA-10811) RPC to submit query getting stuck for AWS NLB forever.
[ https://issues.apache.org/jira/browse/IMPALA-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17433755#comment-17433755 ]

Qifan Chen commented on IMPALA-10811:
-

The major work was done in commit 975883c47035843398ee99a21fa132f67a0d4954. The remaining work on load data is separately tracked in IMPALA-10967.

> RPC to submit query getting stuck for AWS NLB forever.
> --
>
> Key: IMPALA-10811
> URL: https://issues.apache.org/jira/browse/IMPALA-10811
> Project: IMPALA
> Issue Type: Bug
> Reporter: Amogh Margoor
> Assignee: Qifan Chen
> Priority: Major
> Attachments: profile+(13).txt
>
> The initial RPC to submit a query and fetch the query handle can take quite a long time to return, as it can perform various planning and submission operations that involve executing catalog operations like Rename or Alter Table Recover Partition, which can take time on tables with many partitions ([https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92]). Attached is the profile of one such DDL query (with a few fields hidden).
> These RPCs are:
> 1. Beeswax: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57]
> 2. HS2: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462]
>
> One of the side effects of such an RPC taking a long time is that clients such as impala-shell using an AWS NLB can get stuck forever. The reason is that the NLB tracks and closes connections after 350s, and this cannot be configured. But after closing the connection it doesn't send a TCP RST to the client; only when the client tries to send data or packets does the NLB issue a TCP RST to indicate the connection is not alive. Documentation is here: [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout]. Hence an impala-shell waiting for the RPC to return gets stuck indefinitely.
> Hence, we may need to evaluate techniques for these RPCs to return the query handle after
> # Creating Driver: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1150]
> # Register Query: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168]
> and then execute the later parts of the RPC asynchronously in a different thread without blocking the RPC. That way clients can get the query handle and poll it for state and results.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (IMPALA-10811) RPC to submit query getting stuck for AWS NLB forever.
[ https://issues.apache.org/jira/browse/IMPALA-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Qifan Chen reassigned IMPALA-10811:
---
Assignee: Qifan Chen (was: Joe McDonnell)

> RPC to submit query getting stuck for AWS NLB forever.
> --
>
> Key: IMPALA-10811
> URL: https://issues.apache.org/jira/browse/IMPALA-10811
> Project: IMPALA
> Issue Type: Bug
> Reporter: Amogh Margoor
> Assignee: Qifan Chen
> Priority: Major
> Attachments: profile+(13).txt
>
> The initial RPC to submit a query and fetch the query handle can take quite a long time to return, as it can perform various planning and submission operations that involve executing catalog operations like Rename or Alter Table Recover Partition, which can take time on tables with many partitions ([https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92]). Attached is the profile of one such DDL query (with a few fields hidden).
> These RPCs are:
> 1. Beeswax: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57]
> 2. HS2: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462]
>
> One of the side effects of such an RPC taking a long time is that clients such as impala-shell using an AWS NLB can get stuck forever. The reason is that the NLB tracks and closes connections after 350s, and this cannot be configured. But after closing the connection it doesn't send a TCP RST to the client; only when the client tries to send data or packets does the NLB issue a TCP RST to indicate the connection is not alive. Documentation is here: [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout]. Hence an impala-shell waiting for the RPC to return gets stuck indefinitely.
> Hence, we may need to evaluate techniques for these RPCs to return the query handle after
> # Creating Driver: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1150]
> # Register Query: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168]
> and then execute the later parts of the RPC asynchronously in a different thread without blocking the RPC. That way clients can get the query handle and poll it for state and results.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (IMPALA-10967) Load data should handle AWS NLB-type timeout
[ https://issues.apache.org/jira/browse/IMPALA-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen reassigned IMPALA-10967: --- Assignee: Qifan Chen > Load data should handle AWS NLB-type timeout > > > Key: IMPALA-10967 > URL: https://issues.apache.org/jira/browse/IMPALA-10967 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Major > > Currently, since Impala handles the load data statement request in a single > thread, the client can experience an AWS NLB-type timeout (see IMPALA-10811) if > the data loading takes more than 350s to complete. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IMPALA-10967) Load data should handle AWS NLB-type timeout
Qifan Chen created IMPALA-10967: --- Summary: Load data should handle AWS NLB-type timeout Key: IMPALA-10967 URL: https://issues.apache.org/jira/browse/IMPALA-10967 Project: IMPALA Issue Type: Improvement Components: Backend Reporter: Qifan Chen Currently, since Impala handles the load data statement request in a single thread, the client can experience an AWS NLB-type timeout (see IMPALA-10811) if the data loading takes more than 350s to complete. -- This message was sent by Atlassian Jira (v8.3.4#803005)
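The mechanism is the same one described in IMPALA-10811. The toy sketch below (scaled-down numbers and invented names, not Impala code) contrasts a single blocking LOAD DATA RPC, whose idle time equals the whole load duration, with a polled variant whose short poll RPCs keep resetting the proxy's idle timer.

{code:python}
import threading
import time

# Toy model: an AWS NLB resets its idle timer on every packet and closes the
# flow after 350s of silence (not configurable).  Scaled-down stand-ins:
IDLE_TIMEOUT_S = 3.0    # models the NLB's 350s idle timeout
LOAD_DURATION_S = 4.0   # models a LOAD DATA that outlives the timeout

def load_data():
    time.sleep(LOAD_DURATION_S)  # models moving files into the table

# Blocking style: one RPC spans the whole load, so the connection is silent
# for LOAD_DURATION_S and the proxy would reset it mid-statement.
start = time.time()
load_data()
print("blocking: connection idle for %.1fs (limit %.1fs) -> reset"
      % (time.time() - start, IDLE_TIMEOUT_S))

# Polling style: the load runs on a worker thread; the client polls every
# second, so the gap between packets never approaches the idle timeout.
done = threading.Event()
threading.Thread(target=lambda: (load_data(), done.set())).start()
while not done.wait(timeout=1.0):  # each wait models one short poll RPC
    print("poll: load still running (idle timer reset)")
print("poll: load finished")
{code}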
[jira] [Assigned] (IMPALA-10927) TestFetchAndSpooling.test_rows_sent_counters is flaky in core-s3 based test
[ https://issues.apache.org/jira/browse/IMPALA-10927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen reassigned IMPALA-10927: --- Assignee: Qifan Chen > TestFetchAndSpooling.test_rows_sent_counters is flaky in core-s3 based test > --- > > Key: IMPALA-10927 > URL: https://issues.apache.org/jira/browse/IMPALA-10927 > Project: IMPALA > Issue Type: Improvement >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Major > > The RowsSentRate counter is seen with a value of 0; the error > report follows. A fix in a similar area was described in IMPALA-8957, where a delay > via the DEBUG_ACTION BPRS_BEFORE_ADD_BATCH was introduced. > {code:java} > query_test.test_fetch.TestFetchAndSpooling.test_rows_sent_counters[protocol: > beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > parquet/none] (from pytest) > Failing for the past 1 build (Since Failed#206 ) > Took 0.47 sec. > Error Message > query_test/test_fetch.py:101: in test_rows_sent_counters assert > float(rows_sent_rate.group(1)) > 0 E assert 0.0 > 0 E + where 0.0 = > float('0') E + where '0' = <built-in method group of _sre.SRE_Match > object at 0x7f9d22693030>(1) E + where <built-in method group of > _sre.SRE_Match object at 0x7f9d22693030> = <_sre.SRE_Match object at > 0x7f9d22693030>.group > Stacktrace > query_test/test_fetch.py:101: in test_rows_sent_counters > assert float(rows_sent_rate.group(1)) > 0 > E assert 0.0 > 0 > E + where 0.0 = float('0') > E + where '0' = <built-in method group of _sre.SRE_Match object at > 0x7f9d22693030>(1) > E + where <built-in method group of _sre.SRE_Match object at > 0x7f9d22693030> = <_sre.SRE_Match object at 0x7f9d22693030>.group > Standard Error > SET > client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table; > -- connecting to: localhost:21000 > -- connecting to localhost:21050 with impyla > -- 2021-09-10 04:41:56,852 INFO MainThread: Closing active operation > -- connecting to localhost:28000 with impyla > -- 2021-09-10 04:41:56,925 INFO MainThread: Closing active operation > -- connecting to localhost:11050 with impyla > -- 2021-09-10 04:41:56,939 INFO MainThread: Could not connect to ('::1', > 11050, 0, 0) > -- 2021-09-10 04:41:56,939 ERROR MainThread: Could not connect to any of > [('::1', 11050, 0, 0), ('127.0.0.1', 11050)] > -- 2021-09-10 04:41:56,939 INFO MainThread: HS2 FENG connection setup > failed, continuing...: Could not connect to any of [('::1', 11050, 0, 0), > ('127.0.0.1', 11050)] > SET > client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table; > SET batch_size=0; > SET num_nodes=0; > SET disable_codegen_rows_threshold=0; > SET disable_codegen=False; > SET abort_on_error=1; > SET debug_action=BPRS_BEFORE_ADD_ROWS:SLEEP@1000; > SET exec_single_node_rows_threshold=0; > -- executing against localhost:21000 > select id from functional.alltypes limit 10; > -- 2021-09-10 04:41:56,988 INFO MainThread: Started query > b04941a75e31:1da6c8eb > Options > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
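Why the counter can legitimately read 0: a sampled rate counter divides rows by elapsed sample time, and a query that finishes inside a single sampling period yields zero samples. The toy sketch below (invented constants and names, not Impala's counter implementation) reproduces the failing and passing cases; the injected server-side delay, which is what debug_action BPRS_BEFORE_ADD_ROWS:SLEEP@1000 provides in the test above, is what pushes the runtime past one sample.

{code:python}
import re

# Toy reproduction of the flaky assertion.  A sampled rate counter reads 0
# when the query finishes before the first sample is taken, so the test's
# float(rows_sent_rate.group(1)) > 0 check fails.  Constants are invented
# for illustration; this is not Impala's counter code.
SAMPLE_PERIOD_MS = 500

def sampled_rate(num_rows, runtime_ms):
    samples = runtime_ms // SAMPLE_PERIOD_MS
    if samples == 0:
        return 0.0  # query finished inside one sampling period
    return num_rows * 1000.0 / (samples * SAMPLE_PERIOD_MS)

def profile_line(rate):
    return "RowsSentRate: %.2f per second" % rate  # stand-in profile text

for runtime_ms in (10, 1010):  # bare run vs. run with an injected 1s sleep
    m = re.search(r"RowsSentRate: (\d+\.\d+)",
                  profile_line(sampled_rate(10, runtime_ms)))
    rate = float(m.group(1))
    print("%4d ms -> rate %.2f (%s)"
          % (runtime_ms, rate, "assert fails" if rate == 0.0 else "assert passes"))
{code}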
[jira] [Updated] (IMPALA-10927) TestFetchAndSpooling.test_rows_sent_counters is flaky in core-s3 based test
[ https://issues.apache.org/jira/browse/IMPALA-10927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen updated IMPALA-10927: Description: The RowsSentRate counter is seen with a value of 0; the error report follows. A fix in a similar area was described in IMPALA-8957, where a delay via the DEBUG_ACTION BPRS_BEFORE_ADD_BATCH was introduced. {code:java} query_test.test_fetch.TestFetchAndSpooling.test_rows_sent_counters[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] (from pytest) Failing for the past 1 build (Since Failed#206 ) Took 0.47 sec. Error Message query_test/test_fetch.py:101: in test_rows_sent_counters assert float(rows_sent_rate.group(1)) > 0 E assert 0.0 > 0 E + where 0.0 = float('0') E + where '0' = <built-in method group of _sre.SRE_Match object at 0x7f9d22693030>(1) E + where <built-in method group of _sre.SRE_Match object at 0x7f9d22693030> = <_sre.SRE_Match object at 0x7f9d22693030>.group Stacktrace query_test/test_fetch.py:101: in test_rows_sent_counters assert float(rows_sent_rate.group(1)) > 0 E assert 0.0 > 0 E + where 0.0 = float('0') E + where '0' = <built-in method group of _sre.SRE_Match object at 0x7f9d22693030>(1) E + where <built-in method group of _sre.SRE_Match object at 0x7f9d22693030> = <_sre.SRE_Match object at 0x7f9d22693030>.group Standard Error SET client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table; -- connecting to: localhost:21000 -- connecting to localhost:21050 with impyla -- 2021-09-10 04:41:56,852 INFO MainThread: Closing active operation -- connecting to localhost:28000 with impyla -- 2021-09-10 04:41:56,925 INFO MainThread: Closing active operation -- connecting to localhost:11050 with impyla -- 2021-09-10 04:41:56,939 INFO MainThread: Could not connect to ('::1', 11050, 0, 0) -- 2021-09-10 04:41:56,939 ERROR MainThread: Could not connect to any of [('::1', 11050, 0, 0), ('127.0.0.1', 11050)] -- 2021-09-10 04:41:56,939 INFO MainThread: HS2 FENG connection setup failed, continuing...: Could not connect to any of [('::1', 11050, 0, 0), ('127.0.0.1', 11050)] SET client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table; SET batch_size=0; SET num_nodes=0; SET disable_codegen_rows_threshold=0; SET disable_codegen=False; SET abort_on_error=1; SET debug_action=BPRS_BEFORE_ADD_ROWS:SLEEP@1000; SET exec_single_node_rows_threshold=0; -- executing against localhost:21000 select id from functional.alltypes limit 10; -- 2021-09-10 04:41:56,988 INFO MainThread: Started query b04941a75e31:1da6c8eb Options {code} was: The RowsSentRate counter is seen with a value of 0; the error report follows. A fix in a similar area was described in IMPALA-8957, where a delay via the DEBUG_ACTION BPRS_BEFORE_ADD_BATCH was introduced. {code:java} query_test.test_fetch.TestFetchAndSpooling.test_rows_sent_counters[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] (from pytest) Failing for the past 1 build (Since Failed#206 ) Took 0.47 sec. 
Error Message query_test/test_fetch.py:101: in test_rows_sent_counters assert float(rows_sent_rate.group(1)) > 0 E assert 0.0 > 0 E + where 0.0 = float('0') E + where '0' = <built-in method group of _sre.SRE_Match object at 0x7f9d22693030>(1) E + where <built-in method group of _sre.SRE_Match object at 0x7f9d22693030> = <_sre.SRE_Match object at 0x7f9d22693030>.group Stacktrace query_test/test_fetch.py:101: in test_rows_sent_counters assert float(rows_sent_rate.group(1)) > 0 E assert 0.0 > 0 E + where 0.0 = float('0') E + where '0' = <built-in method group of _sre.SRE_Match object at 0x7f9d22693030>(1) E + where <built-in method group of _sre.SRE_Match object at 0x7f9d22693030> = <_sre.SRE_Match object at 0x7f9d22693030>.group Standard Error SET client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table; -- connecting to: localhost:21000 -- connecting to localhost:21050 with impyla -- 2021-09-10 04:41:56,852 INFO MainThread: Closing active operation -- connecting to localhost:28000 with impyla -- 2021-09-10 04:41:56,925 INFO MainThread: Closing active operation -- connecting to localhost:11050 with impyla -- 2021-09-10 04:41:56,939 INFO MainThread: Could not connect to ('::1', 11050, 0, 0) Traceback (most recent call last): File
[jira] [Updated] (IMPALA-10927) TestFetchAndSpooling.test_rows_sent_counters is flaky in core-s3 based test
[ https://issues.apache.org/jira/browse/IMPALA-10927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen updated IMPALA-10927: Summary: TestFetchAndSpooling.test_rows_sent_counters is flaky in core-s3 based test (was: TestFetchAndSpooling.test_rows_sent_counters is flaky in impala-cdpd-master-staging-core-s3 based test) > TestFetchAndSpooling.test_rows_sent_counters is flaky in core-s3 based test > --- > > Key: IMPALA-10927 > URL: https://issues.apache.org/jira/browse/IMPALA-10927 > Project: IMPALA > Issue Type: Improvement >Reporter: Qifan Chen >Priority: Major > > The RowsSentRate counter is seen with a value of 0; the error > report follows. A fix in a similar area was described in IMPALA-8957, where a delay > via the DEBUG_ACTION BPRS_BEFORE_ADD_BATCH was introduced. > {code:java} > query_test.test_fetch.TestFetchAndSpooling.test_rows_sent_counters[protocol: > beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > parquet/none] (from pytest) > Failing for the past 1 build (Since Failed#206 ) > Took 0.47 sec. > Error Message > query_test/test_fetch.py:101: in test_rows_sent_counters assert > float(rows_sent_rate.group(1)) > 0 E assert 0.0 > 0 E + where 0.0 = > float('0') E + where '0' = <built-in method group of _sre.SRE_Match > object at 0x7f9d22693030>(1) E + where <built-in method group of > _sre.SRE_Match object at 0x7f9d22693030> = <_sre.SRE_Match object at > 0x7f9d22693030>.group > Stacktrace > query_test/test_fetch.py:101: in test_rows_sent_counters > assert float(rows_sent_rate.group(1)) > 0 > E assert 0.0 > 0 > E + where 0.0 = float('0') > E + where '0' = <built-in method group of _sre.SRE_Match object at > 0x7f9d22693030>(1) > E + where <built-in method group of _sre.SRE_Match object at > 0x7f9d22693030> = <_sre.SRE_Match object at 0x7f9d22693030>.group > Standard Error > SET > client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table; > -- connecting to: localhost:21000 > -- connecting to localhost:21050 with impyla > -- 2021-09-10 04:41:56,852 INFO MainThread: Closing active operation > -- connecting to localhost:28000 with impyla > -- 2021-09-10 04:41:56,925 INFO MainThread: Closing active operation > -- connecting to localhost:11050 with impyla > -- 2021-09-10 04:41:56,939 INFO MainThread: Could not connect to ('::1', > 11050, 0, 0) > Traceback (most recent call last): > File > "/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/Impala-Toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p4/python/lib/python2.7/site-packages/thrift/transport/TSocket.py", > line 104, in open > handle.connect(sockaddr) > File > "/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/Impala-Toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/socket.py", > line 228, in meth > return getattr(self._sock,name)(*args) > error: [Errno 111] Connection refused > -- 2021-09-10 04:41:56,939 INFO MainThread: Could not connect to > ('127.0.0.1', 11050) > Traceback (most recent call last): > File > 
"/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/Impala-Toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/socket.py", > line 228, in meth > return getattr(self._sock,name)(*args) > error: [Errno 111] Connection refused > -- 2021-09-10 04:41:56,939 ERRORMainThread: Could not connect to any of > [('::1', 11050, 0, 0), ('127.0.0.1', 11050)] > -- 2021-09-10 04:41:56,939 INFO MainThread: HS2 FENG connection setup > failed, continuing...: Could not connect to any of [('::1', 11050, 0, 0), > ('127.0.0.1', 11050)] > SET > client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table; > SET batch_size=0; > SET num_nodes=0; > SET disable_codegen_rows_threshold=0; > SET disable_codegen=False; > SET abort_on_error=1; > SET debug_action=BPRS_BEFORE_ADD_ROWS:SLEEP@1000; > SET exec_single_node_rows_threshold=0; > -- executing against localhost:21000 > select id from functional.alltypes limit 10; > -- 2021-09-10 04:41:56,988 INFO MainThread: Started query > b04941a75e31:1da6c8eb > Options > {code}
[jira] [Created] (IMPALA-10927) TestFetchAndSpooling.test_rows_sent_counters is flaky in impala-cdpd-master-staging-core-s3 based test
Qifan Chen created IMPALA-10927: --- Summary: TestFetchAndSpooling.test_rows_sent_counters is flaky in impala-cdpd-master-staging-core-s3 based test Key: IMPALA-10927 URL: https://issues.apache.org/jira/browse/IMPALA-10927 Project: IMPALA Issue Type: Improvement Reporter: Qifan Chen The RowsSentRate counter is seen with a value of 0; the error report follows. A fix in a similar area was described in IMPALA-8957, where a delay via the DEBUG_ACTION BPRS_BEFORE_ADD_BATCH was introduced. {code:java} query_test.test_fetch.TestFetchAndSpooling.test_rows_sent_counters[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] (from pytest) Failing for the past 1 build (Since Failed#206 ) Took 0.47 sec. Error Message query_test/test_fetch.py:101: in test_rows_sent_counters assert float(rows_sent_rate.group(1)) > 0 E assert 0.0 > 0 E + where 0.0 = float('0') E + where '0' = <built-in method group of _sre.SRE_Match object at 0x7f9d22693030>(1) E + where <built-in method group of _sre.SRE_Match object at 0x7f9d22693030> = <_sre.SRE_Match object at 0x7f9d22693030>.group Stacktrace query_test/test_fetch.py:101: in test_rows_sent_counters assert float(rows_sent_rate.group(1)) > 0 E assert 0.0 > 0 E + where 0.0 = float('0') E + where '0' = <built-in method group of _sre.SRE_Match object at 0x7f9d22693030>(1) E + where <built-in method group of _sre.SRE_Match object at 0x7f9d22693030> = <_sre.SRE_Match object at 0x7f9d22693030>.group Standard Error SET client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table; -- connecting to: localhost:21000 -- connecting to localhost:21050 with impyla -- 2021-09-10 04:41:56,852 INFO MainThread: Closing active operation -- connecting to localhost:28000 with impyla -- 2021-09-10 04:41:56,925 INFO MainThread: Closing active operation -- connecting to localhost:11050 with impyla -- 2021-09-10 04:41:56,939 INFO MainThread: Could not connect to ('::1', 11050, 0, 0) Traceback (most recent call last): File "/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/Impala-Toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p4/python/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 104, in open handle.connect(sockaddr) File "/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/Impala-Toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/socket.py", line 228, in meth return getattr(self._sock,name)(*args) error: [Errno 111] Connection refused -- 2021-09-10 04:41:56,939 INFO MainThread: Could not connect to ('127.0.0.1', 11050) Traceback (most recent call last): File "/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/Impala-Toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p4/python/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 104, in open handle.connect(sockaddr) File "/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/Impala-Toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/socket.py", line 228, in meth return getattr(self._sock,name)(*args) error: [Errno 111] Connection refused -- 2021-09-10 04:41:56,939 ERROR MainThread: Could not connect to any of [('::1', 11050, 0, 0), ('127.0.0.1', 11050)] -- 2021-09-10 04:41:56,939 INFO MainThread: HS2 FENG connection setup failed, continuing...: Could not connect to any of [('::1', 11050, 0, 0), ('127.0.0.1', 11050)] SET 
client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table; SET batch_size=0; SET num_nodes=0; SET disable_codegen_rows_threshold=0; SET disable_codegen=False; SET abort_on_error=1; SET debug_action=BPRS_BEFORE_ADD_ROWS:SLEEP@1000; SET exec_single_node_rows_threshold=0; -- executing against localhost:21000 select id from functional.alltypes limit 10; -- 2021-09-10 04:41:56,988 INFO MainThread: Started query b04941a75e31:1da6c8eb Options {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)