[jira] [Commented] (IMPALA-11653) Identify and time out connections that are not from a supported Impala client more eagerly

2022-10-20 Thread Qifan Chen (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17621358#comment-17621358
 ] 

Qifan Chen commented on IMPALA-11653:
-

It seems a viable solution would be to augment connection_setup_pool, which is a 
ThreadPool<shared_ptr<TAcceptQueueEntry>>, as follows (see the sketch after this list).

1. Allow more TAcceptQueueEntry items in it, say 
accepted_cnxn_setup_thread_pool_size + 10. 
2. Record the wait time of each TAcceptQueueEntry in the queue;
3. Kick out the entries with the longest wait time when a new entry is to be queued.  
For each kicked-out entry, do a close() on it;
4. Remove entries from the queue after SetupConnection() completes normally so 
only the "super long waiting" items are in the queue.  

> Identify and time out connections that are not from a supported Impala client 
> more eagerly
> --
>
> Key: IMPALA-11653
> URL: https://issues.apache.org/jira/browse/IMPALA-11653
> Project: IMPALA
>  Issue Type: Improvement
>Affects Versions: Impala 4.1.0
>Reporter: Vincent Tran
>Assignee: Qifan Chen
>Priority: Major
> Attachments: simple_tcp_client.py
>
>
> When a tcp client opens a connection to an Impala client interface (hs2 or 
> beeswax), the connection is accepted immediately after the 3-way handshake 
> (SYN, SYN-ACK, ACK) and is queued for 
> *TAcceptQueueServer::SetupConnection()*.  However, if the client sends 
> nothing else, the ImpalaServer will block in 
> *apache::thrift::transport::TSocket::read()* until the client sends a RST/FIN 
> or until *sasl_connect_tcp_timeout_ms* elapses (which is, by default, 5 
> minutes).
> The connection setup thread stack trace can be observed below during this 
> period.
> {noformat}
> (gdb) bt
> #0  0x7f3b972ee20d in poll () from ./lib64/libc.so.6
> #1  0x02dcd5bc in apache::thrift::transport::TSocket::read(unsigned 
> char*, unsigned int) ()
> #2  0x02dd1803 in unsigned int 
> apache::thrift::transport::readAll(apache::thrift::transport::TSocket&,
>  unsigned char*, unsigned int) ()
> #3  0x01330cc9 in readAll (len=5, buf=0x7f3277ea4f8b "", 
> this=) at 
> ../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TTransport.h:121
> #4  apache::thrift::transport::TSaslTransport::receiveSaslMessage 
> (this=this@entry=0x278a96b0, status=status@entry=0x7f3277ea500c, 
> length=length@entry=0x7f3277ea5008) at TSaslTransport.cpp:259
> #5  0x0132db14 in 
> apache::thrift::transport::TSaslServerTransport::handleSaslStartMessage 
> (this=0x278a96b0) at TSaslServerTransport.cpp:95
> #6  0x01330e33 in 
> apache::thrift::transport::TSaslTransport::doSaslNegotiation 
> (this=0x278a96b0) at TSaslTransport.cpp:81
> #7  0x0132e723 in open (this=0x12e29750) at 
> ../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TBufferTransports.h:218
> #8  apache::thrift::transport::TSaslServerTransport::Factory::getTransport 
> (this=0xf825a70, trans=...) at TSaslServerTransport.cpp:173
> #9  0x010cd49d in 
> apache::thrift::server::TAcceptQueueServer::SetupConnection (this=0x174270c0, 
> entry=...) at TAcceptQueueServer.cpp:233
> #10 0x010cef4d in operator() (tid=, item=..., 
> __closure=) at TAcceptQueueServer.cpp:323
> #11 
> boost::detail::function::void_function_obj_invoker2  const boost::shared_ptr&)>, void, 
> int, const 
> boost::shared_ptr&>::invoke(boost::detail::function::function_buffer
>  &, int, const boost::shared_ptr 
> &) (function_obj_ptr=..., a0=, a1=...)
> at 
> ../../../toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:159
> #12 0x010d3e59 in operator() (a1=..., a0=1, this=0x7f3279ea9510) at 
> ../../../toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:770
> #13 
> impala::ThreadPool
>  >::WorkerThread (this=0x7f3279ea94c0, thread_id=1) at 
> ../util/thread-pool.h:166
> #14 0x0144f8f2 in operator() (this=0x7f3277ea5b40) at 
> ../../../toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:770
> #15 impala::Thread::SuperviseThread(std::__cxx11::basic_string std::char_traits, std::allocator > const&, 
> std::__cxx11::basic_string, std::allocator 
> > const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) (name=..., category=..., 
> functor=..., parent_thread_info=, 
> thread_started=0x7f3279ea9110) at thread.cc:360
> #16 0x01450d6b in operator() std::__cxx11::basic_string&, const std::__cxx11::basic_string&, 
> boost::function, const impala::ThreadDebugInfo*, impala::Promise int>*), boost::_bi::list0> (a=,
> f=@0x1417ccf8: 0x144f5f0 
>  std::char_traits, std::allocator > const&, 
> 

[jira] [Comment Edited] (IMPALA-11653) Identify and time out connections that are not from a supported Impala client more eagerly

2022-10-20 Thread Qifan Chen (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17621353#comment-17621353
 ] 

Qifan Chen edited comment on IMPALA-11653 at 10/20/22 8:17 PM:
---

Before the very first communication with the client, impalad needs to set up the 
connection by calling SetupConnection(). It is in this setup method that the 
above "super long wait" can happen. 


{code:java}
(gdb) bt
#0  0x7f3b972ee20d in poll () from ./lib64/libc.so.6
#1  0x02dcd5bc in apache::thrift::transport::TSocket::read(unsigned 
char*, unsigned int) ()
#2  0x02dd1803 in unsigned int 
apache::thrift::transport::readAll(apache::thrift::transport::TSocket&,
 unsigned char*, unsigned int) ()
#3  0x01330cc9 in readAll (len=5, buf=0x7f3277ea4f8b "", 
this=) at 
../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TTransport.h:121
#4  apache::thrift::transport::TSaslTransport::receiveSaslMessage 
(this=this@entry=0x278a96b0, status=status@entry=0x7f3277ea500c, 
length=length@entry=0x7f3277ea5008) at TSaslTransport.cpp:259
#5  0x0132db14 in 
apache::thrift::transport::TSaslServerTransport::handleSaslStartMessage 
(this=0x278a96b0) at TSaslServerTransport.cpp:95
#6  0x01330e33 in 
apache::thrift::transport::TSaslTransport::doSaslNegotiation (this=0x278a96b0) 
at TSaslTransport.cpp:81
#7  0x0132e723 in open (this=0x12e29750) at 
../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TBufferTransports.h:218
   void open() { transport_->open(); }

132   void TSaslTransport::open() {
133     // Only client should open the underlying transport.
134     if (isClient_ && !transport_->isOpen()) {
135       transport_->open();
136     }
137
138     // Start the SASL negotiation protocol.
139     doSaslNegotiation();
140   }

#8  apache::thrift::transport::TSaslServerTransport::Factory::getTransport 
(this=0xf825a70, trans=...) at TSaslServerTransport.cpp:173
#9  0x010cd49d in 
apache::thrift::server::TAcceptQueueServer::SetupConnection (this=0x174270c0, 
entry=...) at TAcceptQueueServer.cpp:233
#10 0x010cef4d in operator() 
{code}




{code:java}
130 std::shared_ptr<TTransport> TSaslServerTransport::Factory::getTransport(
131     std::shared_ptr<TTransport> trans) {
132   // Thrift servers use both an input and an output transport to communicate with
133   // clients. In principal, these can be different, but for SASL clients we require them
134   // to be the same so that the authentication state is identical for communication in
135   // both directions. In order to do this, we share the same TTransport object for both
136   // input and output set in TAcceptQueueServer::SetupConnection.
137   std::shared_ptr<TBufferedTransport> ret_transport;
138   std::shared_ptr<TTransport> wrapped(
139       new TSaslServerTransport(serverDefinitionMap_, trans));
140   // Set socket timeouts to prevent TSaslServerTransport->open from blocking the server
141   // from accepting new connections if a read/write blocks during the handshake
142   TSocket* socket = static_cast<TSocket*>(trans.get());
143   socket->setRecvTimeout(FLAGS_sasl_connect_tcp_timeout_ms);  <== 5min timeout for read() calls invoked indirectly at line 147: open()
144   socket->setSendTimeout(FLAGS_sasl_connect_tcp_timeout_ms);
145   ret_transport.reset(new TBufferedTransport(wrapped,
146       impala::ThriftServer::BufferedTransportFactory::DEFAULT_BUFFER_SIZE_BYTES));
147   ret_transport.get()->open();
148   // Reset socket timeout back to zero, so idle clients do not timeout
149   socket->setRecvTimeout(0);
150   socket->setSendTimeout(0);
151   return ret_transport;
152 }
{code}
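
The blocked read() is easy to reproduce with a client that opens a TCP 
connection and then stays silent. A hedged sketch (analogous in spirit to the 
attached simple_tcp_client.py, but hypothetical code, not the attachment):

{code:java}
// Hedged illustration: complete the 3-way handshake against an Impala client
// port, send nothing, and hold the connection open. The server-side setup
// thread then blocks in TSocket::read() until sasl_connect_tcp_timeout_ms
// (5 minutes by default) expires or the client closes the socket.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>

int main() {
  int fd = socket(AF_INET, SOCK_STREAM, 0);
  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_port = htons(21050);  // hs2 port; use 21000 for beeswax
  inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);
  if (connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0) {
    perror("connect");
    return 1;
  }
  printf("connected; sending nothing while the server blocks in read()\n");
  sleep(600);  // hold the connection well past the 5-minute timeout
  close(fd);
  return 0;
}
{code}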

[jira] [Commented] (IMPALA-11653) Identify and time out connections that are not from a supported Impala client more eagerly

2022-10-20 Thread Qifan Chen (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17621353#comment-17621353
 ] 

Qifan Chen commented on IMPALA-11653:
-

Before the very first communication with the client, impalad needs to set up the 
connection by calling SetupConnection(). It is in this setup method that the 
above "infinite wait" can happen. 


{code:java}
(gdb) bt
#0  0x7f3b972ee20d in poll () from ./lib64/libc.so.6
#1  0x02dcd5bc in apache::thrift::transport::TSocket::read(unsigned 
char*, unsigned int) ()
#2  0x02dd1803 in unsigned int 
apache::thrift::transport::readAll(apache::thrift::transport::TSocket&,
 unsigned char*, unsigned int) ()
#3  0x01330cc9 in readAll (len=5, buf=0x7f3277ea4f8b "", 
this=) at 
../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TTransport.h:121
#4  apache::thrift::transport::TSaslTransport::receiveSaslMessage 
(this=this@entry=0x278a96b0, status=status@entry=0x7f3277ea500c, 
length=length@entry=0x7f3277ea5008) at TSaslTransport.cpp:259
#5  0x0132db14 in 
apache::thrift::transport::TSaslServerTransport::handleSaslStartMessage 
(this=0x278a96b0) at TSaslServerTransport.cpp:95
#6  0x01330e33 in 
apache::thrift::transport::TSaslTransport::doSaslNegotiation (this=0x278a96b0) 
at TSaslTransport.cpp:81
#7  0x0132e723 in open (this=0x12e29750) at 
../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TBufferTransports.h:218
   void open() { transport_->open(); }

132   void TSaslTransport::open() {
133     // Only client should open the underlying transport.
134     if (isClient_ && !transport_->isOpen()) {
135       transport_->open();
136     }
137
138     // Start the SASL negotiation protocol.
139     doSaslNegotiation();
140   }

#8  apache::thrift::transport::TSaslServerTransport::Factory::getTransport 
(this=0xf825a70, trans=...) at TSaslServerTransport.cpp:173
#9  0x010cd49d in 
apache::thrift::server::TAcceptQueueServer::SetupConnection (this=0x174270c0, 
entry=...) at TAcceptQueueServer.cpp:233
#10 0x010cef4d in operator() 
{code}




{code:java}
130 std::shared_ptr<TTransport> TSaslServerTransport::Factory::getTransport(
131     std::shared_ptr<TTransport> trans) {
132   // Thrift servers use both an input and an output transport to communicate with
133   // clients. In principal, these can be different, but for SASL clients we require them
134   // to be the same so that the authentication state is identical for communication in
135   // both directions. In order to do this, we share the same TTransport object for both
136   // input and output set in TAcceptQueueServer::SetupConnection.
137   std::shared_ptr<TBufferedTransport> ret_transport;
138   std::shared_ptr<TTransport> wrapped(
139       new TSaslServerTransport(serverDefinitionMap_, trans));
140   // Set socket timeouts to prevent TSaslServerTransport->open from blocking the server
141   // from accepting new connections if a read/write blocks during the handshake
142   TSocket* socket = static_cast<TSocket*>(trans.get());
143   socket->setRecvTimeout(FLAGS_sasl_connect_tcp_timeout_ms);  <== 5min timeout for read() calls invoked indirectly at line 147: open()
144   socket->setSendTimeout(FLAGS_sasl_connect_tcp_timeout_ms);
145   ret_transport.reset(new TBufferedTransport(wrapped,
146       impala::ThriftServer::BufferedTransportFactory::DEFAULT_BUFFER_SIZE_BYTES));
147   ret_transport.get()->open();
148   // Reset socket timeout back to zero, so idle clients do not timeout
149   socket->setRecvTimeout(0);
150   socket->setSendTimeout(0);
151   return ret_transport;
152 }
{code}

[jira] [Commented] (IMPALA-11653) Identify and time out connections that are not from a supported Impala client more eagerly

2022-10-20 Thread Qifan Chen (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17621349#comment-17621349
 ] 

Qifan Chen commented on IMPALA-11653:
-

Here is the client-side code where the existence of Kerberos is picked up. 

{code:java}
self.close_connection()
self.imp_client = self._new_impala_client()
self._connect()
# If the connection fails and the Kerberos has not been enabled,
# check for a valid kerberos ticket and retry the connection
# with kerberos enabled.
if not self.imp_client.connected and not self.use_kerberos:
  try:
    if call(["klist", "-s"]) == 0:
      print("Kerberos ticket found in the credentials cache, retrying "
            "the connection with a secure transport.", file=sys.stderr)
      self.use_kerberos = True
      self.use_ldap = False
      self.ldap_password = None
      self.imp_client = self._new_impala_client()
      self._connect()
  except OSError:
    pass

shell/impala_shell.py 
{code}
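
The detection boils down to the exit status of klist -s, which is 0 exactly 
when a valid ticket exists in the credentials cache. A minimal standalone 
equivalent, for illustration only (hypothetical helper; C++ used for 
consistency with the other sketches here):

{code:java}
// Illustration: klist -s prints nothing and exits 0 iff a valid Kerberos
// ticket is present in the default credentials cache.
#include <cstdlib>
#include <sys/wait.h>

bool HasKerberosTicket() {
  int rc = std::system("klist -s");  // POSIX: waitpid-style status encoding
  return rc != -1 && WIFEXITED(rc) && WEXITSTATUS(rc) == 0;
}
{code}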


> Identify and time out connections that are not from a supported Impala client 
> more eagerly
> --
>
> Key: IMPALA-11653
> URL: https://issues.apache.org/jira/browse/IMPALA-11653
> Project: IMPALA
>  Issue Type: Improvement
>Affects Versions: Impala 4.1.0
>Reporter: Vincent Tran
>Assignee: Qifan Chen
>Priority: Major
> Attachments: simple_tcp_client.py
>
>
> When a tcp client opens a connection to an Impala client interface (hs2 or 
> beeswax), the connection is accepted immediately after the 3-way handshake 
> (SYN, SYN-ACK, ACK) and is queued for 
> *TAcceptQueueServer::SetupConnection()*.  However, if the client sends 
> nothing else, the ImpalaServer will block in 
> *apache::thrift::transport::TSocket::read()* until the client sends a RST/FIN 
> or until *sasl_connect_tcp_timeout_ms* elapses (which is, by default, 5 
> minutes).
> The connection setup thread stack trace can be observed below during this 
> period.
> {noformat}
> (gdb) bt
> #0  0x7f3b972ee20d in poll () from ./lib64/libc.so.6
> #1  0x02dcd5bc in apache::thrift::transport::TSocket::read(unsigned 
> char*, unsigned int) ()
> #2  0x02dd1803 in unsigned int 
> apache::thrift::transport::readAll(apache::thrift::transport::TSocket&,
>  unsigned char*, unsigned int) ()
> #3  0x01330cc9 in readAll (len=5, buf=0x7f3277ea4f8b "", 
> this=) at 
> ../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TTransport.h:121
> #4  apache::thrift::transport::TSaslTransport::receiveSaslMessage 
> (this=this@entry=0x278a96b0, status=status@entry=0x7f3277ea500c, 
> length=length@entry=0x7f3277ea5008) at TSaslTransport.cpp:259
> #5  0x0132db14 in 
> apache::thrift::transport::TSaslServerTransport::handleSaslStartMessage 
> (this=0x278a96b0) at TSaslServerTransport.cpp:95
> #6  0x01330e33 in 
> apache::thrift::transport::TSaslTransport::doSaslNegotiation 
> (this=0x278a96b0) at TSaslTransport.cpp:81
> #7  0x0132e723 in open (this=0x12e29750) at 
> ../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TBufferTransports.h:218
> #8  apache::thrift::transport::TSaslServerTransport::Factory::getTransport 
> (this=0xf825a70, trans=...) at TSaslServerTransport.cpp:173
> #9  0x010cd49d in 
> apache::thrift::server::TAcceptQueueServer::SetupConnection (this=0x174270c0, 
> entry=...) at TAcceptQueueServer.cpp:233
> #10 0x010cef4d in operator() (tid=, item=..., 
> __closure=) at TAcceptQueueServer.cpp:323
> #11 
> boost::detail::function::void_function_obj_invoker2  const boost::shared_ptr&)>, void, 
> int, const 
> 

[jira] [Commented] (IMPALA-11653) Identify and time out connections that are not from a supported Impala client more eagerly

2022-10-20 Thread Qifan Chen (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17621347#comment-17621347
 ] 

Qifan Chen commented on IMPALA-11653:
-

It seems the bug in question takes place in the following method. The code at 
line 143 sets a timeout of 5 minutes for the open() call at line 147.

If the timeout were instead set to, say, 1 minute, then a legitimate client with 
slow SASL handshaking would be kicked out prematurely.


{code:java}
130 std::shared_ptr<TTransport> TSaslServerTransport::Factory::getTransport(
131     std::shared_ptr<TTransport> trans) {
132   // Thrift servers use both an input and an output transport to communicate with
133   // clients. In principal, these can be different, but for SASL clients we require them
134   // to be the same so that the authentication state is identical for communication in
135   // both directions. In order to do this, we share the same TTransport object for both
136   // input and output set in TAcceptQueueServer::SetupConnection.
137   std::shared_ptr<TBufferedTransport> ret_transport;
138   std::shared_ptr<TTransport> wrapped(
139       new TSaslServerTransport(serverDefinitionMap_, trans));
140   // Set socket timeouts to prevent TSaslServerTransport->open from blocking the server
141   // from accepting new connections if a read/write blocks during the handshake
142   TSocket* socket = static_cast<TSocket*>(trans.get());
143   socket->setRecvTimeout(FLAGS_sasl_connect_tcp_timeout_ms);
144   socket->setSendTimeout(FLAGS_sasl_connect_tcp_timeout_ms);
145   ret_transport.reset(new TBufferedTransport(wrapped,
146       impala::ThriftServer::BufferedTransportFactory::DEFAULT_BUFFER_SIZE_BYTES));
147   ret_transport.get()->open();
148   // Reset socket timeout back to zero, so idle clients do not timeout
149   socket->setRecvTimeout(0);
150   socket->setSendTimeout(0);
151   return ret_transport;
152 }
{code}
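
For reference, the 5-minute default comes from the sasl_connect_tcp_timeout_ms 
startup flag. Presumably it is defined through gflags along the lines below (a 
sketch of the assumed shape, not the exact Impala source):

{code:java}
// Assumed shape of the flag definition; gflags exposes it as the impalad
// startup option --sasl_connect_tcp_timeout_ms.
#include <gflags/gflags.h>

DEFINE_int32(sasl_connect_tcp_timeout_ms, 5 * 60 * 1000,
    "Socket send/recv timeout, in milliseconds, applied while a connection "
    "performs its SASL negotiation during setup.");
{code}

So starting impalad with --sasl_connect_tcp_timeout_ms=60000 would shrink the 
window to 1 minute, at the premature-kick risk described above.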


> Identify and time out connections that are not from a supported Impala client 
> more eagerly
> --
>
> Key: IMPALA-11653
> URL: https://issues.apache.org/jira/browse/IMPALA-11653
> Project: IMPALA
>  Issue Type: Improvement
>Affects Versions: Impala 4.1.0
>Reporter: Vincent Tran
>Assignee: Qifan Chen
>Priority: Major
> Attachments: simple_tcp_client.py
>
>
> When a tcp client opens a connection to an Impala client interface (hs2 or 
> beeswax), the connection is accepted immediately after the 3-way handshake 
> (SYN, SYN-ACK, ACK) and is queued for 
> *TAcceptQueueServer::SetupConnection()*.  However, if the client sends 
> nothing else, the ImpalaServer will block in 
> *apache::thrift::transport::TSocket::read()* until the client sends a RST/FIN 
> or until *sasl_connect_tcp_timeout_ms* elapses (which is, by default, 5 
> minutes).
> The connection setup thread stack trace can be observed below during this 
> period.
> {noformat}
> (gdb) bt
> #0  0x7f3b972ee20d in poll () from ./lib64/libc.so.6
> #1  0x02dcd5bc in apache::thrift::transport::TSocket::read(unsigned 
> char*, unsigned int) ()
> #2  0x02dd1803 in unsigned int 
> apache::thrift::transport::readAll(apache::thrift::transport::TSocket&,
>  unsigned char*, unsigned int) ()
> #3  0x01330cc9 in readAll (len=5, buf=0x7f3277ea4f8b "", 
> this=) at 
> ../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TTransport.h:121
> #4  apache::thrift::transport::TSaslTransport::receiveSaslMessage 
> (this=this@entry=0x278a96b0, status=status@entry=0x7f3277ea500c, 
> length=length@entry=0x7f3277ea5008) at TSaslTransport.cpp:259
> #5  0x0132db14 in 
> apache::thrift::transport::TSaslServerTransport::handleSaslStartMessage 
> (this=0x278a96b0) at TSaslServerTransport.cpp:95
> #6  0x01330e33 in 
> apache::thrift::transport::TSaslTransport::doSaslNegotiation 
> (this=0x278a96b0) at TSaslTransport.cpp:81
> #7  0x0132e723 in open (this=0x12e29750) at 
> ../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TBufferTransports.h:218
> #8  

[jira] [Comment Edited] (IMPALA-11665) Min/Max filter could crash in fast code path for string data type

2022-10-19 Thread Qifan Chen (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17620584#comment-17620584
 ] 

Qifan Chen edited comment on IMPALA-11665 at 10/19/22 8:39 PM:
---

Set up a table with nulls and empty strings in the STRING column null_str. When 
loading the table, configured it to produce 1 page and then 3 pages per column 
chunk (with the 9 rows inserted, PARQUET_PAGE_ROW_COUNT_LIMIT=12 yields a single 
page per column, while a limit of 4 yields three pages of 4, 4, and 1 values).

Ran the query in the DML section below and observed the following when the fast 
code path was taken:
1. Nulls are not part of the page min/max stats and min/max filter stats at 
all, which is good;
2. The runtime filtering works as designed. 

DDL


{code:java}
create table null_pq (
id string, 
null_str string,
null_int int
) 
sort by (null_str) 
stored as parquet
;
{code}


data loading:


{code:java}
set PARQUET_PAGE_ROW_COUNT_LIMIT=12;
insert into null_pq values
('a', null, 1),
('b', null, 2),
('c',null,3),
('aa', 'a', 1),
('ab', 'b', 2),
('ac','c',3),
('ad', '', 4),
('ae', '', 5),
('ac','',6);


{code}

1 page case (set PARQUET_PAGE_ROW_COUNT_LIMIT=12)



{code:java}
[14:11:06 qchen@qifan-10229: src] pqtools dump 
hdfs://localhost:20500/test-warehouse/null_pq/9341bc3df646c530-9701c2fc_162963959_data.0.parq
22/10/17 14:23:15 INFO compress.CodecPool: Got brand-new decompressor [.snappy]
row group 0 

id:BINARY SNAPPY DO:4 FPO:56 SZ:85/89/1.05 VC:9 ENC:RLE,PLAIN_DICTIONARY
null_str:  BINARY SNAPPY DO:146 FPO:180 SZ:64/60/0.94 VC:9 ENC:RLE,PLA [more]...
null_int:  INT32 SNAPPY DO:273 FPO:312 SZ:72/68/0.94 VC:9 ENC:RLE,PLAI [more]...

id TV=9 RL=0 DL=1 DS:   8 DE:PLAIN_DICTIONARY

page 0:  DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY  
[more]... VC:9

null_str TV=9 RL=0 DL=1 DS: 4 DE:PLAIN_DICTIONARY

page 0:  DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY  
[more]... VC:9

null_int TV=9 RL=0 DL=1 DS: 6 DE:PLAIN_DICTIONARY

page 0:  DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY  
[more]... VC:9

BINARY id 

*** row group 1 of 1, values 1 to 9 *** 
value 1: R:0 D:1 V:ad
value 2: R:0 D:1 V:ae
value 3: R:0 D:1 V:ac
value 4: R:0 D:1 V:aa
value 5: R:0 D:1 V:ab
value 6: R:0 D:1 V:ac
value 7: R:0 D:1 V:a
value 8: R:0 D:1 V:b
value 9: R:0 D:1 V:c

BINARY null_str 

*** row group 1 of 1, values 1 to 9 *** 
value 1: R:0 D:1 V:
value 2: R:0 D:1 V:
value 3: R:0 D:1 V:
value 4: R:0 D:1 V:a
value 5: R:0 D:1 V:b
value 6: R:0 D:1 V:c
value 7: R:0 D:0 V:
value 8: R:0 D:0 V:
value 9: R:0 D:0 V:

INT32 null_int 

*** row group 1 of 1, values 1 to 9 *** 
value 1: R:0 D:1 V:4
value 2: R:0 D:1 V:5
value 3: R:0 D:1 V:6
value 4: R:0 D:1 V:1
value 5: R:0 D:1 V:2
value 6: R:0 D:1 V:3
value 7: R:0 D:1 V:1
value 8: R:0 D:1 V:2
value 9: R:0 D:1 V:3
[14:23:16 qchen@qifan-10229: src] 
{code}




3 page case (set PARQUET_PAGE_ROW_COUNT_LIMIT=4)


{code:java}
pqtools dump 
hdfs://localhost:20500/test-warehouse/null_pq/aa449f944bb9d005-7df200e3_811956887_data.0.parq

[13:50:22 qchen@qifan-10229: cluster] pqtools dump 
hdfs://localhost:20500/test-warehouse/null_pq/aa449f944bb9d005-7df200e3_811956887_data.0.parq
22/10/17 13:51:02 INFO compress.CodecPool: Got brand-new decompressor [.snappy]
row group 0 

id:BINARY SNAPPY DO:4 FPO:56 SZ:139/139/1.00 VC:9 ENC:RLE,PLAI [more]...
null_str:  BINARY SNAPPY DO:200 FPO:234 SZ:116/108/0.93 VC:9 ENC:RLE,P [more]...
null_int:  INT32 SNAPPY DO:388 FPO:427 SZ:126/118/0.94 VC:9 ENC:RLE,PL [more]...

id TV=9 RL=0 DL=1 DS:   8 DE:PLAIN_DICTIONARY

page 0:  DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY  
[more]... VC:4
page 1:  DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY  
[more]... VC:4
page 2:  DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY  
[more]... VC:1

null_str TV=9 RL=0 DL=1 DS: 4 DE:PLAIN_DICTIONARY

page 0:  DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY  
[more]... VC:4
page 1:  DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY  
[more]... VC:4
page 2:  DLE:RLE RLE:RLE VLE:PLAIN ST:[no stat 
[more]... VC:1

null_int TV=9 RL=0 DL=1 DS: 6 DE:PLAIN_DICTIONARY

[jira] [Commented] (IMPALA-11665) Min/Max filter could crash in fast code path for string data type

2022-10-19 Thread Qifan Chen (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17620587#comment-17620587
 ] 

Qifan Chen commented on IMPALA-11665:
-

It may be helpful to obtain the parquet data file(s) involved in the crash and 
to try the offending query afterwards. 
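
As background for why a plain memcmp can fault in this code path, a hedged 
sketch (hypothetical code, not the scanner's actual implementation) of comparing 
a min/max filter bound against a page's string stats without first checking 
that the page carries stats at all:

{code:java}
// Hypothetical illustration: a page without stats (cf. "ST:[no stat" in the
// page dumps elsewhere on this ticket) can leave a null or garbage pointer
// behind, and an unconditional memcmp on it segfaults -- matching the
// __memcmp_sse4_1 crash frame below.
#include <algorithm>
#include <cstring>

struct StringValue {
  const char* ptr;  // may be null/uninitialized when the page has no stats
  int len;
};

// Unsafe: dereferences page_max.ptr unconditionally.
bool PageMayOverlapFilter(const StringValue& filter_min,
                          const StringValue& page_max) {
  int n = std::min(filter_min.len, page_max.len);
  return std::memcmp(filter_min.ptr, page_max.ptr, n) <= 0;  // crash if ptr is bad
}
{code}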

> Min/Max filter could crash in fast code path for string data type
> -
>
> Key: IMPALA-11665
> URL: https://issues.apache.org/jira/browse/IMPALA-11665
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Abhishek Rawat
>Assignee: Qifan Chen
>Priority: Critical
>
> The impalad logs show that memcmp failed due to a segfault:
> {code:java}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x7f0396c3ff22, pid=1, tid=0x7f023f365700
> #
> # JRE version: OpenJDK Runtime Environment (8.0_332-b09) (build 1.8.0_332-b09)
> # Java VM: OpenJDK 64-Bit Server VM (25.332-b09 mixed mode linux-amd64 
> compressed oops)
> # Problematic frame:
> # C  [libc.so.6+0x16af22]  __memcmp_sse4_1+0xd42 {code}
> Resolved Stack Trace for the crashed thread:
> {code:java}
> Thread 530 (crashed)
>  0  libc-2.17.so + 0x16af22
>     rax = 0x7f61567715f0   rdx = 0x000a
>     rcx = 0x7f62ae04cf22   rbx = 0x
>     rsi = 0x5d1e900a   rdi = 0x000a
>     rbp = 0x7f6156771560   rsp = 0x7f6156771548
>      r8 = 0x034d40f0    r9 = 0x7f62ae022e90
>     r10 = 0x0498ff6c   r11 = 0x7f62ae06f590
>     r12 = 0x000a   r13 = 0x1a9678e8
>     r14 = 0x7f6156771730   r15 = 0x01b1f380
>     rip = 0x7f62ae04cf22
>     Found by: given as instruction pointer in context
>  1  
> impalad!impala::HdfsParquetScanner::CollectSkippedPageRangesForSortedColumn(impala::MinMaxFilter
>  const*, impala::ColumnType const&, 
> std::vector, 
> std::allocator >, std::allocator std::char_traits, std::allocator > > > const&, 
> std::vector, 
> std::allocator >, std::allocator std::char_traits, std::allocator > > > const&, int, int, 
> std::vector >*) 
> [hdfs-parquet-scanner.cc : 1388 + 0x3]
>     rbp = 0x7f6156771650   rsp = 0x7f6156771570
>     rip = 0x01b10305
>     Found by: previous frame's frame pointer
>  2  impalad!impala::HdfsParquetScanner::SkipPagesBatch(parquet::RowGroup&, 
> impala::ColumnStatsReader const&, parquet::ColumnIndex const&, int, int, 
> impala::ColumnType const&, int, parquet::ColumnChunk const&, 
> impala::MinMaxFilter const*, std::vector std::allocator >*, int*) [hdfs-parquet-scanner.cc : 1230 + 
> 0x34]
>     rbx = 0x7f61567716f0   rbp = 0x7f61567717e0
>     rsp = 0x7f6156771660   r12 = 0x7f6156771710
>     r13 = 0x7f6156771950   r14 = 0x1a9678e8
>     r15 = 0x7f6156771920   rip = 0x01b14838
>     Found by: call frame info
>  3  
> impalad!impala::HdfsParquetScanner::FindSkipRangesForPagesWithMinMaxFilters(std::vector  std::allocator >*) [hdfs-parquet-scanner.cc : 1528 + 0x57]
>     rbx = 0x004a   rbp = 0x7f6156771b10
>     rsp = 0x7f61567717f0   r12 = 0x2c195800
>     r13 = 0x2aa115d0   r14 = 0x0001
>     r15 = 0x0049   rip = 0x01b1cf1a
>     Found by: call frame info
>  4  impalad!impala::HdfsParquetScanner::EvaluatePageIndex() 
> [hdfs-parquet-scanner.cc : 1600 + 0x19]
>     rbx = 0x7f6156771c30   rbp = 0x7f6156771cf0
>     rsp = 0x7f6156771b20   r12 = 0x2c195800
>     r13 = 0x7f6156771de8   r14 = 0x104528a0
>     r15 = 0x7f6156771df0   rip = 0x01b1d9dd
>     Found by: call frame info
>  5  impalad!impala::HdfsParquetScanner::ProcessPageIndex() 
> [hdfs-parquet-scanner.cc : 1318 + 0xb]
>     rbx = 0x2c195800   rbp = 0x7f6156771d70
>     rsp = 0x7f6156771d00   r12 = 0x7f6156771d10
>     r13 = 0x7f6156771de8   r14 = 0x104528a0
>     r15 = 0x7f6156771df0   rip = 0x01b1dd0b
>     Found by: call frame info
>  6  impalad!impala::HdfsParquetScanner::NextRowGroup() 
> [hdfs-parquet-scanner.cc : 934 + 0xf]
>     rbx = 0x318ce040   rbp = 0x7f6156771e40
>     rsp = 0x7f6156771d80   r12 = 0x2c195800
>     r13 = 0x7f6156771de8   r14 = 0x104528a0
>     r15 = 0x7f6156771df0   rip = 0x01b1e1b4
>     Found by: call frame info
>  7  impalad!impala::HdfsParquetScanner::GetNextInternal(impala::RowBatch*) 
> [hdfs-parquet-scanner.cc : 504 + 0xb]
>     rbx = 0x2c195800   rbp = 0x7f6156771ec0
>     rsp = 0x7f6156771e50   r12 = 0xc1ca4d00
>     r13 = 0x7f6156771e78   r14 = 0x7f6156771e80
>     r15 = 0xaaab   rip = 0x01b1ed5b
>     Found by: call frame info
>  8  

[jira] [Commented] (IMPALA-11665) Min/Max filter could crash in fast code path for string data type

2022-10-19 Thread Qifan Chen (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17620584#comment-17620584
 ] 

Qifan Chen commented on IMPALA-11665:
-

Set up a table with nulls and empty strings in the STRING columns. When loading, 
configured the table to produce 1 page and then 3 pages per column chunk. 

Ran the query in the DML section below and observed the following when the fast 
code path is taken:
1. Nulls are not part of the page min/max stats and min/max filter stats at 
all, which is good;
2. The runtime filtering works as designed. 

DDL


{code:java}
create table null_pq (
id string, 
null_str string,
null_int int
) 
sort by (null_str) 
stored as parquet
;
{code}


data loading:


{code:java}
set PARQUET_PAGE_ROW_COUNT_LIMIT=12;
insert into null_pq values
('a', null, 1),
('b', null, 2),
('c',null,3),
('aa', 'a', 1),
('ab', 'b', 2),
('ac','c',3),
('ad', '', 4),
('ae', '', 5),
('ac','',6);


{code}

1 page case (set PARQUET_PAGE_ROW_COUNT_LIMIT=12)



{code:java}
[14:11:06 qchen@qifan-10229: src] pqtools dump 
hdfs://localhost:20500/test-warehouse/null_pq/9341bc3df646c530-9701c2fc_162963959_data.0.parq
22/10/17 14:23:15 INFO compress.CodecPool: Got brand-new decompressor [.snappy]
row group 0 

id:BINARY SNAPPY DO:4 FPO:56 SZ:85/89/1.05 VC:9 ENC:RLE,PLAIN_DICTIONARY
null_str:  BINARY SNAPPY DO:146 FPO:180 SZ:64/60/0.94 VC:9 ENC:RLE,PLA [more]...
null_int:  INT32 SNAPPY DO:273 FPO:312 SZ:72/68/0.94 VC:9 ENC:RLE,PLAI [more]...

id TV=9 RL=0 DL=1 DS:   8 DE:PLAIN_DICTIONARY

page 0:  DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY  
[more]... VC:9

null_str TV=9 RL=0 DL=1 DS: 4 DE:PLAIN_DICTIONARY

page 0:  DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY  
[more]... VC:9

null_int TV=9 RL=0 DL=1 DS: 6 DE:PLAIN_DICTIONARY

page 0:  DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY  
[more]... VC:9

BINARY id 

*** row group 1 of 1, values 1 to 9 *** 
value 1: R:0 D:1 V:ad
value 2: R:0 D:1 V:ae
value 3: R:0 D:1 V:ac
value 4: R:0 D:1 V:aa
value 5: R:0 D:1 V:ab
value 6: R:0 D:1 V:ac
value 7: R:0 D:1 V:a
value 8: R:0 D:1 V:b
value 9: R:0 D:1 V:c

BINARY null_str 

*** row group 1 of 1, values 1 to 9 *** 
value 1: R:0 D:1 V:
value 2: R:0 D:1 V:
value 3: R:0 D:1 V:
value 4: R:0 D:1 V:a
value 5: R:0 D:1 V:b
value 6: R:0 D:1 V:c
value 7: R:0 D:0 V:
value 8: R:0 D:0 V:
value 9: R:0 D:0 V:

INT32 null_int 

*** row group 1 of 1, values 1 to 9 *** 
value 1: R:0 D:1 V:4
value 2: R:0 D:1 V:5
value 3: R:0 D:1 V:6
value 4: R:0 D:1 V:1
value 5: R:0 D:1 V:2
value 6: R:0 D:1 V:3
value 7: R:0 D:1 V:1
value 8: R:0 D:1 V:2
value 9: R:0 D:1 V:3
[14:23:16 qchen@qifan-10229: src] 
{code}




3 pages case (set PARQUET_PAGE_ROW_COUNT_LIMIT=4)


{code:java}
pqtools dump 
hdfs://localhost:20500/test-warehouse/null_pq/aa449f944bb9d005-7df200e3_811956887_data.0.parq

[13:50:22 qchen@qifan-10229: cluster] pqtools dump 
hdfs://localhost:20500/test-warehouse/null_pq/aa449f944bb9d005-7df200e3_811956887_data.0.parq
22/10/17 13:51:02 INFO compress.CodecPool: Got brand-new decompressor [.snappy]
row group 0 

id:BINARY SNAPPY DO:4 FPO:56 SZ:139/139/1.00 VC:9 ENC:RLE,PLAI [more]...
null_str:  BINARY SNAPPY DO:200 FPO:234 SZ:116/108/0.93 VC:9 ENC:RLE,P [more]...
null_int:  INT32 SNAPPY DO:388 FPO:427 SZ:126/118/0.94 VC:9 ENC:RLE,PL [more]...

id TV=9 RL=0 DL=1 DS:   8 DE:PLAIN_DICTIONARY

page 0:  DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY  
[more]... VC:4
page 1:  DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY  
[more]... VC:4
page 2:  DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY  
[more]... VC:1

null_str TV=9 RL=0 DL=1 DS: 4 DE:PLAIN_DICTIONARY

page 0:  DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY  
[more]... VC:4
page 1:  DLE:RLE RLE:RLE VLE:PLAIN_DICTIONARY  
[more]... VC:4
page 2:  DLE:RLE RLE:RLE VLE:PLAIN ST:[no stat 
[more]... VC:1

null_int TV=9 RL=0 DL=1 DS: 6 DE:PLAIN_DICTIONARY

page 

[jira] [Closed] (IMPALA-10758) S3PlannerTest.testNestedCollections fails because of mismatch plan

2022-10-13 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen closed IMPALA-10758.
---
Resolution: Not A Bug

Verified that the plan difference does not show up in recent core s3 tests.  
The test passes. 

> S3PlannerTest.testNestedCollections fails because of mismatch plan
> --
>
> Key: IMPALA-10758
> URL: https://issues.apache.org/jira/browse/IMPALA-10758
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Yongzhi Chen
>Assignee: Qifan Chen
>Priority: Critical
>
> S3PlannerTest.testNestedCollections fails in impala-asf-master-core-s3 with 
> the following error:
> {noformat}
> Error Message
> Section PLAN of query:
> select 1
> from tpch_nested_parquet.region.r_nations t1
> inner join tpch_nested_parquet.customer t2 on t2.c_nationkey = t1.pos
> inner join tpch_nested_parquet.region t3 on t3.r_comment = t2.c_address
> left join t2.c_orders t4
> inner join tpch_nested_parquet.region t5 on t5.r_regionkey = t2.c_custkey
> left join t4.item.o_lineitems t6 on t6.item.l_returnflag = 
> t4.item.o_orderpriority
> Actual does not match expected result:
> PLAN-ROOT SINK
> |
> 14:SUBPLAN
> |  row-size=183B cardinality=1
> |
> |--12:SUBPLAN
> |  |  row-size=183B cardinality=1
> |  |
> |  |--10:NESTED LOOP JOIN [RIGHT OUTER JOIN]
> |  |  |  join predicates: t6.item.l_returnflag = t4.item.o_orderpriority
> |  |  |  row-size=183B cardinality=10
> |  |  |
> |  |  |--08:SINGULAR ROW SRC
> |  |  | row-size=171B cardinality=1
> |  |  |
> |  |  09:UNNEST [t4.item.o_lineitems t6]
> |  | row-size=0B cardinality=10
> |  |
> |  11:NESTED LOOP JOIN [RIGHT OUTER JOIN]
> |  |  row-size=171B cardinality=1
> |  |
> |  |--06:SINGULAR ROW SRC
> |  | row-size=147B cardinality=1
> |  |
> |  07:UNNEST [t2.c_orders t4]
> | row-size=0B cardinality=10
> |
> 13:HASH JOIN [INNER JOIN]
> |  hash predicates: t1.pos = t2.c_nationkey
> |  runtime filters: RF000 <- t2.c_nationkey, RF001 <- t2.c_nationkey
> 
> |  row-size=147B cardinality=1
> |
> |--05:HASH JOIN [INNER JOIN]
> |  |  hash predicates: t3.r_comment = t2.c_address
> |  |  runtime filters: RF002 <- t2.c_address
> |  |  row-size=139B cardinality=1
> |  |
> |  |--04:HASH JOIN [INNER JOIN]
> |  |  |  hash predicates: t2.c_custkey = t5.r_regionkey
> |  |  |  runtime filters: RF004 <- t5.r_regionkey
> |  |  |  row-size=61B cardinality=5
> |  |  |
> |  |  |--03:SCAN S3 [tpch_nested_parquet.region t5]
> |  |  | S3 partitions=1/1 files=1 size=3.59KB
> |  |  | row-size=2B cardinality=5
> |  |  |
> |  |  01:SCAN S3 [tpch_nested_parquet.customer t2]
> |  | S3 partitions=1/1 files=4 size=289.06MB
> |  | runtime filters: RF004 -> t2.c_custkey
> |  | row-size=59B cardinality=150.00K
> |  |
> |  02:SCAN S3 [tpch_nested_parquet.region t3]
> | S3 partitions=1/1 files=1 size=3.59KB
> | runtime filters: RF002 -> t3.r_comment
> | row-size=78B cardinality=5
> |
> 00:SCAN S3 [tpch_nested_parquet.region.r_nations t1]
>S3 partitions=1/1 files=1 size=3.59KB
>runtime filters: RF001 -> t1.pos, RF000 -> t1.pos
>row-size=8B cardinality=50
> Expected:
> PLAN-ROOT SINK
> |
> 14:SUBPLAN
> |  row-size=183B cardinality=1
> |
> |--12:SUBPLAN
> |  |  row-size=183B cardinality=1
> |  |
> |  |--10:NESTED LOOP JOIN [RIGHT OUTER JOIN]
> |  |  |  join predicates: t6.item.l_returnflag = t4.item.o_orderpriority
> |  |  |  row-size=183B cardinality=10
> |  |  |
> |  |  |--08:SINGULAR ROW SRC
> |  |  | row-size=171B cardinality=1
> |  |  |
> |  |  09:UNNEST [t4.item.o_lineitems t6]
> |  | row-size=0B cardinality=10
> |  |
> |  11:NESTED LOOP JOIN [RIGHT OUTER JOIN]
> |  |  row-size=171B cardinality=1
> |  |
> |  |--06:SINGULAR ROW SRC
> |  | row-size=147B cardinality=1
> |  |
> |  07:UNNEST [t2.c_orders t4]
> | row-size=0B cardinality=10
> |
> 13:HASH JOIN [INNER JOIN]
> |  hash predicates: t1.pos = t2.c_nationkey
> |  runtime filters: RF000 <- t2.c_nationkey
> |  row-size=147B cardinality=1
> |
> |--05:HASH JOIN [INNER JOIN]
> |  |  hash predicates: t3.r_comment = t2.c_address
> |  |  runtime filters: RF002 <- t2.c_address
> |  |  row-size=139B cardinality=1
> |  |
> |  |--04:HASH JOIN [INNER JOIN]
> |  |  |  hash predicates: t2.c_custkey = t5.r_regionkey
> |  |  |  runtime filters: RF004 <- t5.r_regionkey
> |  |  |  row-size=61B cardinality=5
> |  |  |
> |  |  |--03:SCAN HDFS [tpch_nested_parquet.region t5]
> |  |  | HDFS partitions=1/1 files=1 size=3.59KB
> |  |  | row-size=2B cardinality=5
> |  |  |
> |  |  01:SCAN HDFS [tpch_nested_parquet.customer t2]
> |  | HDFS partitions=1/1 files=4 size=289.02MB
> |  | runtime filters: RF004 -> t2.c_custkey
> |  | row-size=59B 


[jira] [Commented] (IMPALA-10758) S3PlannerTest.testNestedCollections fails because of mismatch plan

2022-10-13 Thread Qifan Chen (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617191#comment-17617191
 ] 

Qifan Chen commented on IMPALA-10758:
-

Verified that the bug does not exist in recent core s3 tests.

https://master-03.jenkins.cloudera.com/job/impala-asf-master-core-s3/226/testReport/org.apache.impala.planner/S3PlannerTest/testNestedCollections/

https://master-03.jenkins.cloudera.com/job/impala-cdwh-2022.0.10.1-core-s3/4/testReport/org.apache.impala.planner/S3PlannerTest/

> S3PlannerTest.testNestedCollections fails because of mismatch plan
> --
>
> Key: IMPALA-10758
> URL: https://issues.apache.org/jira/browse/IMPALA-10758
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Yongzhi Chen
>Assignee: Qifan Chen
>Priority: Critical
>
> S3PlannerTest.testNestedCollections fails in impala-asf-master-core-s3 with 
> the following error:
> {noformat}
> Error Message
> Section PLAN of query:
> select 1
> from tpch_nested_parquet.region.r_nations t1
> inner join tpch_nested_parquet.customer t2 on t2.c_nationkey = t1.pos
> inner join tpch_nested_parquet.region t3 on t3.r_comment = t2.c_address
> left join t2.c_orders t4
> inner join tpch_nested_parquet.region t5 on t5.r_regionkey = t2.c_custkey
> left join t4.item.o_lineitems t6 on t6.item.l_returnflag = 
> t4.item.o_orderpriority
> Actual does not match expected result:
> PLAN-ROOT SINK
> |
> 14:SUBPLAN
> |  row-size=183B cardinality=1
> |
> |--12:SUBPLAN
> |  |  row-size=183B cardinality=1
> |  |
> |  |--10:NESTED LOOP JOIN [RIGHT OUTER JOIN]
> |  |  |  join predicates: t6.item.l_returnflag = t4.item.o_orderpriority
> |  |  |  row-size=183B cardinality=10
> |  |  |
> |  |  |--08:SINGULAR ROW SRC
> |  |  | row-size=171B cardinality=1
> |  |  |
> |  |  09:UNNEST [t4.item.o_lineitems t6]
> |  | row-size=0B cardinality=10
> |  |
> |  11:NESTED LOOP JOIN [RIGHT OUTER JOIN]
> |  |  row-size=171B cardinality=1
> |  |
> |  |--06:SINGULAR ROW SRC
> |  | row-size=147B cardinality=1
> |  |
> |  07:UNNEST [t2.c_orders t4]
> | row-size=0B cardinality=10
> |
> 13:HASH JOIN [INNER JOIN]
> |  hash predicates: t1.pos = t2.c_nationkey
> |  runtime filters: RF000 <- t2.c_nationkey, RF001 <- t2.c_nationkey
> 
> |  row-size=147B cardinality=1
> |
> |--05:HASH JOIN [INNER JOIN]
> |  |  hash predicates: t3.r_comment = t2.c_address
> |  |  runtime filters: RF002 <- t2.c_address
> |  |  row-size=139B cardinality=1
> |  |
> |  |--04:HASH JOIN [INNER JOIN]
> |  |  |  hash predicates: t2.c_custkey = t5.r_regionkey
> |  |  |  runtime filters: RF004 <- t5.r_regionkey
> |  |  |  row-size=61B cardinality=5
> |  |  |
> |  |  |--03:SCAN S3 [tpch_nested_parquet.region t5]
> |  |  | S3 partitions=1/1 files=1 size=3.59KB
> |  |  | row-size=2B cardinality=5
> |  |  |
> |  |  01:SCAN S3 [tpch_nested_parquet.customer t2]
> |  | S3 partitions=1/1 files=4 size=289.06MB
> |  | runtime filters: RF004 -> t2.c_custkey
> |  | row-size=59B cardinality=150.00K
> |  |
> |  02:SCAN S3 [tpch_nested_parquet.region t3]
> | S3 partitions=1/1 files=1 size=3.59KB
> | runtime filters: RF002 -> t3.r_comment
> | row-size=78B cardinality=5
> |
> 00:SCAN S3 [tpch_nested_parquet.region.r_nations t1]
>S3 partitions=1/1 files=1 size=3.59KB
>runtime filters: RF001 -> t1.pos, RF000 -> t1.pos
>row-size=8B cardinality=50
> Expected:
> PLAN-ROOT SINK
> |
> 14:SUBPLAN
> |  row-size=183B cardinality=1
> |
> |--12:SUBPLAN
> |  |  row-size=183B cardinality=1
> |  |
> |  |--10:NESTED LOOP JOIN [RIGHT OUTER JOIN]
> |  |  |  join predicates: t6.item.l_returnflag = t4.item.o_orderpriority
> |  |  |  row-size=183B cardinality=10
> |  |  |
> |  |  |--08:SINGULAR ROW SRC
> |  |  | row-size=171B cardinality=1
> |  |  |
> |  |  09:UNNEST [t4.item.o_lineitems t6]
> |  | row-size=0B cardinality=10
> |  |
> |  11:NESTED LOOP JOIN [RIGHT OUTER JOIN]
> |  |  row-size=171B cardinality=1
> |  |
> |  |--06:SINGULAR ROW SRC
> |  | row-size=147B cardinality=1
> |  |
> |  07:UNNEST [t2.c_orders t4]
> | row-size=0B cardinality=10
> |
> 13:HASH JOIN [INNER JOIN]
> |  hash predicates: t1.pos = t2.c_nationkey
> |  runtime filters: RF000 <- t2.c_nationkey
> |  row-size=147B cardinality=1
> |
> |--05:HASH JOIN [INNER JOIN]
> |  |  hash predicates: t3.r_comment = t2.c_address
> |  |  runtime filters: RF002 <- t2.c_address
> |  |  row-size=139B cardinality=1
> |  |
> |  |--04:HASH JOIN [INNER JOIN]
> |  |  |  hash predicates: t2.c_custkey = t5.r_regionkey
> |  |  |  runtime filters: RF004 <- t5.r_regionkey
> |  |  |  row-size=61B cardinality=5
> |  |  |
> |  |  |--03:SCAN HDFS [tpch_nested_parquet.region t5]
> |  |  | HDFS partitions=1/1 files=1 

[jira] [Closed] (IMPALA-10292) Improvement to test_misaligned_parquet_row_groups section in query_test/test_scanners.py

2022-10-12 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen closed IMPALA-10292.
---
Resolution: Won't Fix

The "Won't Fix" is based on Joe's comment. 

> Improvement to test_misaligned_parquet_row_groups section in 
> query_test/test_scanners.py
> 
>
> Key: IMPALA-10292
> URL: https://issues.apache.org/jira/browse/IMPALA-10292
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Qifan Chen
>Assignee: Qifan Chen
>Priority: Major
>  Labels: broken-build, flaky
>
> In the impala-asf-master-exhaustive build, the following error is seen. 
>  Error Details
> {code:java}
> query_test/test_scanners.py:603: in test_misaligned_parquet_row_groups 
> self._misaligned_parquet_row_groups_helper(table_name, 7300) 
> query_test/test_scanners.py:636: in _misaligned_parquet_row_groups_helper 
> assert len(num_scanners_with_no_reads_list) == 4 E   assert 3 == 4 E+  
> where 3 = len(['0', '0', '0'])
> {code}
>  Stack Trace
> {code:java}
> query_test/test_scanners.py:603: in test_misaligned_parquet_row_groups
> self._misaligned_parquet_row_groups_helper(table_name, 7300)
> query_test/test_scanners.py:636: in _misaligned_parquet_row_groups_helper
> assert len(num_scanners_with_no_reads_list) == 4
> E   assert 3 == 4
> E+  where 3 = len(['0', '0', '0'])
> {code}








[jira] [Work started] (IMPALA-11604) Planner changes for CPU usage

2022-10-12 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-11604 started by Qifan Chen.
---
> Planner changes for CPU usage
> -
>
> Key: IMPALA-11604
> URL: https://issues.apache.org/jira/browse/IMPALA-11604
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Qifan Chen
>Assignee: Qifan Chen
>Priority: Major
>
> Plan scaling based on estimated peak memory has been enabled in 
> IMPALA-10992. However, it is sometimes desirable to consider CPU usage (such 
> as the amount of data processed) as a scaling factor. 






[jira] [Resolved] (IMPALA-10999) Flakiness in TestAsyncLoadData.test_async_load

2022-10-12 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen resolved IMPALA-10999.
-
Resolution: Fixed

> Flakiness in TestAsyncLoadData.test_async_load
> --
>
> Key: IMPALA-10999
> URL: https://issues.apache.org/jira/browse/IMPALA-10999
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Bikramjeet Vig
>Assignee: Qifan Chen
>Priority: Major
>  Labels: broken-build, flaky-test
>
> This test failed in one of the GVO's recently. 
> [Link|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/15097/testReport/junit/metadata.test_load/TestAsyncLoadData/test_async_load_enable_async_load_data_execution__False___protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/]
>  
> {noformat}
> Error Message
> metadata/test_load.py:197: in test_async_load assert(exec_end_state == 
> finished_state) E   assert 3 == 4
> Stacktrace
> metadata/test_load.py:197: in test_async_load
> assert(exec_end_state == finished_state)
> E   assert 3 == 4
> Standard Error
> SET 
> client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node;
> -- connecting to: localhost:21000
> -- connecting to localhost:21050 with impyla
> -- 2021-10-30 01:38:55,203 INFO MainThread: Closing active operation
> -- connecting to localhost:28000 with impyla
> -- 2021-10-30 01:38:55,237 INFO MainThread: Closing active operation
> SET 
> client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node;
> SET sync_ddl=False;
> -- executing against localhost:21000
> DROP DATABASE IF EXISTS `test_async_load_ff1c20a7` CASCADE;
> -- 2021-10-30 01:38:55,281 INFO MainThread: Started query 
> df43a0ff6165a9eb:33b0d69f
> SET 
> client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node;
> SET sync_ddl=False;
> -- executing against localhost:21000
> CREATE DATABASE `test_async_load_ff1c20a7`;
> -- 2021-10-30 01:39:01,148 INFO MainThread: Started query 
> e64bd28a97339b44:e76523a8
> -- 2021-10-30 01:39:01,253 INFO MainThread: Created database 
> "test_async_load_ff1c20a7" for test ID 
> "metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:
>  False | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> text/none]"
> SET 
> client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node;
> -- connecting to: localhost:21000
> -- executing against localhost:21000
> create table test_async_load_ff1c20a7.test_load_nopart_beeswax_False like 
> functional.alltypesnopart location 
> '/test-warehouse/test_load_staging_beeswax_False';
> -- 2021-10-30 01:39:09,435 INFO MainThread: Started query 
> e543635533874c9e:fe238ca9
> -- executing against localhost:21000
> select count(*) from test_async_load_ff1c20a7.test_load_nopart_beeswax_False;
> -- 2021-10-30 01:39:13,178 INFO MainThread: Started query 
> 5c4969e81b1b614b:26754a22
> SET 
> client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node;
> -- executing against localhost:21000
> use functional;
> -- 2021-10-30 01:39:13,413 INFO MainThread: Started query 
> d340e3650cba2d6f:a35a14bb
> SET 
> client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node;
> SET batch_size=0;
> 


[jira] [Resolved] (IMPALA-11573) Certain methods used by the replan feature can be improved

2022-10-12 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen resolved IMPALA-11573.
-
Fix Version/s: Impala 4.1.1
   Resolution: Fixed

>  Certain methods used by the replan feature can be improved
> ---
>
> Key: IMPALA-11573
> URL: https://issues.apache.org/jira/browse/IMPALA-11573
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Qifan Chen
>Assignee: Qifan Chen
>Priority: Major
> Fix For: Impala 4.1.1
>
>
> Certain methods for replanning (IMPALA-10992) are not suitable to be called 
> from Hive. For example, setupThresholdsForExecutorGroupSets() and 
> canStmtBeAutoScaled() in Frontend.java are not static.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)






[jira] [Resolved] (IMPALA-10715) test_decimal_min_max_filters failed in exhaustive run

2022-10-12 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen resolved IMPALA-10715.
-
Resolution: Fixed

Disabled bloom filters for all of the decimal min/max filter tests. The tests 
measure the impact of the min/max filters, which should not be interfered 
with by bloom filters.
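
For context, the failed assertion aggregates a counter across the runtime
profile. A rough Java sketch of that kind of check is below; the real logic
lives in verify_runtime_profile in common/test_result_verifier.py, and the
"Counter: value" profile format used here is an assumption.

{code:java}
// Illustrative sketch only -- sums every occurrence of a profile counter.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProfileSumSketch {
  static long sumCounter(String profile, String counter) {
    Matcher m = Pattern.compile(counter + ": (\\d+)").matcher(profile);
    long total = 0;
    while (m.find()) total += Long.parseLong(m.group(1));
    return total;
  }

  public static void main(String[] args) {
    // With bloom filters also firing, some rows never reach the probe side,
    // so the aggregated ProbeRows sum comes up short (e.g. 38 instead of 102).
    String profile = "ProbeRows: 30\n...\nProbeRows: 72\n";
    System.out.println(sumCounter(profile, "ProbeRows"));  // prints 102
  }
}
{code}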

> test_decimal_min_max_filters failed in exhaustive run
> -
>
> Key: IMPALA-10715
> URL: https://issues.apache.org/jira/browse/IMPALA-10715
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Zoltán Borók-Nagy
>Assignee: Qifan Chen
>Priority: Major
>  Labels: broken-build
>
> test_decimal_min_max_filters failed in exhaustive run
> *Stack Trace*
> {noformat}
> query_test/test_runtime_filters.py:223: in test_decimal_min_max_filters
> test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS': str(WAIT_TIME_MS)})
> common/impala_test_suite.py:775: in run_test_case
> update_section=pytest.config.option.update_results)
> common/test_result_verifier.py:653: in verify_runtime_profile
> % (function, field, expected_value, actual_value, op, actual))
> E   AssertionError: Aggregation of SUM over ProbeRows did not match expected 
> results.
> E   EXPECTED VALUE:
> E   102
> E   
> E   
> E   ACTUAL VALUE:
> E   38
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)






[jira] [Work started] (IMPALA-10715) test_decimal_min_max_filters failed in exhaustive run

2022-10-12 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-10715 started by Qifan Chen.
---
> test_decimal_min_max_filters failed in exhaustive run
> -
>
> Key: IMPALA-10715
> URL: https://issues.apache.org/jira/browse/IMPALA-10715
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Zoltán Borók-Nagy
>Assignee: Qifan Chen
>Priority: Major
>  Labels: broken-build
>
> test_decimal_min_max_filters failed in exhaustive run
> *Stack Trace*
> {noformat}
> query_test/test_runtime_filters.py:223: in test_decimal_min_max_filters
> test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS': str(WAIT_TIME_MS)})
> common/impala_test_suite.py:775: in run_test_case
> update_section=pytest.config.option.update_results)
> common/test_result_verifier.py:653: in verify_runtime_profile
> % (function, field, expected_value, actual_value, op, actual))
> E   AssertionError: Aggregation of SUM over ProbeRows did not match expected 
> results.
> E   EXPECTED VALUE:
> E   102
> E   
> E   
> E   ACTUAL VALUE:
> E   38
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)




[jira] [Updated] (IMPALA-11652) In org.apache.impala.planner.PlannerTest.testHbase, the selected range does not match the expected

2022-10-11 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen updated IMPALA-11652:

Summary: In org.apache.impala.planner.PlannerTest.testHbase, the selected 
range does not match the expected  (was: In 
org.apache.impala.planner.PlannerTest.testHbase, the selected range does not 
match with expected)

> In org.apache.impala.planner.PlannerTest.testHbase, the selected range does 
> not match the expected
> --
>
> Key: IMPALA-11652
> URL: https://issues.apache.org/jira/browse/IMPALA-11652
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Reporter: Qifan Chen
>Priority: Major
>
> org.apache.impala.planner.PlannerTest.testHbase
> Error Message
> {code:java}
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where id < '5'
> and tinyint_col = 5
> Actual does not match expected result:
>   HBASE KEYRANGE 1:3
> 
>   HBASE KEYRANGE 3:5
>   HBASE KEYRANGE :1
> NODE 0:
> Expected:
>   HBASE KEYRANGE 3:5
>   HBASE KEYRANGE :3
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.alltypesagg
> where bigint_col is not null and bool_col = true
> Actual does not match expected result:
>   HBASE KEYRANGE 1:3
> 
>   HBASE KEYRANGE 3:7
>   HBASE KEYRANGE 7:
>   HBASE KEYRANGE :1
> NODE 0:
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)




[jira] [Created] (IMPALA-11652) In org.apache.impala.planner.PlannerTest.testHbase, the selected range does not match with expected

2022-10-11 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-11652:
---

 Summary: In org.apache.impala.planner.PlannerTest.testHbase, the 
selected range does not match with expected
 Key: IMPALA-11652
 URL: https://issues.apache.org/jira/browse/IMPALA-11652
 Project: IMPALA
  Issue Type: Bug
  Components: Distributed Exec
Reporter: Qifan Chen


org.apache.impala.planner.PlannerTest.testHbase

Error Message

{code:java}
section SCANRANGELOCATIONS of query:
select * from functional_hbase.stringids
where id < '5'
and tinyint_col = 5
Actual does not match expected result:
  HBASE KEYRANGE 1:3

  HBASE KEYRANGE 3:5
  HBASE KEYRANGE :1
NODE 0:

Expected:
  HBASE KEYRANGE 3:5
  HBASE KEYRANGE :3
NODE 0:
section SCANRANGELOCATIONS of query:
select * from functional_hbase.alltypesagg
where bigint_col is not null and bool_col = true
Actual does not match expected result:
  HBASE KEYRANGE 1:3

  HBASE KEYRANGE 3:7
  HBASE KEYRANGE 7:
  HBASE KEYRANGE :1
NODE 0:
{code}




--
This message was sent by Atlassian Jira
(v8.20.10#820010)





[jira] [Created] (IMPALA-11651) Number of files mismatch assertion in PartitionMetadataUncompressedTextOnly.test_unsupported_text_compression

2022-10-11 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-11651:
---

 Summary: Number of files mismatch assertion in 
PartitionMetadataUncompressedTextOnly.test_unsupported_text_compression
 Key: IMPALA-11651
 URL: https://issues.apache.org/jira/browse/IMPALA-11651
 Project: IMPALA
  Issue Type: Bug
  Components: Distributed Exec
Reporter: Qifan Chen


In impala-asf-master-core-s3-data-cache #217: 
metadata.test_partition_metadata.TestPartitionMetadataUncompressedTextOnly.test_unsupported_text_compression[protocol:
 beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
text/none] (from pytest)


{code:java}
metadata/test_partition_metadata.py:214: in test_unsupported_text_compression
assert len(show_files_result.data) == 5, "Expected one file per partition dir"
E   AssertionError: Expected one file per partition dir
E   assert 2 == 5
E+  where 2 = 
len(['s3a://impala-test-uswest2-2/test-warehouse/alltypes/year=2009/month=1/090101.txt\t19.95KB\tyear=2009/month=1',
 
's3a://impala-test-uswest2-2/test-warehouse/alltypes_text_gzip/year=2009/month=2/00_0.gz\t3.00KB\tyear=2009/month=2'])
E+where 
['s3a://impala-test-uswest2-2/test-warehouse/alltypes/year=2009/month=1/090101.txt\t19.95KB\tyear=2009/month=1',
 
's3a://impala-test-uswest2-2/test-warehouse/alltypes_text_gzip/year=2009/month=2/00_0.gz\t3.00KB\tyear=2009/month=2']
 = .data

{code}




--
This message was sent by Atlassian Jira
(v8.20.10#820010)






[jira] [Created] (IMPALA-11650) Missing Blacklisted Executors list in custom_cluster.test_query_retries.TestQueryRetries.test_retry_exec_rpc_failure

2022-10-11 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-11650:
---

 Summary: Missing Blacklisted Executors list in 
custom_cluster.test_query_retries.TestQueryRetries.test_retry_exec_rpc_failure
 Key: IMPALA-11650
 URL: https://issues.apache.org/jira/browse/IMPALA-11650
 Project: IMPALA
  Issue Type: Bug
  Components: Distributed Exec
Reporter: Qifan Chen


In impala-asf-master-core-s3-data-cache #217: 
custom_cluster.test_query_retries.TestQueryRetries.test_retry_exec_rpc_failure 
(from pytest)



{code:java}
custom_cluster/test_query_retries.py:276: in test_retry_exec_rpc_failure
self.__assert_executors_blacklisted(killed_impalad, retried_runtime_profile)
custom_cluster/test_query_retries.py:1091: in __assert_executors_blacklisted
assert "Blacklisted Executors: {0}:{1}".format(blacklisted_impalad.hostname,

1088   def __assert_executors_blacklisted(self, blacklisted_impalad, profile):
1089     """Validate that the given profile indicates that the given impalad
1090     was blacklisted during query execution."""
1091     assert "Blacklisted Executors: {0}:{1}".format(blacklisted_impalad.hostname,
1092         blacklisted_impalad.service.krpc_port) in profile, profile
{code}


This is the link to the test: 
https://master-03.jenkins.cloudera.com/job/impala-asf-master-core-s3-data-cache/217/testReport/junit/custom_cluster.test_query_retries/TestQueryRetries/test_retry_exec_rpc_failure/




--
This message was sent by Atlassian Jira
(v8.20.10#820010)






[jira] [Created] (IMPALA-11649) Null pointer exception seen in org.apache.impala.catalog.ParallelFileMetadataLoader in impala-asf-master-core-s3 test

2022-10-11 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-11649:
---

 Summary: Null pointer exception seen in 
org.apache.impala.catalog.ParallelFileMetadataLoader in 
impala-asf-master-core-s3 test
 Key: IMPALA-11649
 URL: https://issues.apache.org/jira/browse/IMPALA-11649
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Reporter: Qifan Chen


https://master-03.jenkins.cloudera.com/job/impala-asf-master-core-s3/225/


Failed
generate_junitxml.buildall.create-load-data (from 
generate_junitxml.buildall.create-load-data)

Failing for the past 1 build (Since #225 )
Took 0 ms.
Error Message
Error in 
/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/testdata/bin/create-load-data.sh
 at line 95: -timeout)


SQL


{code:java}
06:06:05 ERROR: INSERT into TABLE functional_kudu.alltypes
06:06:05 SELECT id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, 
float_col, double_col, date_string_col, string_col,
06:06:05timestamp_col, year, month
06:06:05 FROM functional.alltypes
{code}


SQL error:


{code:java}
06:06:05 ImpalaBeeswaxException: ImpalaBeeswaxException:
06:06:05  INNER EXCEPTION: 
06:06:05  MESSAGE: AnalysisException: Failed to load metadata for table: 
'functional.alltypes'
06:06:05 CAUSED BY: TableLoadingException: Loading file and block metadata for 
24 paths for table functional.alltypes: failed to load 2 paths. Check the 
catalog server log for more details.
{code}


Catalog server log:


{code:java}
Log file created at: 2022/10/09 04:03:47
Running on machine: 
impala-ec2-centos79-m6i-4xlarge-ondemand-005a.vpc.cloudera.com
Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
E1009 04:03:47.665833 14343 logging.cc:248] stderr will be logged to this file.
22/10/09 04:03:48 WARN impl.MetricsConfig: Cannot locate configuration: tried 
hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
22/10/09 04:03:48 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period 
at 10 second(s).
22/10/09 04:03:48 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
started
22/10/09 04:03:49 INFO Configuration.deprecation: No unit for 
fs.s3a.connection.request.timeout(0) assuming SECONDS
22/10/09 04:03:49 INFO impl.MetricsSystemImpl: Stopping s3a-file-system metrics 
system...
22/10/09 04:03:49 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
stopped.
22/10/09 04:03:49 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
shutdown complete.
22/10/09 04:03:49 INFO util.JvmPauseMonitor: Starting JVM pause monitor
E1009 04:06:05.058591 19943 ParallelFileMetadataLoader.java:171] Loading file 
and block metadata for 24 paths for table functional.alltypes encountered an 
error loading data for path 
s3a://impala-test-uswest2-2/test-warehouse/alltypes/year=2010/month=4
Java exception follows:
{code}


{code:java}
java.util.concurrent.ExecutionException: java.lang.NullPointerException
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at 
org.apache.impala.catalog.ParallelFileMetadataLoader.loadInternal(ParallelFileMetadataLoader.java:168)
at 
org.apache.impala.catalog.ParallelFileMetadataLoader.load(ParallelFileMetadataLoader.java:120)
at 
org.apache.impala.catalog.HdfsTable.loadFileMetadataForPartitions(HdfsTable.java:781)
at 
org.apache.impala.catalog.HdfsTable.loadFileMetadataForPartitions(HdfsTable.java:744)
at 
org.apache.impala.catalog.HdfsTable.loadAllPartitions(HdfsTable.java:719)
at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:1268)
at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:1162)
at org.apache.impala.catalog.TableLoader.load(TableLoader.java:144)
at 
org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:245)
at 
org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:242)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.fs.s3a.Listing$ObjectListingIterator.<init>(Listing.java:621)
at 
org.apache.hadoop.fs.s3a.Listing.createObjectListingIterator(Listing.java:163)
at 
org.apache.hadoop.fs.s3a.Listing.createFileStatusListingIterator(Listing.java:144)
at 
org.apache.hadoop.fs.s3a.Listing.getListFilesAssumingDir(Listing.java:212)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerListFiles(S3AFileSystem.java:4790)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listFiles$37(S3AFileSystem.java:4732)
at 


[jira] [Commented] (IMPALA-11647) Row size for source tables in a cross join query is set to 0 in query plan

2022-10-10 Thread Qifan Chen (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615283#comment-17615283
 ] 

Qifan Chen commented on IMPALA-11647:
-

The output width from the scan is 0B instead of 8B because of this line of 
code: 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/planner/ScanNode.java#L160.
Once that restriction is relaxed, we get a better plan in which the row size 
is 8B and the number of rows equals the number of files in the table.
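
A minimal sketch of the relaxation, with invented names (this is not the
actual ScanNode.java code): once a count(*)-optimized scan is allowed to
report its 8B BIGINT output, the row size no longer collapses to 0B when no
other slots are materialized.

{code:java}
// Hypothetical sketch only -- not Impala's ScanNode.java logic.
import java.util.List;

public class RowSizeSketch {
  static int avgRowSizeBytes(List<Integer> materializedSlotSizes,
      boolean countStarOptimized) {
    if (materializedSlotSizes.isEmpty()) {
      // Today this path always yields 0B; with the count-star optimization
      // the scan emits one 8B BIGINT value per file/row group instead.
      return countStarOptimized ? 8 : 0;
    }
    return materializedSlotSizes.stream().mapToInt(Integer::intValue).sum();
  }

  public static void main(String[] args) {
    System.out.println(avgRowSizeBytes(List.of(), true));   // prints 8
    System.out.println(avgRowSizeBytes(List.of(), false));  // prints 0
  }
}
{code}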



> Row size for source tables in a cross join query is set to 0 in query plan
> --
>
> Key: IMPALA-11647
> URL: https://issues.apache.org/jira/browse/IMPALA-11647
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Qifan Chen
>Priority: Major
>
> The row-size in the following explain output for both source tables is set to 
> 0B.  On paper, it is possible to apply the count star optimization for such 
> queries and therefore set the row-size correctly. 
> {code:java}
> explain select count(*) from store_sales a, store_sales b limit 500
> +--+
> | Explain String   |
> +--+
> | Max Per-Host Resource Reservation: Memory=256.00KB Threads=5 |
> | Per-Host Resource Estimates: Memory=10MB |
> |  |
> | PLAN-ROOT SINK   |
> | ||
> | 06:AGGREGATE [FINALIZE]  |
> | |  output: count:merge(*)|
> | |  limit: 500|
> | |  row-size=8B cardinality=1 |
> | ||
> | 05:EXCHANGE [UNPARTITIONED]  |
> | ||
> | 03:AGGREGATE |
> | |  output: count(*)  |
> | |  row-size=8B cardinality=1 |
> | ||
> | 02:NESTED LOOP JOIN [CROSS JOIN, BROADCAST]  |
> | |  row-size=0B cardinality=8.30T |
> | ||
> | |--04:EXCHANGE [BROADCAST]   |
> | |  | |
> | |  01:SCAN HDFS [tpcds_parquet.store_sales b]|
> | | HDFS partitions=1824/1824 files=1824 size=199.83MB |
> | | row-size=0B cardinality=2.88M  |
> | ||
> | 00:SCAN HDFS [tpcds_parquet.store_sales a]   |
> |HDFS partitions=1824/1824 files=1824 size=199.83MB|
> |row-size=0B cardinality=2.88M |
> +--+
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)




[jira] [Created] (IMPALA-11647) Row size for source tables in a cross join query is set to 0 in query plan

2022-10-10 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-11647:
---

 Summary: Row size for source tables in a cross join query is set 
to 0 in query plan
 Key: IMPALA-11647
 URL: https://issues.apache.org/jira/browse/IMPALA-11647
 Project: IMPALA
  Issue Type: Improvement
  Components: Frontend
Reporter: Qifan Chen


The row-size in the following explain output for both source tables is set to 
0B.  On paper, it is possible to apply the count star optimization for such 
queries and therefore set the row-size correctly. 

{code:java}
explain select count(*) from store_sales a, store_sales b limit 500
+--+
| Explain String   |
+--+
| Max Per-Host Resource Reservation: Memory=256.00KB Threads=5 |
| Per-Host Resource Estimates: Memory=10MB |
|  |
| PLAN-ROOT SINK   |
| ||
| 06:AGGREGATE [FINALIZE]  |
| |  output: count:merge(*)|
| |  limit: 500|
| |  row-size=8B cardinality=1 |
| ||
| 05:EXCHANGE [UNPARTITIONED]  |
| ||
| 03:AGGREGATE |
| |  output: count(*)  |
| |  row-size=8B cardinality=1 |
| ||
| 02:NESTED LOOP JOIN [CROSS JOIN, BROADCAST]  |
| |  row-size=0B cardinality=8.30T |
| ||
| |--04:EXCHANGE [BROADCAST]   |
| |  | |
| |  01:SCAN HDFS [tpcds_parquet.store_sales b]|
| | HDFS partitions=1824/1824 files=1824 size=199.83MB |
| | row-size=0B cardinality=2.88M  |
| ||
| 00:SCAN HDFS [tpcds_parquet.store_sales a]   |
|HDFS partitions=1824/1824 files=1824 size=199.83MB|
|row-size=0B cardinality=2.88M |
+--+
{code}




--
This message was sent by Atlassian Jira
(v8.20.10#820010)






[jira] [Created] (IMPALA-11617) Pool service should be made aware of cpu-usage limit for each executor group set

2022-09-27 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-11617:
---

 Summary: Pool service should be made aware of cpu-usage limit for 
each executor group set
 Key: IMPALA-11617
 URL: https://issues.apache.org/jira/browse/IMPALA-11617
 Project: IMPALA
  Issue Type: Improvement
Reporter: Qifan Chen


IMPALA-11604 enables the planner to compute CPU usage for certain queries and 
to select suitable executor groups to run them. Here CPU usage is expressed 
as the total amount of data to be processed per instance.

The limit on the total amount of data that each executor group can handle 
should be provided by the pool service. 
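
A hypothetical sketch of the pool-service side, in Java. The interface and
field names are invented for illustration; the actual admission-control
configuration may differ.

{code:java}
// Hypothetical sketch only -- an assumed pool-service lookup exposing a
// per-executor-group-set limit on bytes processed per instance.
import java.util.Map;

public class PoolServiceSketch {
  // Assumed configuration: group-set name -> per-instance byte limit.
  private final Map<String, Long> maxBytesPerInstanceByGroupSet;

  PoolServiceSketch(Map<String, Long> limits) {
    this.maxBytesPerInstanceByGroupSet = limits;
  }

  // Planner-side query: how much data may this group set process per instance?
  // A negative result means no limit is configured for the group set.
  long maxBytesProcessedPerInstance(String groupSet) {
    return maxBytesPerInstanceByGroupSet.getOrDefault(groupSet, -1L);
  }

  public static void main(String[] args) {
    PoolServiceSketch pool =
        new PoolServiceSketch(Map.of("small", 4L << 30, "large", 64L << 30));
    System.out.println(pool.maxBytesProcessedPerInstance("small"));  // 4 GiB
  }
}
{code}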



--
This message was sent by Atlassian Jira
(v8.20.10#820010)






[jira] [Assigned] (IMPALA-11604) Planner changes for CPU usage

2022-09-21 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen reassigned IMPALA-11604:
---

Assignee: Qifan Chen

> Planner changes for CPU usage
> -
>
> Key: IMPALA-11604
> URL: https://issues.apache.org/jira/browse/IMPALA-11604
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Qifan Chen
>Assignee: Qifan Chen
>Priority: Major
>
> Plan scaling based on estimated peak memory has been enabled in 
> IMPALA-10992. However, it is sometimes desirable to consider CPU usage (such 
> as the amount of data processed) as a scaling factor.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)




[jira] [Created] (IMPALA-11604) Planner changes for CPU usage

2022-09-21 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-11604:
---

 Summary: Planner changes for CPU usage
 Key: IMPALA-11604
 URL: https://issues.apache.org/jira/browse/IMPALA-11604
 Project: IMPALA
  Issue Type: Improvement
Reporter: Qifan Chen


Plan scaling based on estimated peak memory has been enabled in 
IMPALA-10992. However, it is sometimes desirable to consider CPU usage (such 
as the amount of data processed) as a scaling factor.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)






[jira] [Updated] (IMPALA-11573) Certain methods used by the replan feature can be improved

2022-09-09 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen updated IMPALA-11573:

Description: Certain methods for replanning (IMPALA-10992) are not suitable 
to be called from Hive. For example, setupThresholdsForExecutorGroupSets() and 
canStmtBeAutoScaled() in Frontend.java are not static.  (was: Certain methods 
for auto-scaling are not suitable to be called from Hive.  For example 
setupThresholdsForExecutorGroupSets() and canStmtBeAutoScaled() in 
Frontend.java are not static. )
Summary:  Certain methods used by the replan feature can be improved  
(was:  Certain methods used by the auto-scaling feature can be improved)

>  Certain methods used by the replan feature can be improved
> ---
>
> Key: IMPALA-11573
> URL: https://issues.apache.org/jira/browse/IMPALA-11573
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Qifan Chen
>Assignee: Qifan Chen
>Priority: Major
>
> Certain methods for replanning (IMPALA-10992) are not suitable to be called 
> from Hive. For example, setupThresholdsForExecutorGroupSets() and 
> canStmtBeAutoScaled() in Frontend.java are not static.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)




[jira] [Created] (IMPALA-11573) Certain methods used by the auto-scaling feature can be improved

2022-09-09 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-11573:
---

 Summary:  Certain methods used by the auto-scaling feature can be 
improved
 Key: IMPALA-11573
 URL: https://issues.apache.org/jira/browse/IMPALA-11573
 Project: IMPALA
  Issue Type: Improvement
Reporter: Qifan Chen


Certain methods for auto-scaling are not suitable to be called from Hive. For 
example, setupThresholdsForExecutorGroupSets() and canStmtBeAutoScaled() in 
Frontend.java are not static.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)




[jira] [Assigned] (IMPALA-11573) Certain methods used by the auto-scaling feature can be improved

2022-09-09 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen reassigned IMPALA-11573:
---

Assignee: Qifan Chen

>  Certain methods used by the auto-scaling feature can be improved
> -
>
> Key: IMPALA-11573
> URL: https://issues.apache.org/jira/browse/IMPALA-11573
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Qifan Chen
>Assignee: Qifan Chen
>Priority: Major
>
> Certain methods for auto-scaling are not suitable to be called from Hive. 
> For example, setupThresholdsForExecutorGroupSets() and canStmtBeAutoScaled() 
> in Frontend.java are not static.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)






[jira] [Commented] (IMPALA-11274) CNF Rewrite causes a regress in join node performance

2022-04-27 Thread Qifan Chen (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528959#comment-17528959
 ] 

Qifan Chen commented on IMPALA-11274:
-

The DDL for the above query.


{code:java}
create table if not exists a1 (
  c1 string
)
STORED AS PARQUET;

create table if not exists a4 (
  customerkey string
)
STORED AS PARQUET;

create table if not exists a5 (
  customerkey3024 string
)
STORED AS PARQUET
;

drop table if exists p;

create table if not exists p
(
client5171 string,
clientsms5171 string,
email1dc string,
email2dc string,
email5153 string,
email5170 string,
email5171 string,
global5170 string,
sms3dc string,
sms5171 string,
system5171 string,
systemsms5171 string,
systemsms string
)
STORED AS PARQUET;
{code}
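
To see why the rewrite grows the predicate count, consider a standalone
sketch (not Impala's actual rewrite rule): converting (a1 AND ... AND am) OR
(b1 AND ... AND bn) to CNF distributes OR over AND and yields m*n clauses, so
a query like the one above, with many conjuncts under each disjunct,
multiplies quickly.

{code:java}
// Illustrative sketch of CNF growth -- not Impala's rewriter.
import java.util.ArrayList;
import java.util.List;

public class CnfGrowthSketch {
  // CNF of (a1 AND ... AND am) OR (b1 AND ... AND bn)
  // = AND over all pairs (ai OR bj): m * n clauses.
  static List<String> distributeOrOverAnd(List<String> lhs, List<String> rhs) {
    List<String> clauses = new ArrayList<>();
    for (String a : lhs) {
      for (String b : rhs) clauses.add("(" + a + " OR " + b + ")");
    }
    return clauses;
  }

  public static void main(String[] args) {
    List<String> cnf = distributeOrOverAnd(
        List.of("p.email5170 != 'Y'", "p.sms5171 != 'Y'", "p.sms3dc = ''"),
        List.of("a1.c1 != ''", "a4.customerkey = ''"));
    System.out.println(cnf.size());  // prints 6: 3 x 2 input conjuncts
    cnf.forEach(System.out::println);
  }
}
{code}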


> CNF Rewrite causes a regress in join node performance
> -
>
> Key: IMPALA-11274
> URL: https://issues.apache.org/jira/browse/IMPALA-11274
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Qifan Chen
>Assignee: Qifan Chen
>Priority: Major
>
> It appears that the CNF rewrite can generate more predicates and presumably 
> cause the same query to execute more slowly.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)




[jira] [Comment Edited] (IMPALA-11274) CNF Rewrite causes a regress in join node performance

2022-04-27 Thread Qifan Chen (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528949#comment-17528949
 ] 

Qifan Chen edited comment on IMPALA-11274 at 4/27/22 6:22 PM:
--

For the following query


{code:java}
set explain_level = 3;

explain
select * from
  p, a1, a4, a5
where
(
  ( coalesce(CAST(a1.c1 AS string), '') != '' )
  OR
  (
    (
      ( upper(p.email5153) = '1' )
      OR ( upper(p.email5171) = 'wjn...@yahoo.com ' )
      OR ( ( upper(p.email5171) LIKE '%GMAI.COM' )
        AND ( coalesce(CAST(a4.customerkey AS string), '') = '' ) )
      OR ( upper(p.email5171) = 'CLARIANT.COM' )
      OR ( upper(p.email5171) = 'YAHOO.COM' )
      OR ( upper(p.email5171) LIKE '%ELECTROMAILS.COM' )
    )
    AND (
      ( upper(p.global5170) != 'Y' )
      OR ( coalesce(CAST(p.global5170 AS string), '') = '' ) )
    AND ( ( upper(p.email5170) != 'Y' )
      OR ( coalesce(CAST(p.email5170 AS string), '') = '' ) )
    AND ( ( upper(p.sms5171) != 'Y' )
      OR ( coalesce(CAST(p.sms5171 AS string), '') = '' ) )
    AND ( upper(coalesce(p.client5171, 'G')) = 'G' )
    AND ( coalesce(CAST(p.email2dc AS string), '') = '' )
    AND ( coalesce(CAST(p.email1dc AS string), '') = '' )
    AND ( upper(coalesce(p.system5171, 'G')) = 'G' )
    AND ( upper(coalesce(p.clientsms5171, 'G')) = 'G' )
    AND ( coalesce(CAST(p.sms3dc AS string), '') = '' )
    AND ( upper(coalesce(p.systemsms5171, 'G')) = 'G' )
  )
  OR ( upper(p.email5153) = '4' )
  OR
  ( ( upper(p.email5153) = '3' )
    AND ( ( upper(p.global5170) != 'Y' )
      OR ( coalesce(CAST(p.global5170 AS string), '') = '' ) )
    AND ( ( upper(p.email5170) != 'Y' )
      OR ( coalesce(CAST(p.email5170 AS string), '') = '' ) )
    AND ( ( upper(p.sms5171) != 'Y' )



[jira] [Assigned] (IMPALA-11274) CNF Rewrite causes a regress in join node performance

2022-04-27 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen reassigned IMPALA-11274:
---

Assignee: Qifan Chen

> CNF Rewrite causes a regress in join node performance
> -
>
> Key: IMPALA-11274
> URL: https://issues.apache.org/jira/browse/IMPALA-11274
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Qifan Chen
>Assignee: Qifan Chen
>Priority: Major
>
> It appears that the CNF rewrite can generate more predicates and presumably
> cause the same query to execute more slowly.






[jira] [Created] (IMPALA-11274) CNF Rewrite causes a regression in join node performance

2022-04-27 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-11274:
---

 Summary: CNF Rewrite causes a regression in join node performance
 Key: IMPALA-11274
 URL: https://issues.apache.org/jira/browse/IMPALA-11274
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Reporter: Qifan Chen


It appears that the CNF rewrite can generate more predicates and presumably
cause the same query to execute more slowly.
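
As a minimal illustration (not Impala's actual rewrite code), the distributive
step of a CNF rewrite turns (a AND b) OR c into (a OR c) AND (b OR c). Applied
to a disjunction of large AND-groups, as in the query in the comment above, the
number of conjuncts can grow multiplicatively; the arithmetic is sketched below
with assumed sizes:

{code:java}
// Back-of-the-envelope sketch with assumed sizes, not a measurement.
// A disjunction of `branches` AND-groups, each holding `conjunctsPerBranch`
// predicates, expands under a full CNF rewrite into one OR-clause per
// combination of one predicate from each branch.
public class CnfBlowup {
  public static void main(String[] args) {
    int conjunctsPerBranch = 10;  // assumed size of each AND-group
    int branches = 4;             // assumed number of OR'ed branches
    long original = (long) conjunctsPerBranch * branches;
    long cnfConjuncts = (long) Math.pow(conjunctsPerBranch, branches);
    System.out.println("original predicates: " + original);          // 40
    System.out.println("worst-case CNF conjuncts: " + cnfConjuncts); // 10000
  }
}
{code}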






[jira] [Resolved] (IMPALA-10992) Planner changes for estimating peak memory.

2022-03-30 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen resolved IMPALA-10992.
-
Fix Version/s: Impala 4.1.0
   Resolution: Fixed

> Planner changes for estimating peak memory.
> -
>
> Key: IMPALA-10992
> URL: https://issues.apache.org/jira/browse/IMPALA-10992
> Project: IMPALA
>  Issue Type: Task
>Reporter: Amogh Margoor
>Assignee: Qifan Chen
>Priority: Critical
> Fix For: Impala 4.1.0
>
>
> To run large queries on a larger executor group mapped to a different
> resource group, we need to identify the large queries at compile time. In the
> first phase, this identification can use the estimated peak memory to
> classify a query as large. This Jira tracks that support.
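
As a hedged sketch of that first phase (the class, method, and threshold below
are hypothetical placeholders, not Impala's actual API), classification could
be a single threshold on the planner's estimated per-host peak memory:

{code:java}
// Minimal sketch, assuming classification is one threshold on the planner's
// per-host peak memory estimate. The name and value are placeholders.
public class QuerySizeClassifier {
  // Hypothetical threshold: 8 GiB estimated peak memory per host.
  private static final long LARGE_QUERY_MEM_THRESHOLD_BYTES = 8L << 30;

  /** True when the query should be routed to the larger executor group. */
  public static boolean isLargeQuery(long estimatedPeakMemPerHostBytes) {
    return estimatedPeakMemPerHostBytes > LARGE_QUERY_MEM_THRESHOLD_BYTES;
  }
}
{code}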




[jira] [Updated] (IMPALA-11189) Concurrent insert ACID tests are broken in local catalog mode

2022-03-16 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen updated IMPALA-11189:

Description: 
Stress test test_concurrent_inserts (in tests/stress/test_acid_stress.py) can
fail repeatedly in local catalog mode. In this case, the concurrent checker
query (select * from ) returns duplicated rows, as reported below, where the
row [0,2] is duplicated.

The failure can be reproduced quite easily by running the test (i.e.,
TestConcurrentAcidInserts) first, by commenting out all the tests that precede
it in tests/stress/test_acid_stress.py.
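
The invariant the checker enforces can be sketched as follows (a Java
simplification of the Python test logic; it assumes each writer inserts the
values 0..n-1 exactly once, and the names are illustrative):

{code:java}
import java.util.List;

// For one writer, the values read back (sorted) must be exactly 0..n-1,
// with no duplicates and no gaps. The failing run below reads
// [0, 1, 2, 2, 3, 4] for wid 2, which fails at index 3 (duplicated 2).
public class ConcurrentInsertChecker {
  public static boolean isConsistent(List<Integer> sortedValues) {
    for (int i = 0; i < sortedValues.size(); i++) {
      if (sortedValues.get(i) != i) return false;
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(isConsistent(List.of(0, 1, 2, 3, 4)));     // true
    System.out.println(isConsistent(List.of(0, 1, 2, 2, 3, 4)));  // false
  }
}
{code}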

Setup:

1. Build Impala and reformat the metastore in case HMS is in a bad state:
$IMPALA_HOME/buildall.sh -format_metastore -notests
2. Start the cluster in local catalog mode: 
$IMPALA_HOME/bin/start-impala-cluster.py --impalad_args 
--use_local_catalog=true --catalogd_args  --catalog_topic_mode=minimal 
--catalogd_args --hms_event_polling_interval_s=1
3. Run the modified stress test: $IMPALA_HOME/bin/impala-py.test 
$IMPALA_TESTS/stress/test_acid_stress.py

Error reported:


{code:java}
[09:11:00 qchen@qifan-10229: Impala.03112022] test_acid_stress
rootLoggerLevel = INFO
== test session starts 
===
platform linux2 -- Python 2.7.16, pytest-2.9.2, py-1.4.32, pluggy-0.3.1 -- 
/home/qchen/Impala.03112022/infra/python/env-gcc7.5.0/bin/python
cachedir: tests/.cache
rootdir: /home/qchen/Impala.03112022/tests, inifile: pytest.ini
plugins: xdist-1.17.1, timeout-1.2.1, random-0.2, forked-0.2
timeout: 7200s method: signal
collected 2 items 

tests/stress/test_acid_stress.py::TestConcurrentAcidInserts::test_concurrent_inserts[unique_database0]
 FAILED
tests/stress/test_acid_stress.py::TestFailingAcidInserts::test_failing_inserts[unique_database0]
 PASSED
 short test summary info 
=
FAIL 
tests/stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0]

 FAILURES 

__ 
TestConcurrentAcidInserts.test_concurrent_inserts[unique_database0] 
___
tests/stress/test_acid_stress.py:307: in test_concurrent_inserts
run_tasks(writers + checkers)
tests/stress/stress_util.py:45: in run_tasks
pool.map_async(Task.run, tasks).get(timeout_seconds)
../Impala.03082022/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:572:
 in get
raise self._value
E   AssertionError: wid: 2
E   assert [0, 1, 2, 2, 3, 4] == [0, 1, 2, 3, 4]
E At index 3 diff: 2 != 3
E Left contains more items, first extra item: 4
E Full diff:
E - [0, 1, 2, 2, 3, 4]
E ?   ---
E + [0, 1, 2, 3, 4]
- Captured stderr setup 
--
SET 
client_identifier=stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0];
-- connecting to: localhost:21000
-- connecting to localhost:21050 with impyla
-- 2022-03-16 09:20:54,762 INFO MainThread: Closing active operation
-- connecting to localhost:28000 with impyla
-- 2022-03-16 09:20:54,774 INFO MainThread: Closing active operation
SET 
client_identifier=stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0];
SET sync_ddl=True;
-- executing against localhost:21000

DROP DATABASE IF EXISTS `test_concurrent_inserts_8933345c` CASCADE;

-- 2022-03-16 09:20:54,808 INFO MainThread: Started query 
28457f4c7e77cdec:c6d37319
SET 
client_identifier=stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0];
SET sync_ddl=True;
-- executing against localhost:21000

CREATE DATABASE `test_concurrent_inserts_8933345c`;

-- 2022-03-16 09:20:54,877 INFO MainThread: Started query 
374bf99aea680523:48d24054
-- 2022-03-16 09:21:01,164 INFO MainThread: Created database 
"test_concurrent_inserts_8933345c" for test ID 
"stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0]"
-- Captured stderr call 
--
SET SYNC_DDL=true;
-- executing against localhost:21000

drop table if exists test_concurrent_inserts_8933345c.test_concurrent_inserts;

-- 2022-03-16 09:21:01,173 INFO MainThread: Started query 
20480c2a1d336d35:c2d84edd
-- executing against localhost:21000

create table test_concurrent_inserts_8933345c.test_concurrent_inserts (wid int, 
i int) TBLPROPERTIES (
'transactional_properties' = 'insert_only', 'transactional' = 'true')
 

[jira] [Closed] (IMPALA-11190) Test failing insert tests are broken in local catalog mode

2022-03-16 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen closed IMPALA-11190.
---
Resolution: Duplicate

This is a duplicate of IMPALA-11191.

> Test failing insert tests are broken in local catalog mode
> --
>
> Key: IMPALA-11190
> URL: https://issues.apache.org/jira/browse/IMPALA-11190
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Reporter: Qifan Chen
>Priority: Major
>
> Stress test test_failing_inserts (in tests/stress/test_acid_stress.py) fails
> repeatedly in local catalog mode. The concurrent checker query (select * from
> ) can return rows of value -1, which should not happen.
> This is reproducible by commenting out test TestFailingAcidInserts in 
> tests/stress/test_acid_stress.py.
> Setup:
> 1. Build Impala and reformat the metastore in case HMS is in a bad state:
> $IMPALA_HOME/buildall.sh -format_metastore -notests
> 2. Start the cluster in local catalog mode: 
> $IMPALA_HOME/bin/start-impala-cluster.py --impalad_args 
> --use_local_catalog=true --catalogd_args --catalog_topic_mode=minimal 
> --catalogd_args --hms_event_polling_interval_s=1
> 3. Run the modified stress test: $IMPALA_HOME/bin/impala-py.test 
> $IMPALA_TESTS/stress/test_acid_stress.py
> Error reported:
> Main branch failed on test test_failing_inserts
> [11:48:38 qchen@qifan-10229: Impala.03112022] test_acid_stress
> rootLoggerLevel = INFO
> == test session starts 
> ===
> platform linux2 -- Python 2.7.16, pytest-2.9.2, py-1.4.32, pluggy-0.3.1 -- 
> /home/qchen/Impala.03112022/infra/python/env-gcc7.5.0/bin/python
> cachedir: tests/.cache
> rootdir: /home/qchen/Impala.03112022/tests, inifile: pytest.ini
> plugins: xdist-1.17.1, timeout-1.2.1, random-0.2, forked-0.2
> timeout: 7200s method: signal
> collected 4 items 
> tests/stress/test_acid_stress.py::TestAcidInsertsBasic::test_read_hive_inserts
>  PASSED
> tests/stress/test_acid_stress.py::TestAcidInsertsBasic::test_read_impala_inserts
>  PASSED
> tests/stress/test_acid_stress.py::TestAcidInsertsBasic::test_partitioned_inserts[unique_database0]
>  PASSED
> tests/stress/test_acid_stress.py::TestFailingAcidInserts::test_failing_inserts[unique_database0]
>  FAILED
>  short test summary info 
> =
> FAIL 
> tests/stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0]
>  FAILURES 
> 
> _ 
> TestFailingAcidInserts.test_failing_inserts[unique_database0] 
> __
> tests/stress/test_acid_stress.py:387: in test_failing_inserts
> self._run_test_failing_inserts(unique_database, is_partitioned)
> tests/stress/test_acid_stress.py:376: in _run_test_failing_inserts
> run_tasks(writers + checkers)
> tests/stress/stress_util.py:45: in run_tasks
> pool.map_async(Task.run, tasks).get(timeout_seconds)
> ../Impala.03082022/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:572:
>  in get
> raise self._value
> E   assert 1 == 0
> E+  where 1 = len(['-1'])
> E+where ['-1'] =  object at 0x7f62144b6890>.data
> - Captured stderr setup 
> --
> SET 
> client_identifier=stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0];
> -- connecting to: localhost:21000
> -- connecting to localhost:21050 with impyla
> -- 2022-03-16 12:03:05,065 INFO MainThread: Closing active operation
> -- connecting to localhost:28000 with impyla
> -- 2022-03-16 12:03:05,077 INFO MainThread: Closing active operation
> SET 
> client_identifier=stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0];
> SET sync_ddl=True;
> -- executing against localhost:21000
> DROP DATABASE IF EXISTS `test_failing_inserts_b980fc6` CASCADE;
> -- 2022-03-16 12:03:05,084 INFO MainThread: Started query 
> ee4cd7bba1374e44:5133f3bb
> SET 
> client_identifier=stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0];
> SET sync_ddl=True;
> -- executing against localhost:21000
> CREATE DATABASE `test_failing_inserts_b980fc6`;
> -- 2022-03-16 12:03:05,179 INFO MainThread: Started query 
> d949fdad1d1e9d19:430dd1ca
> -- 2022-03-16 12:03:11,071 INFO MainThread: Created database 
> "test_failing_inserts_b980fc6" for test ID 
> 

[jira] [Created] (IMPALA-11191) Test insert failing ACID tests are broken in local catalog mode

2022-03-16 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-11191:
---

 Summary: Test insert failing ACID tests are broken in local 
catalog mode
 Key: IMPALA-11191
 URL: https://issues.apache.org/jira/browse/IMPALA-11191
 Project: IMPALA
  Issue Type: Bug
Reporter: Qifan Chen


This test can fail in local catalog mode with rows of value -1 being returned,
which should not happen.

To reproduce, first comment out the test test_concurrent_inserts, as it can
fail as reported in IMPALA-11189.

Setup:

1. Build Impala and reformat the metastore in case HMS is in a bad state:
$IMPALA_HOME/buildall.sh -format_metastore -notests
2. Start the cluster in local catalog mode: 
$IMPALA_HOME/bin/start-impala-cluster.py --impalad_args 
--use_local_catalog=true --catalogd_args --catalog_topic_mode=minimal 
--catalogd_args --hms_event_polling_interval_s=1
3. Run the modified stress test: $IMPALA_HOME/bin/impala-py.test 
$IMPALA_TESTS/stress/test_acid_stress.py

Error reported:


{code:java}
Main branch failed on test test_failing_inserts

[11:48:38 qchen@qifan-10229: Impala.03112022] test_acid_stress
rootLoggerLevel = INFO
== test session starts 
===
platform linux2 -- Python 2.7.16, pytest-2.9.2, py-1.4.32, pluggy-0.3.1 -- 
/home/qchen/Impala.03112022/infra/python/env-gcc7.5.0/bin/python
cachedir: tests/.cache
rootdir: /home/qchen/Impala.03112022/tests, inifile: pytest.ini
plugins: xdist-1.17.1, timeout-1.2.1, random-0.2, forked-0.2
timeout: 7200s method: signal
collected 4 items 

tests/stress/test_acid_stress.py::TestAcidInsertsBasic::test_read_hive_inserts 
PASSED
tests/stress/test_acid_stress.py::TestAcidInsertsBasic::test_read_impala_inserts
 PASSED
tests/stress/test_acid_stress.py::TestAcidInsertsBasic::test_partitioned_inserts[unique_database0]
 PASSED
tests/stress/test_acid_stress.py::TestFailingAcidInserts::test_failing_inserts[unique_database0]
 FAILED
 short test summary info 
=
FAIL 
tests/stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0]

 FAILURES 

_ 
TestFailingAcidInserts.test_failing_inserts[unique_database0] 
__
tests/stress/test_acid_stress.py:387: in test_failing_inserts
self._run_test_failing_inserts(unique_database, is_partitioned)
tests/stress/test_acid_stress.py:376: in _run_test_failing_inserts
run_tasks(writers + checkers)
tests/stress/stress_util.py:45: in run_tasks
pool.map_async(Task.run, tasks).get(timeout_seconds)
../Impala.03082022/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:572:
 in get
raise self._value
E   assert 1 == 0
E+  where 1 = len(['-1'])
E+where ['-1'] = .data
- Captured stderr setup 
--
SET 
client_identifier=stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0];
-- connecting to: localhost:21000
-- connecting to localhost:21050 with impyla
-- 2022-03-16 12:03:05,065 INFO MainThread: Closing active operation
-- connecting to localhost:28000 with impyla
-- 2022-03-16 12:03:05,077 INFO MainThread: Closing active operation
SET 
client_identifier=stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0];
SET sync_ddl=True;
-- executing against localhost:21000

DROP DATABASE IF EXISTS `test_failing_inserts_b980fc6` CASCADE;

-- 2022-03-16 12:03:05,084 INFO MainThread: Started query 
ee4cd7bba1374e44:5133f3bb
SET 
client_identifier=stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0];
SET sync_ddl=True;
-- executing against localhost:21000

CREATE DATABASE `test_failing_inserts_b980fc6`;

-- 2022-03-16 12:03:05,179 INFO MainThread: Started query 
d949fdad1d1e9d19:430dd1ca
-- 2022-03-16 12:03:11,071 INFO MainThread: Created database 
"test_failing_inserts_b980fc6" for test ID 
"stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0]"
-- Captured stderr call 
--
SET SYNC_DDL=true;
-- executing against localhost:21000

drop table if exists test_failing_inserts_b980fc6.test_inserts_fail;

-- 2022-03-16 12:03:11,073 INFO MainThread: Started query 
1742bdbc8e07861b:6a973ebe
-- executing against localhost:21000

create table test_failing_inserts_b980fc6.test_inserts_fail (i int)  
TBLPROPERTIES (
'transactional_properties' = 'insert_only', 

[jira] [Created] (IMPALA-11190) Test failing insert tests are broken in local catalog mode

2022-03-16 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-11190:
---

 Summary: Test failing insert tests are broken in local catalog mode
 Key: IMPALA-11190
 URL: https://issues.apache.org/jira/browse/IMPALA-11190
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Reporter: Qifan Chen


Stress test test_failing_inserts (in tests/stress/test_acid_stress.py) fails
repeatedly in local catalog mode. The concurrent checker query (select * from
) can return rows of value -1, which should not happen.

This is reproducible by commenting out test TestFailingAcidInserts in 
tests/stress/test_acid_stress.py.

Setup:

1. Build Impala and reformat the metastore in case HMS is in a bad state:
$IMPALA_HOME/buildall.sh -format_metastore -notests
2. Start the cluster in local catalog mode: 
$IMPALA_HOME/bin/start-impala-cluster.py --impalad_args 
--use_local_catalog=true --catalogd_args --catalog_topic_mode=minimal 
--catalogd_args --hms_event_polling_interval_s=1
3. Run the modified stress test: $IMPALA_HOME/bin/impala-py.test 
$IMPALA_TESTS/stress/test_acid_stress.py

Error reported:

Main branch failed on test test_failing_inserts

[11:48:38 qchen@qifan-10229: Impala.03112022] test_acid_stress
rootLoggerLevel = INFO
== test session starts 
===
platform linux2 -- Python 2.7.16, pytest-2.9.2, py-1.4.32, pluggy-0.3.1 -- 
/home/qchen/Impala.03112022/infra/python/env-gcc7.5.0/bin/python
cachedir: tests/.cache
rootdir: /home/qchen/Impala.03112022/tests, inifile: pytest.ini
plugins: xdist-1.17.1, timeout-1.2.1, random-0.2, forked-0.2
timeout: 7200s method: signal
collected 4 items 

tests/stress/test_acid_stress.py::TestAcidInsertsBasic::test_read_hive_inserts 
PASSED
tests/stress/test_acid_stress.py::TestAcidInsertsBasic::test_read_impala_inserts
 PASSED
tests/stress/test_acid_stress.py::TestAcidInsertsBasic::test_partitioned_inserts[unique_database0]
 PASSED
tests/stress/test_acid_stress.py::TestFailingAcidInserts::test_failing_inserts[unique_database0]
 FAILED
 short test summary info 
=
FAIL 
tests/stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0]

 FAILURES 

_ 
TestFailingAcidInserts.test_failing_inserts[unique_database0] 
__
tests/stress/test_acid_stress.py:387: in test_failing_inserts
self._run_test_failing_inserts(unique_database, is_partitioned)
tests/stress/test_acid_stress.py:376: in _run_test_failing_inserts
run_tasks(writers + checkers)
tests/stress/stress_util.py:45: in run_tasks
pool.map_async(Task.run, tasks).get(timeout_seconds)
../Impala.03082022/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:572:
 in get
raise self._value
E   assert 1 == 0
E+  where 1 = len(['-1'])
E+where ['-1'] = .data
- Captured stderr setup 
--
SET 
client_identifier=stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0];
-- connecting to: localhost:21000
-- connecting to localhost:21050 with impyla
-- 2022-03-16 12:03:05,065 INFO MainThread: Closing active operation
-- connecting to localhost:28000 with impyla
-- 2022-03-16 12:03:05,077 INFO MainThread: Closing active operation
SET 
client_identifier=stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0];
SET sync_ddl=True;
-- executing against localhost:21000

DROP DATABASE IF EXISTS `test_failing_inserts_b980fc6` CASCADE;

-- 2022-03-16 12:03:05,084 INFO MainThread: Started query 
ee4cd7bba1374e44:5133f3bb
SET 
client_identifier=stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0];
SET sync_ddl=True;
-- executing against localhost:21000

CREATE DATABASE `test_failing_inserts_b980fc6`;

-- 2022-03-16 12:03:05,179 INFO MainThread: Started query 
d949fdad1d1e9d19:430dd1ca
-- 2022-03-16 12:03:11,071 INFO MainThread: Created database 
"test_failing_inserts_b980fc6" for test ID 
"stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0]"
-- Captured stderr call 
--
SET SYNC_DDL=true;
-- executing against localhost:21000

drop table if exists test_failing_inserts_b980fc6.test_inserts_fail;

-- 2022-03-16 12:03:11,073 INFO MainThread: Started query 
1742bdbc8e07861b:6a973ebe
-- executing against localhost:21000

create table 

[jira] [Created] (IMPALA-11189) Concurrent insert ACID tests are broken in local catalog mode

2022-03-16 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-11189:
---

 Summary: Concurrent insert ACID tests are broken in local catalog
mode
 Key: IMPALA-11189
 URL: https://issues.apache.org/jira/browse/IMPALA-11189
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Reporter: Qifan Chen


Stress test test_concurrent_inserts (in tests/stress/test_acid_stress.py) fails
repeatedly in local catalog mode. The concurrent checker query (select * from
) can return duplicated rows, as reported below, where the row [0,2] is
duplicated.

This can be reproduced quite easily by running the test (i.e.,
TestConcurrentAcidInserts) first, by commenting out all the tests that precede
it in tests/stress/test_acid_stress.py.

Setup:

1. Build Impala and reformat the metastore in case HMS is in a bad state:
$IMPALA_HOME/buildall.sh -format_metastore -notests
2. Start the cluster in local catalog mode: 
$IMPALA_HOME/bin/start-impala-cluster.py --impalad_args 
--use_local_catalog=true --catalogd_args  --catalog_topic_mode=minimal 
--catalogd_args --hms_event_polling_interval_s=1
3. Run the modified stress test: $IMPALA_HOME/bin/impala-py.test 
$IMPALA_TESTS/stress/test_acid_stress.py

Error reported:


{code:java}
[09:11:00 qchen@qifan-10229: Impala.03112022] test_acid_stress
rootLoggerLevel = INFO
== test session starts 
===
platform linux2 -- Python 2.7.16, pytest-2.9.2, py-1.4.32, pluggy-0.3.1 -- 
/home/qchen/Impala.03112022/infra/python/env-gcc7.5.0/bin/python
cachedir: tests/.cache
rootdir: /home/qchen/Impala.03112022/tests, inifile: pytest.ini
plugins: xdist-1.17.1, timeout-1.2.1, random-0.2, forked-0.2
timeout: 7200s method: signal
collected 2 items 

tests/stress/test_acid_stress.py::TestConcurrentAcidInserts::test_concurrent_inserts[unique_database0]
 FAILED
tests/stress/test_acid_stress.py::TestFailingAcidInserts::test_failing_inserts[unique_database0]
 PASSED
 short test summary info 
=
FAIL 
tests/stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0]

 FAILURES 

__ 
TestConcurrentAcidInserts.test_concurrent_inserts[unique_database0] 
___
tests/stress/test_acid_stress.py:307: in test_concurrent_inserts
run_tasks(writers + checkers)
tests/stress/stress_util.py:45: in run_tasks
pool.map_async(Task.run, tasks).get(timeout_seconds)
../Impala.03082022/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:572:
 in get
raise self._value
E   AssertionError: wid: 2
E   assert [0, 1, 2, 2, 3, 4] == [0, 1, 2, 3, 4]
E At index 3 diff: 2 != 3
E Left contains more items, first extra item: 4
E Full diff:
E - [0, 1, 2, 2, 3, 4]
E ?   ---
E + [0, 1, 2, 3, 4]
- Captured stderr setup 
--
SET 
client_identifier=stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0];
-- connecting to: localhost:21000
-- connecting to localhost:21050 with impyla
-- 2022-03-16 09:20:54,762 INFO MainThread: Closing active operation
-- connecting to localhost:28000 with impyla
-- 2022-03-16 09:20:54,774 INFO MainThread: Closing active operation
SET 
client_identifier=stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0];
SET sync_ddl=True;
-- executing against localhost:21000

DROP DATABASE IF EXISTS `test_concurrent_inserts_8933345c` CASCADE;

-- 2022-03-16 09:20:54,808 INFO MainThread: Started query 
28457f4c7e77cdec:c6d37319
SET 
client_identifier=stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0];
SET sync_ddl=True;
-- executing against localhost:21000

CREATE DATABASE `test_concurrent_inserts_8933345c`;

-- 2022-03-16 09:20:54,877 INFO MainThread: Started query 
374bf99aea680523:48d24054
-- 2022-03-16 09:21:01,164 INFO MainThread: Created database 
"test_concurrent_inserts_8933345c" for test ID 
"stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0]"
-- Captured stderr call 
--
SET SYNC_DDL=true;
-- executing against localhost:21000

drop table if exists test_concurrent_inserts_8933345c.test_concurrent_inserts;

-- 2022-03-16 09:21:01,173 INFO MainThread: Started query 
20480c2a1d336d35:c2d84edd
-- executing against localhost:21000

create table 

[jira] [Created] (IMPALA-11163) To scan small dimension tables, the number of nodes selected by the FE can be reduced

2022-03-07 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-11163:
---

 Summary: To scan small dimension tables, the number of nodes
selected by the FE can be reduced
 Key: IMPALA-11163
 URL: https://issues.apache.org/jira/browse/IMPALA-11163
 Project: IMPALA
  Issue Type: Improvement
Reporter: Qifan Chen


In Impala, the FE determines the number of exec nodes to use for a scan based
on the number of local/remote nodes hosting data blocks. For example, for a
dimension table, assume it has 3 local nodes and 17 remote nodes; the number of
exec nodes for the scan is then 20, and the final value is min(20, number of
exec nodes in the cluster).

In the case of a partitioned join(f, d), where f is the fact table and d is the
dimension table, the number of network opens from the join to table d can be
made smaller (say 2 instead of 20), so the system can handle more concurrent
queries. A sketch of this selection follows.
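
A minimal sketch (illustrative names, not the actual planner code) of the
current selection and the proposed cap:

{code:java}
// Currently every host holding a local or remote block gets a scan node,
// capped only by the cluster size; the proposal adds a small cap for
// small dimension tables to limit network opens from the join.
public class ScanNodeSelection {
  static int currentScanNodes(int localHosts, int remoteHosts, int clusterExecNodes) {
    return Math.min(localHosts + remoteHosts, clusterExecNodes);  // min(3 + 17, 20) = 20
  }

  static int proposedScanNodes(int localHosts, int remoteHosts, int clusterExecNodes,
      int smallDimTableCap) {
    // e.g. min(20, 2) = 2
    return Math.min(currentScanNodes(localHosts, remoteHosts, clusterExecNodes),
        smallDimTableCap);
  }
}
{code}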





[jira] [Resolved] (IMPALA-10754) test_overlap_min_max_filters_on_sorted_columns failed during GVO

2022-02-24 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen resolved IMPALA-10754.
-
Resolution: Fixed

> test_overlap_min_max_filters_on_sorted_columns failed during GVO
> 
>
> Key: IMPALA-10754
> URL: https://issues.apache.org/jira/browse/IMPALA-10754
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Zoltán Borók-Nagy
>Assignee: Qifan Chen
>Priority: Major
>  Labels: broken-build
> Fix For: Impala 4.1.0
>
>
> test_overlap_min_max_filters_on_sorted_columns failed in the following build:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/4338/testReport/
> *Stack trace:*
> {noformat}
> query_test/test_runtime_filters.py:296: in 
> test_overlap_min_max_filters_on_sorted_columns
> test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS': str(WAIT_TIME_MS)})
> common/impala_test_suite.py:734: in run_test_case
> update_section=pytest.config.option.update_results)
> common/test_result_verifier.py:653: in verify_runtime_profile
> % (function, field, expected_value, actual_value, op, actual))
> E   AssertionError: Aggregation of SUM over NumRuntimeFilteredPages did not 
> match expected results.
> E   EXPECTED VALUE:
> E   58
> E   
> E   
> E   ACTUAL VALUE:
> E   59
> {noformat}





[jira] [Resolved] (IMPALA-11047) Preconditions.checkNotNull(statsTuple_) fail in HdfsScanNode.java if PARQUET_READ_STATISTICS=0

2022-02-24 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen resolved IMPALA-11047.
-
Target Version: Impala 4.0.0
Resolution: Fixed

> Preconditions.checkNotNull(statsTuple_) fail in HdfsScanNode.java if 
> PARQUET_READ_STATISTICS=0
> --
>
> Key: IMPALA-11047
> URL: https://issues.apache.org/jira/browse/IMPALA-11047
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0.0
>Reporter: Riza Suminto
>Assignee: Qifan Chen
>Priority: Major
>
> There is a conflict between HdfsScanNode.java and RuntimeFilterGenerator.java
> when initializing the overlap predicate.
> In HdfsScanNode.java, computeStatsTupleAndConjuncts, which initializes
> statsTuple_, is not called when PARQUET_READ_STATISTICS=0.
> [https://github.com/apache/impala/blob/9d61bc4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java#L409]
>  
> On the other hand, in RuntimeFilterGenerator.java, disable_overlap_filter is
> set to false without considering the value of PARQUET_READ_STATISTICS.
> [https://github.com/apache/impala/blob/9d61bc4/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java#L915]
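
A minimal sketch of the fix direction (simplified, not the actual patch): gate
the overlap filter on the same query option that guards statsTuple_
initialization, so the Preconditions check can never see a null tuple:

{code:java}
// Sketch only. In HdfsScanNode, statsTuple_ is initialized only when Parquet
// statistics are read; RuntimeFilterGenerator should disable the overlap
// filter under exactly the complementary condition.
public class OverlapFilterGuard {
  static boolean statsTupleInitialized(int parquetReadStatistics) {
    return parquetReadStatistics != 0;
  }

  static boolean disableOverlapFilter(int parquetReadStatistics) {
    return !statsTupleInitialized(parquetReadStatistics);
  }
}
{code}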





[jira] [Resolved] (IMPALA-11132) Front-end test PlannerTest.testResourceRequirements can fail

2022-02-24 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen resolved IMPALA-11132.
-
Target Version: Impala 4.0.0
Resolution: Fixed

> Front-end test PlannerTest.testResourceRequirements can fail
> 
>
> Key: IMPALA-11132
> URL: https://issues.apache.org/jira/browse/IMPALA-11132
> Project: IMPALA
>  Issue Type: Test
>Reporter: Qifan Chen
>Assignee: Qifan Chen
>Priority: Major
>
> The test miscalculates per-host memory requirements, apparently due to an 
> incorrect HBase cardinality estimate:
> {code:java}
> Section DISTRIBUTEDPLAN of query:
> select * from functional_hbase.alltypessmall
> Actual does not match expected result:
> Max Per-Host Resource Reservation: Memory=4.00MB Threads=2
> Per-Host Resource Estimates: Memory=10MB
> Codegen disabled by planner
> Analyzed query: SELECT * FROM functional_hbase.alltypessmall
> F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> |  Per-Host Resources: mem-estimate=5.08MB mem-reservation=4.00MB 
> thread-reservation=1
> ^^
> PLAN-ROOT SINK
> |  output exprs: functional_hbase.alltypessmall.id, 
> functional_hbase.alltypessmall.bigint_col, 
> functional_hbase.alltypessmall.bool_col, 
> functional_hbase.alltypessmall.date_string_col, 
> functional_hbase.alltypessmall.double_col, 
> functional_hbase.alltypessmall.float_col, 
> functional_hbase.alltypessmall.int_col, functional_hbase.alltypessmall.month, 
> functional_hbase.alltypessmall.smallint_col, 
> functional_hbase.alltypessmall.string_col, 
> functional_hbase.alltypessmall.timestamp_col, 
> functional_hbase.alltypessmall.tinyint_col, 
> functional_hbase.alltypessmall.year
> |  mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB 
> thread-reservation=0
> |
> 01:EXCHANGE [UNPARTITIONED]
> |  mem-estimate=1.08MB mem-reservation=0B thread-reservation=0
> |  tuple-ids=0 row-size=89B cardinality=28.57K
> |  in pipelines: 00(GETNEXT)
> |
> F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
> Per-Host Resources: mem-estimate=4.00KB mem-reservation=0B 
> thread-reservation=1
> 00:SCAN HBASE [functional_hbase.alltypessmall]
>stored statistics:
>  table: rows=100
>  columns: all
>mem-estimate=4.00KB mem-reservation=0B thread-reservation=0
>tuple-ids=0 row-size=89B cardinality=28.57K
>in pipelines: 00(GETNEXT)
> Expected:
> Max Per-Host Resource Reservation: Memory=4.00MB Threads=2
> Per-Host Resource Estimates: Memory=10MB
> Codegen disabled by planner
> Analyzed query: SELECT * FROM functional_hbase.alltypessmall
> F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> |  Per-Host Resources: mem-estimate=4.02MB mem-reservation=4.00MB 
> thread-reservation=1
> PLAN-ROOT SINK
> |  output exprs: functional_hbase.alltypessmall.id, 
> functional_hbase.alltypessmall.bigint_col, 
> functional_hbase.alltypessmall.bool_col, 
> functional_hbase.alltypessmall.date_string_col, 
> functional_hbase.alltypessmall.double_col, 
> functional_hbase.alltypessmall.float_col, 
> functional_hbase.alltypessmall.int_col, functional_hbase.alltypessmall.month, 
> functional_hbase.alltypessmall.smallint_col, 
> functional_hbase.alltypessmall.string_col, 
> functional_hbase.alltypessmall.timestamp_col, 
> functional_hbase.alltypessmall.tinyint_col, 
> functional_hbase.alltypessmall.year
> |  mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB 
> thread-reservation=0
> |
> 01:EXCHANGE [UNPARTITIONED]
> |  mem-estimate=16.00KB mem-reservation=0B thread-reservation=0
> |  tuple-ids=0 row-size=89B cardinality=50
> |  in pipelines: 00(GETNEXT)
> |
> F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
> Per-Host Resources: mem-estimate=4.00KB mem-reservation=0B 
> thread-reservation=1
> 00:SCAN HBASE [functional_hbase.alltypessmall]
>stored statistics:
>  table: rows=100
>  columns: all
>mem-estimate=4.00KB mem-reservation=0B thread-reservation=0
>tuple-ids=0 row-size=89B cardinality=50
>in pipelines: 00(GETNEXT)
> {code}





[jira] [Commented] (IMPALA-11132) Front-end test PlannerTest.testResourceRequirements can fail

2022-02-23 Thread Qifan Chen (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17496958#comment-17496958
 ] 

Qifan Chen commented on IMPALA-11132:
-

The estimated number of rows for an HBase table scan is not capped by the row 
count from HMS when that statistic is available. A sketch of the missing cap 
follows.
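
A minimal sketch of that cap, in Python with invented names (sampled_rows for 
the HBase key-sampling estimate, hms_row_count for the stats row count); the 
actual logic belongs in the Java planner's HBase scan node, so this is only an 
illustration of the idea, not Impala's code:

{code:python}
def capped_hbase_scan_cardinality(sampled_rows, hms_row_count):
    """Cap the HBase key-sampling estimate by the HMS row count.

    sampled_rows:  cardinality from HBase region sampling; in the failing
                   plan above this drifted to 28.57K.
    hms_row_count: table row count from HMS stats (rows=100 above), or
                   None when stats are unavailable.
    """
    if hms_row_count is not None and hms_row_count >= 0:
        return min(sampled_rows, hms_row_count)
    return sampled_rows

# With the numbers from the failing plan: the 28.57K estimate is capped
# at the 100 rows known to HMS, stabilizing downstream memory estimates.
assert capped_hbase_scan_cardinality(28570, 100) == 100
{code}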

> Front-end test PlannerTest.testResourceRequirements can fail
> 
>
> Key: IMPALA-11132
> URL: https://issues.apache.org/jira/browse/IMPALA-11132
> Project: IMPALA
>  Issue Type: Test
>Reporter: Qifan Chen
>Assignee: Qifan Chen
>Priority: Major
>
> The test miscalculates per-host memory requirements, apparently due to an 
> incorrect HBase cardinality estimate:
> {code:java}
> Section DISTRIBUTEDPLAN of query:
> select * from functional_hbase.alltypessmall
> Actual does not match expected result:
> Max Per-Host Resource Reservation: Memory=4.00MB Threads=2
> Per-Host Resource Estimates: Memory=10MB
> Codegen disabled by planner
> Analyzed query: SELECT * FROM functional_hbase.alltypessmall
> F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> |  Per-Host Resources: mem-estimate=5.08MB mem-reservation=4.00MB 
> thread-reservation=1
> ^^
> PLAN-ROOT SINK
> |  output exprs: functional_hbase.alltypessmall.id, 
> functional_hbase.alltypessmall.bigint_col, 
> functional_hbase.alltypessmall.bool_col, 
> functional_hbase.alltypessmall.date_string_col, 
> functional_hbase.alltypessmall.double_col, 
> functional_hbase.alltypessmall.float_col, 
> functional_hbase.alltypessmall.int_col, functional_hbase.alltypessmall.month, 
> functional_hbase.alltypessmall.smallint_col, 
> functional_hbase.alltypessmall.string_col, 
> functional_hbase.alltypessmall.timestamp_col, 
> functional_hbase.alltypessmall.tinyint_col, 
> functional_hbase.alltypessmall.year
> |  mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB 
> thread-reservation=0
> |
> 01:EXCHANGE [UNPARTITIONED]
> |  mem-estimate=1.08MB mem-reservation=0B thread-reservation=0
> |  tuple-ids=0 row-size=89B cardinality=28.57K
> |  in pipelines: 00(GETNEXT)
> |
> F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
> Per-Host Resources: mem-estimate=4.00KB mem-reservation=0B 
> thread-reservation=1
> 00:SCAN HBASE [functional_hbase.alltypessmall]
>stored statistics:
>  table: rows=100
>  columns: all
>mem-estimate=4.00KB mem-reservation=0B thread-reservation=0
>tuple-ids=0 row-size=89B cardinality=28.57K
>in pipelines: 00(GETNEXT)
> Expected:
> Max Per-Host Resource Reservation: Memory=4.00MB Threads=2
> Per-Host Resource Estimates: Memory=10MB
> Codegen disabled by planner
> Analyzed query: SELECT * FROM functional_hbase.alltypessmall
> F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> |  Per-Host Resources: mem-estimate=4.02MB mem-reservation=4.00MB 
> thread-reservation=1
> PLAN-ROOT SINK
> |  output exprs: functional_hbase.alltypessmall.id, 
> functional_hbase.alltypessmall.bigint_col, 
> functional_hbase.alltypessmall.bool_col, 
> functional_hbase.alltypessmall.date_string_col, 
> functional_hbase.alltypessmall.double_col, 
> functional_hbase.alltypessmall.float_col, 
> functional_hbase.alltypessmall.int_col, functional_hbase.alltypessmall.month, 
> functional_hbase.alltypessmall.smallint_col, 
> functional_hbase.alltypessmall.string_col, 
> functional_hbase.alltypessmall.timestamp_col, 
> functional_hbase.alltypessmall.tinyint_col, 
> functional_hbase.alltypessmall.year
> |  mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB 
> thread-reservation=0
> |
> 01:EXCHANGE [UNPARTITIONED]
> |  mem-estimate=16.00KB mem-reservation=0B thread-reservation=0
> |  tuple-ids=0 row-size=89B cardinality=50
> |  in pipelines: 00(GETNEXT)
> |
> F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
> Per-Host Resources: mem-estimate=4.00KB mem-reservation=0B 
> thread-reservation=1
> 00:SCAN HBASE [functional_hbase.alltypessmall]
>stored statistics:
>  table: rows=100
>  columns: all
>mem-estimate=4.00KB mem-reservation=0B thread-reservation=0
>tuple-ids=0 row-size=89B cardinality=50
>in pipelines: 00(GETNEXT)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-11146) Specific string: "Query aborted:Debug Action: FIS_FAIL_HDFS_TABLE_SINK_FLUSH_FINAL:FAIL@1.0" is missing

2022-02-22 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-11146:
---

 Summary: Specific string: "Query aborted:Debug Action: 
FIS_FAIL_HDFS_TABLE_SINK_FLUSH_FINAL:FAIL@1.0" is missing
 Key: IMPALA-11146
 URL: https://issues.apache.org/jira/browse/IMPALA-11146
 Project: IMPALA
  Issue Type: Test
Reporter: Qifan Chen


In some of the tests, the following string is missing:


{code:java}
Query aborted:Debug Action: FIS_FAIL_HDFS_TABLE_SINK_FLUSH_FINAL:FAIL@1.0
{code}

This is seen in quite a number of test flavors: exhaustive-release, core-asan, 
core-ubsan, and core-s3.

{code:java}
Stacktrace
query_test/test_insert.py:168: in test_acid_insert_fail
multiple_impalad=vector.get_value('exec_option')['sync_ddl'] == 1)
common/impala_test_suite.py:732: in run_test_case
self.__verify_exceptions(test_section['CATCH'], str(e), use_db)
common/impala_test_suite.py:537: in __verify_exceptions
(expected_str, actual_str)
E   AssertionError: Unexpected exception string. Expected: Query aborted:Debug 
Action: FIS_FAIL_HDFS_TABLE_SINK_FLUSH_FINAL:FAIL@1.0
E   Not found in actual: ImpalaBeeswaxException: INNER EXCEPTION:  MESSAGE: ParseException: Syntax error in 
line 1:...DFS_TABLE_SINK_FLUSH_FINAL:FAIL@1.0 
^Encountered: :ExpectedCAUSED BY: Exception: Syntax error
Standard Error
SET 
client_identifier=query_test/test_insert.py::TestInsertQueries::()::test_acid_insert_fail[compression_codec:none|protocol:beeswax|exec_option:{'sync_ddl':0;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_singl;
SET sync_ddl=True;
-- executing against localhost:21000

DROP DATABASE IF EXISTS `test_acid_insert_fail_5388d22e` CASCADE;

-- 2022-02-15 08:57:34,223 INFO MainThread: Started query 
c7422e292e8aeaf8:e18e9fcb
SET 
client_identifier=query_test/test_insert.py::TestInsertQueries::()::test_acid_insert_fail[compression_codec:none|protocol:beeswax|exec_option:{'sync_ddl':0;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_singl;
SET sync_ddl=True;
-- executing against localhost:21000

CREATE DATABASE `test_acid_insert_fail_5388d22e`;

-- 2022-02-15 08:57:40,220 INFO MainThread: Started query 
6f4315ed0ab15d9e:a0d14b2d
-- 2022-02-15 08:57:46,231 INFO MainThread: Created database 
"test_acid_insert_fail_5388d22e" for test ID 
"query_test/test_insert.py::TestInsertQueries::()::test_acid_insert_fail[compression_codec:
 none | protocol: beeswax | exec_option: {'sync_ddl': 0, 'batch_size': 0, 
'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
text/none-unique_database0]"
SET 
client_identifier=query_test/test_insert.py::TestInsertQueries::()::test_acid_insert_fail[compression_codec:none|protocol:beeswax|exec_option:{'sync_ddl':0;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_singl;
-- executing against localhost:21000

use test_acid_insert_fail_5388d22e;

-- 2022-02-15 08:57:46,235 INFO MainThread: Started query 
7f4f6a493e47671e:9aacd50e
SET 
client_identifier=query_test/test_insert.py::TestInsertQueries::()::test_acid_insert_fail[compression_codec:none|protocol:beeswax|exec_option:{'sync_ddl':0;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_singl;
SET sync_ddl=0;
SET batch_size=0;
SET num_nodes=0;
SET disable_codegen_rows_threshold=0;
SET disable_codegen=True;
SET abort_on_error=1;
SET exec_single_node_rows_threshold=0;
-- 2022-02-15 08:57:46,237 INFO MainThread: Loading query test file: 
/data/jenkins/workspace/impala-cdw-master-exhaustive-release/repos/Impala/testdata/workloads/functional-query/queries/QueryTest/acid-insert-fail.test
-- executing against localhost:21000

create table insertonly_acid (i int)
  tblproperties('transactional'='true', 
'transactional_properties'='insert_only');

-- 2022-02-15 08:57:48,587 INFO MainThread: Started query 
6445bb56ec7a5801:df8df55a
-- executing against localhost:21000


insert into insertonly_acid values (1), (2);

-- 2022-02-15 08:57:54,252 INFO MainThread: Started query 
51451860e01e6ff3:c29b68c2
-- executing against localhost:21000


select * from insertonly_acid;

-- 2022-02-15 08:57:54,357 INFO MainThread: Started query 
5b42253f96ee1ce7:71267b47
-- executing against localhost:21000

set DEBUG_ACTION=FIS_FAIL_HDFS_TABLE_SINK_FLUSH_FINAL:FAIL@1.0;

-- executing against localhost:21000

SET DEBUG_ACTION="";

-- 2022-02-15 08:57:54,420 INFO MainThread: Started query 
044c2e00732016c8:5cf949a3
{code}
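
Judging from the ParseException and the "^Encountered: :" marker in the log, 
the parser appears to reject the unquoted colon in the SET value rather than 
the debug action itself. For reference, the DEBUG_ACTION values seen in this 
digest follow a LABEL:ACTION@PARAM shape; a tiny illustrative parser below 
reflects my reading of the two examples here, not Impala's actual grammar:

{code:python}
def parse_debug_action(value):
    """Split a DEBUG_ACTION value such as
    'FIS_FAIL_HDFS_TABLE_SINK_FLUSH_FINAL:FAIL@1.0' into its parts.
    This mirrors the examples above; Impala's real grammar may differ."""
    label, _, action_spec = value.partition(":")
    action, _, param = action_spec.partition("@")
    return label, action, param

print(parse_debug_action("FIS_FAIL_HDFS_TABLE_SINK_FLUSH_FINAL:FAIL@1.0"))
# -> ('FIS_FAIL_HDFS_TABLE_SINK_FLUSH_FINAL', 'FAIL', '1.0')
{code}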




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (IMPALA-11132) Front-end test PlannerTest.testResourceRequirements can fail

2022-02-17 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen reassigned IMPALA-11132:
---

Assignee: Qifan Chen

> Front-end test PlannerTest.testResourceRequirements can fail
> 
>
> Key: IMPALA-11132
> URL: https://issues.apache.org/jira/browse/IMPALA-11132
> Project: IMPALA
>  Issue Type: Test
>Reporter: Qifan Chen
>Assignee: Qifan Chen
>Priority: Major
>
> The test miscalculates per-host memory requirements, apparently due to an 
> incorrect HBase cardinality estimate:
> {code:java}
> Section DISTRIBUTEDPLAN of query:
> select * from functional_hbase.alltypessmall
> Actual does not match expected result:
> Max Per-Host Resource Reservation: Memory=4.00MB Threads=2
> Per-Host Resource Estimates: Memory=10MB
> Codegen disabled by planner
> Analyzed query: SELECT * FROM functional_hbase.alltypessmall
> F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> |  Per-Host Resources: mem-estimate=5.08MB mem-reservation=4.00MB 
> thread-reservation=1
> ^^
> PLAN-ROOT SINK
> |  output exprs: functional_hbase.alltypessmall.id, 
> functional_hbase.alltypessmall.bigint_col, 
> functional_hbase.alltypessmall.bool_col, 
> functional_hbase.alltypessmall.date_string_col, 
> functional_hbase.alltypessmall.double_col, 
> functional_hbase.alltypessmall.float_col, 
> functional_hbase.alltypessmall.int_col, functional_hbase.alltypessmall.month, 
> functional_hbase.alltypessmall.smallint_col, 
> functional_hbase.alltypessmall.string_col, 
> functional_hbase.alltypessmall.timestamp_col, 
> functional_hbase.alltypessmall.tinyint_col, 
> functional_hbase.alltypessmall.year
> |  mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB 
> thread-reservation=0
> |
> 01:EXCHANGE [UNPARTITIONED]
> |  mem-estimate=1.08MB mem-reservation=0B thread-reservation=0
> |  tuple-ids=0 row-size=89B cardinality=28.57K
> |  in pipelines: 00(GETNEXT)
> |
> F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
> Per-Host Resources: mem-estimate=4.00KB mem-reservation=0B 
> thread-reservation=1
> 00:SCAN HBASE [functional_hbase.alltypessmall]
>stored statistics:
>  table: rows=100
>  columns: all
>mem-estimate=4.00KB mem-reservation=0B thread-reservation=0
>tuple-ids=0 row-size=89B cardinality=28.57K
>in pipelines: 00(GETNEXT)
> Expected:
> Max Per-Host Resource Reservation: Memory=4.00MB Threads=2
> Per-Host Resource Estimates: Memory=10MB
> Codegen disabled by planner
> Analyzed query: SELECT * FROM functional_hbase.alltypessmall
> F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> |  Per-Host Resources: mem-estimate=4.02MB mem-reservation=4.00MB 
> thread-reservation=1
> PLAN-ROOT SINK
> |  output exprs: functional_hbase.alltypessmall.id, 
> functional_hbase.alltypessmall.bigint_col, 
> functional_hbase.alltypessmall.bool_col, 
> functional_hbase.alltypessmall.date_string_col, 
> functional_hbase.alltypessmall.double_col, 
> functional_hbase.alltypessmall.float_col, 
> functional_hbase.alltypessmall.int_col, functional_hbase.alltypessmall.month, 
> functional_hbase.alltypessmall.smallint_col, 
> functional_hbase.alltypessmall.string_col, 
> functional_hbase.alltypessmall.timestamp_col, 
> functional_hbase.alltypessmall.tinyint_col, 
> functional_hbase.alltypessmall.year
> |  mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB 
> thread-reservation=0
> |
> 01:EXCHANGE [UNPARTITIONED]
> |  mem-estimate=16.00KB mem-reservation=0B thread-reservation=0
> |  tuple-ids=0 row-size=89B cardinality=50
> |  in pipelines: 00(GETNEXT)
> |
> F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
> Per-Host Resources: mem-estimate=4.00KB mem-reservation=0B 
> thread-reservation=1
> 00:SCAN HBASE [functional_hbase.alltypessmall]
>stored statistics:
>  table: rows=100
>  columns: all
>mem-estimate=4.00KB mem-reservation=0B thread-reservation=0
>tuple-ids=0 row-size=89B cardinality=50
>in pipelines: 00(GETNEXT)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-11132) Front-end test PlannerTest.testResourceRequirements can fail

2022-02-17 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-11132:
---

 Summary: Front-end test PlannerTest.testResourceRequirements can 
fail
 Key: IMPALA-11132
 URL: https://issues.apache.org/jira/browse/IMPALA-11132
 Project: IMPALA
  Issue Type: Test
Reporter: Qifan Chen


The test miscalculates per-host memory requirements, apparently due to an 
incorrect HBase cardinality estimate:


{code:java}
Section DISTRIBUTEDPLAN of query:
select * from functional_hbase.alltypessmall

Actual does not match expected result:
Max Per-Host Resource Reservation: Memory=4.00MB Threads=2
Per-Host Resource Estimates: Memory=10MB
Codegen disabled by planner
Analyzed query: SELECT * FROM functional_hbase.alltypessmall

F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=5.08MB mem-reservation=4.00MB 
thread-reservation=1
^^
PLAN-ROOT SINK
|  output exprs: functional_hbase.alltypessmall.id, 
functional_hbase.alltypessmall.bigint_col, 
functional_hbase.alltypessmall.bool_col, 
functional_hbase.alltypessmall.date_string_col, 
functional_hbase.alltypessmall.double_col, 
functional_hbase.alltypessmall.float_col, 
functional_hbase.alltypessmall.int_col, functional_hbase.alltypessmall.month, 
functional_hbase.alltypessmall.smallint_col, 
functional_hbase.alltypessmall.string_col, 
functional_hbase.alltypessmall.timestamp_col, 
functional_hbase.alltypessmall.tinyint_col, functional_hbase.alltypessmall.year
|  mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB 
thread-reservation=0
|
01:EXCHANGE [UNPARTITIONED]
|  mem-estimate=1.08MB mem-reservation=0B thread-reservation=0
|  tuple-ids=0 row-size=89B cardinality=28.57K
|  in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
Per-Host Resources: mem-estimate=4.00KB mem-reservation=0B thread-reservation=1
00:SCAN HBASE [functional_hbase.alltypessmall]
   stored statistics:
 table: rows=100
 columns: all
   mem-estimate=4.00KB mem-reservation=0B thread-reservation=0
   tuple-ids=0 row-size=89B cardinality=28.57K
   in pipelines: 00(GETNEXT)

Expected:
Max Per-Host Resource Reservation: Memory=4.00MB Threads=2
Per-Host Resource Estimates: Memory=10MB
Codegen disabled by planner
Analyzed query: SELECT * FROM functional_hbase.alltypessmall

F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=4.02MB mem-reservation=4.00MB 
thread-reservation=1
PLAN-ROOT SINK
|  output exprs: functional_hbase.alltypessmall.id, 
functional_hbase.alltypessmall.bigint_col, 
functional_hbase.alltypessmall.bool_col, 
functional_hbase.alltypessmall.date_string_col, 
functional_hbase.alltypessmall.double_col, 
functional_hbase.alltypessmall.float_col, 
functional_hbase.alltypessmall.int_col, functional_hbase.alltypessmall.month, 
functional_hbase.alltypessmall.smallint_col, 
functional_hbase.alltypessmall.string_col, 
functional_hbase.alltypessmall.timestamp_col, 
functional_hbase.alltypessmall.tinyint_col, functional_hbase.alltypessmall.year
|  mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB 
thread-reservation=0
|
01:EXCHANGE [UNPARTITIONED]
|  mem-estimate=16.00KB mem-reservation=0B thread-reservation=0
|  tuple-ids=0 row-size=89B cardinality=50
|  in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
Per-Host Resources: mem-estimate=4.00KB mem-reservation=0B thread-reservation=1
00:SCAN HBASE [functional_hbase.alltypessmall]
   stored statistics:
 table: rows=100
 columns: all
   mem-estimate=4.00KB mem-reservation=0B thread-reservation=0
   tuple-ids=0 row-size=89B cardinality=50
   in pipelines: 00(GETNEXT)
{code}





--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-11122) Handle false "Corrupted stats" warnings

2022-02-14 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-11122:
---

 Summary: Handle false "Corrupted stats" warnings
 Key: IMPALA-11122
 URL: https://issues.apache.org/jira/browse/IMPALA-11122
 Project: IMPALA
  Issue Type: Improvement
Reporter: Qifan Chen


In at least one Parquet implementation, the writer can produce Parquet files 
whose row group contains 0 rows. The file still contains footer metadata, so 
its size is not 0 bytes.

In the SHOW TABLE STATS report, a row can therefore show 0 in the #Rows column 
while the Size column shows a positive number of bytes. 
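
A small demonstration of the scenario, assuming pyarrow is available (an 
assumption; any Parquet writer that emits footer metadata would show the same 
thing):

{code:python}
import os
import pyarrow as pa
import pyarrow.parquet as pq

# Write a Parquet file that holds zero rows: the footer metadata is
# still written, so the file size on disk is positive.
empty = pa.table({"i": pa.array([], type=pa.int32())})
path = "/tmp/zero_rows.parquet"
pq.write_table(empty, path)

meta = pq.read_metadata(path)
size = os.path.getsize(path)

# 0 rows with a positive byte count is legitimate, not corrupted stats.
assert meta.num_rows == 0 and size > 0
print(meta.num_rows, size)
{code}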



--
This message was sent by Atlassian Jira
(v8.20.1#820001)



[jira] [Created] (IMPALA-11090) Need a method to specify the current number of executors for an executor group

2022-01-26 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-11090:
---

 Summary: Need a method to specify the current number of executors 
for an executor group
 Key: IMPALA-11090
 URL: https://issues.apache.org/jira/browse/IMPALA-11090
 Project: IMPALA
  Issue Type: Improvement
Reporter: Qifan Chen


Currently, impalad accepts -num_expected_executors for the expected number of 
executors in a group. This flag is very useful for testing the auto-scaling 
feature (IMPALA-10992).

It would be very useful to also accept the current number of executors as a 
new impalad parameter.

Both the current number of executors and the expected number of executors are 
important parameters for an executor group. 




--
This message was sent by Atlassian Jira
(v8.20.1#820001)



[jira] [Resolved] (IMPALA-11006) Impalad crashes during query cancel tests

2021-11-17 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen resolved IMPALA-11006.
-
Fix Version/s: Impala 4.0.1
   Resolution: Fixed

> Impalad crashes during query cancel tests
> -
>
> Key: IMPALA-11006
> URL: https://issues.apache.org/jira/browse/IMPALA-11006
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Qifan Chen
>Assignee: Qifan Chen
>Priority: Major
> Fix For: Impala 4.0.1
>
>
> The following stack trace was observed in a core generated during an S3 build.
> {quote}Thread 485 (crashed)
>  0  libc-2.17.so + 0x351f7
> rax = 0x   rdx = 0x0006
> rcx = 0x   rbx = 0x0004
> rsi = 0x6698   rdi = 0x144f
> rbp = 0x7f89466b5220   rsp = 0x7f89466b4ea8
>  r8 = 0xr9 = 0x7f89466b4d20
> r10 = 0x0008   r11 = 0x0202
> r12 = 0x07ea0100   r13 = 0x005a
> r14 = 0x07ea0104   r15 = 0x07e98720
> rip = 0x7f8a65f091f7
> Found by: given as instruction pointer in context
>  1  impalad!google::LogMessage::Flush() + 0x1eb
> rbp = 0x7f89466b5350   rsp = 0x7f89466b5230
> rip = 0x056aa19b
> Found by: previous frame's frame pointer
>  2  impalad!google::LogMessageFatal::~LogMessageFatal() + 0x9
> rbx = 0x0001   rbp = 0x7f89466b53f0
> rsp = 0x7f89466b52d0   r12 = 0x07ea7638
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x056add99
> Found by: call frame info
>  3  impalad!impala::ClientRequestState::SetCreateTableAsSelectResultSet() 
> [client-request-state.cc : 1540 + 0xf]
> rbx = 0x0001   rbp = 0x7f89466b53f0
> rsp = 0x7f89466b52e0   r12 = 0x07ea7638
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0286fef1
> Found by: call frame info
>  4  impalad!impala::ClientRequestState::WaitInternal() 
> [client-request-state.cc : 1083 + 0xf]
> rbx = 0x0001   rbp = 0x7f89466b5550
> rsp = 0x7f89466b5400   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0286ab3b
> Found by: call frame info
>  5  impalad!impala::ClientRequestState::Wait() [client-request-state.cc : 
> 1010 + 0x19]
> rbx = 0x7f89466b5b28   rbp = 0x7f89466b5640
> rsp = 0x7f89466b5560   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0286a030
> Found by: call frame info
>  6  impalad!boost::_mfi::mf0 impala::ClientRequestState>::operator()(impala::ClientRequestState*) const 
> [mem_fn_template.hpp : 49 + 0x5]
> rbx = 0x7f89466b5b28   rbp = 0x7f89466b5660
> rsp = 0x7f89466b5650   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0287faed
> Found by: call frame info
>  7  impalad!void 
> boost::_bi::list1 
> >::operator(), 
> boost::_bi::list0>(boost::_bi::type, boost::_mfi::mf0 impala::ClientRequestState>&, boost::_bi::list0&, int) [bind.hpp : 259 + 0x35]
> rbx = 0x7f89466b5b28   rbp = 0x7f89466b56a0
> rsp = 0x7f89466b5670   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0287f259
> Found by: call frame info
>  8  impalad!boost::_bi::bind_t impala::ClientRequestState>, 
> boost::_bi::list1 > 
> >::operator()() [bind.hpp : 1222 + 0x22]
> rbx = 0x6698   rbp = 0x7f89466b56f0
> rsp = 0x7f89466b56b0   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0287e2f3
> Found by: call frame info
>  9  
> impalad!boost::detail::function::void_function_obj_invoker0  boost::_mfi::mf0, 
> boost::_bi::list1 > >, 
> void>::invoke(boost::detail::function::function_buffer&) 
> [function_template.hpp : 159 + 0xc]
> rbx = 0x6698   rbp = 0x7f89466b5720
> rsp = 0x7f89466b5700   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0287d07e
> Found by: call frame info
> 10  impalad!boost::function0::operator()() const [function_template.hpp 
> : 770 + 0x1d]
> rbx = 0x6698   rbp = 0x7f89466b5760
> rsp = 0x7f89466b5730   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 


[jira] [Commented] (IMPALA-11006) Impalad crashes during query cancel tests

2021-11-17 Thread Qifan Chen (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17445375#comment-17445375
 ] 

Qifan Chen commented on IMPALA-11006:
-

To reproduce, run the following script and hit Ctrl-C after the message 
showing the query monitoring URL is displayed.

{code:sql}
drop table if exists ctas_cancel;
set debug_action=CRS_DELAY_BEFORE_CATALOG_OP_EXEC:SLEEP@1;

create table ctas_cancel primary key (l_orderkey, l_partkey, l_suppkey,
l_linenumber) partition by hash partitions 3
stored as kudu
as select * from tpch_kudu.lineitem order by l_orderkey;
{code}
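
The fatal frame is a DCHECK reached in SetCreateTableAsSelectResultSet() on 
the cancel path. A minimal sketch, in Python with invented names (the real 
code is C++ in client-request-state.cc), of the kind of guard that returns 
early instead of asserting on state already torn down by cancellation:

{code:python}
class ClientRequestState:
    """Toy stand-in for the C++ class; only the cancel-path guard matters."""

    def __init__(self):
        self.cancelled = False
        self.dml_exec_state = None  # released when the query is cancelled

    def set_ctas_result_set(self):
        # Guard instead of DCHECK-ing: a cancelled CTAS may have already
        # released its DML execution state, so build no result set.
        if self.cancelled or self.dml_exec_state is None:
            return None
        return {"rows_inserted": self.dml_exec_state["rows"]}

state = ClientRequestState()
state.cancelled = True
assert state.set_ctas_result_set() is None  # early return, no crash
{code}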

> Impalad crashes during query cancel tests
> -
>
> Key: IMPALA-11006
> URL: https://issues.apache.org/jira/browse/IMPALA-11006
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Qifan Chen
>Assignee: Qifan Chen
>Priority: Major
>
> The following stack trace was observed in a core generated during an S3 build.
> {quote}Thread 485 (crashed)
>  0  libc-2.17.so + 0x351f7
> rax = 0x   rdx = 0x0006
> rcx = 0x   rbx = 0x0004
> rsi = 0x6698   rdi = 0x144f
> rbp = 0x7f89466b5220   rsp = 0x7f89466b4ea8
>  r8 = 0xr9 = 0x7f89466b4d20
> r10 = 0x0008   r11 = 0x0202
> r12 = 0x07ea0100   r13 = 0x005a
> r14 = 0x07ea0104   r15 = 0x07e98720
> rip = 0x7f8a65f091f7
> Found by: given as instruction pointer in context
>  1  impalad!google::LogMessage::Flush() + 0x1eb
> rbp = 0x7f89466b5350   rsp = 0x7f89466b5230
> rip = 0x056aa19b
> Found by: previous frame's frame pointer
>  2  impalad!google::LogMessageFatal::~LogMessageFatal() + 0x9
> rbx = 0x0001   rbp = 0x7f89466b53f0
> rsp = 0x7f89466b52d0   r12 = 0x07ea7638
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x056add99
> Found by: call frame info
>  3  impalad!impala::ClientRequestState::SetCreateTableAsSelectResultSet() 
> [client-request-state.cc : 1540 + 0xf]
> rbx = 0x0001   rbp = 0x7f89466b53f0
> rsp = 0x7f89466b52e0   r12 = 0x07ea7638
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0286fef1
> Found by: call frame info
>  4  impalad!impala::ClientRequestState::WaitInternal() 
> [client-request-state.cc : 1083 + 0xf]
> rbx = 0x0001   rbp = 0x7f89466b5550
> rsp = 0x7f89466b5400   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0286ab3b
> Found by: call frame info
>  5  impalad!impala::ClientRequestState::Wait() [client-request-state.cc : 
> 1010 + 0x19]
> rbx = 0x7f89466b5b28   rbp = 0x7f89466b5640
> rsp = 0x7f89466b5560   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0286a030
> Found by: call frame info
>  6  impalad!boost::_mfi::mf0 impala::ClientRequestState>::operator()(impala::ClientRequestState*) const 
> [mem_fn_template.hpp : 49 + 0x5]
> rbx = 0x7f89466b5b28   rbp = 0x7f89466b5660
> rsp = 0x7f89466b5650   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0287faed
> Found by: call frame info
>  7  impalad!void 
> boost::_bi::list1 
> >::operator(), 
> boost::_bi::list0>(boost::_bi::type, boost::_mfi::mf0 impala::ClientRequestState>&, boost::_bi::list0&, int) [bind.hpp : 259 + 0x35]
> rbx = 0x7f89466b5b28   rbp = 0x7f89466b56a0
> rsp = 0x7f89466b5670   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0287f259
> Found by: call frame info
>  8  impalad!boost::_bi::bind_t impala::ClientRequestState>, 
> boost::_bi::list1 > 
> >::operator()() [bind.hpp : 1222 + 0x22]
> rbx = 0x6698   rbp = 0x7f89466b56f0
> rsp = 0x7f89466b56b0   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0287e2f3
> Found by: call frame info
>  9  
> impalad!boost::detail::function::void_function_obj_invoker0  boost::_mfi::mf0, 
> boost::_bi::list1 > >, 
> void>::invoke(boost::detail::function::function_buffer&) 
> [function_template.hpp : 159 + 0xc]
> rbx = 0x6698   rbp = 0x7f89466b5720
> rsp = 0x7f89466b5700   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> 

[jira] [Created] (IMPALA-11006) Impalad crashes during query cancel tests

2021-11-04 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-11006:
---

 Summary: Impalad crashes during query cancel tests
 Key: IMPALA-11006
 URL: https://issues.apache.org/jira/browse/IMPALA-11006
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Reporter: Qifan Chen


The following stack trace was observed in a core generated during an S3 build. 

{quote}Thread 485 (crashed)
 0  libc-2.17.so + 0x351f7
rax = 0x   rdx = 0x0006
rcx = 0x   rbx = 0x0004
rsi = 0x6698   rdi = 0x144f
rbp = 0x7f89466b5220   rsp = 0x7f89466b4ea8
 r8 = 0xr9 = 0x7f89466b4d20
r10 = 0x0008   r11 = 0x0202
r12 = 0x07ea0100   r13 = 0x005a
r14 = 0x07ea0104   r15 = 0x07e98720
rip = 0x7f8a65f091f7
Found by: given as instruction pointer in context
 1  impalad!google::LogMessage::Flush() + 0x1eb
rbp = 0x7f89466b5350   rsp = 0x7f89466b5230
rip = 0x056aa19b
Found by: previous frame's frame pointer
 2  impalad!google::LogMessageFatal::~LogMessageFatal() + 0x9
rbx = 0x0001   rbp = 0x7f89466b53f0
rsp = 0x7f89466b52d0   r12 = 0x07ea7638
r13 = 0x7f89556db190   r14 = 0x16630d20
r15 = 0x002c   rip = 0x056add99
Found by: call frame info
 3  impalad!impala::ClientRequestState::SetCreateTableAsSelectResultSet() 
[client-request-state.cc : 1540 + 0xf]
rbx = 0x0001   rbp = 0x7f89466b53f0
rsp = 0x7f89466b52e0   r12 = 0x07ea7638
r13 = 0x7f89556db190   r14 = 0x16630d20
r15 = 0x002c   rip = 0x0286fef1
Found by: call frame info
 4  impalad!impala::ClientRequestState::WaitInternal() [client-request-state.cc 
: 1083 + 0xf]
rbx = 0x0001   rbp = 0x7f89466b5550
rsp = 0x7f89466b5400   r12 = 0x091cb7a0
r13 = 0x7f89556db190   r14 = 0x16630d20
r15 = 0x002c   rip = 0x0286ab3b
Found by: call frame info
 5  impalad!impala::ClientRequestState::Wait() [client-request-state.cc : 1010 
+ 0x19]
rbx = 0x7f89466b5b28   rbp = 0x7f89466b5640
rsp = 0x7f89466b5560   r12 = 0x091cb7a0
r13 = 0x7f89556db190   r14 = 0x16630d20
r15 = 0x002c   rip = 0x0286a030
Found by: call frame info
 6  impalad!boost::_mfi::mf0::operator()(impala::ClientRequestState*) const 
[mem_fn_template.hpp : 49 + 0x5]
rbx = 0x7f89466b5b28   rbp = 0x7f89466b5660
rsp = 0x7f89466b5650   r12 = 0x091cb7a0
r13 = 0x7f89556db190   r14 = 0x16630d20
r15 = 0x002c   rip = 0x0287faed
Found by: call frame info
 7  impalad!void 
boost::_bi::list1 
>::operator(), 
boost::_bi::list0>(boost::_bi::type, boost::_mfi::mf0&, boost::_bi::list0&, int) [bind.hpp : 259 + 0x35]
rbx = 0x7f89466b5b28   rbp = 0x7f89466b56a0
rsp = 0x7f89466b5670   r12 = 0x091cb7a0
r13 = 0x7f89556db190   r14 = 0x16630d20
r15 = 0x002c   rip = 0x0287f259
Found by: call frame info
 8  impalad!boost::_bi::bind_t, 
boost::_bi::list1 > 
>::operator()() [bind.hpp : 1222 + 0x22]
rbx = 0x6698   rbp = 0x7f89466b56f0
rsp = 0x7f89466b56b0   r12 = 0x091cb7a0
r13 = 0x7f89556db190   r14 = 0x16630d20
r15 = 0x002c   rip = 0x0287e2f3
Found by: call frame info
 9  
impalad!boost::detail::function::void_function_obj_invoker0, 
boost::_bi::list1 > >, 
void>::invoke(boost::detail::function::function_buffer&) [function_template.hpp 
: 159 + 0xc]
rbx = 0x6698   rbp = 0x7f89466b5720
rsp = 0x7f89466b5700   r12 = 0x091cb7a0
r13 = 0x7f89556db190   r14 = 0x16630d20
r15 = 0x002c   rip = 0x0287d07e
Found by: call frame info
10  impalad!boost::function0::operator()() const [function_template.hpp : 
770 + 0x1d]
rbx = 0x6698   rbp = 0x7f89466b5760
rsp = 0x7f89466b5730   r12 = 0x091cb7a0
r13 = 0x7f89556db190   r14 = 0x16630d20
r15 = 0x002c   rip = 0x022e55c0
Found by: call frame info
11  impalad!impala::Thread::SuperviseThread(std::__cxx11::basic_string, std::allocator > const&, 
std::__cxx11::basic_string, std::allocator > 
const&, boost::function, impala::ThreadDebugInfo const*, 
impala::Promise*) [thread.cc : 360 + 0xf]
rbx = 0x6698   rbp = 0x7f89466b5af0
rsp = 0x7f89466b5770   r12 = 0x091cb7a0
r13 = 0x7f89556db190   r14 = 0x16630d20
r15 = 0x002c   rip = 0x02aab96a
Found by: call frame info
12  impalad!void 

[jira] [Assigned] (IMPALA-11006) Impalad crashes during query cancel tests

2021-11-04 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen reassigned IMPALA-11006:
---

Assignee: Qifan Chen

> Impalad crashes during query cancel tests
> -
>
> Key: IMPALA-11006
> URL: https://issues.apache.org/jira/browse/IMPALA-11006
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Qifan Chen
>Assignee: Qifan Chen
>Priority: Major
>
> The following stack trace was observed in a core generated during an S3 build.
> {quote}Thread 485 (crashed)
>  0  libc-2.17.so + 0x351f7
> rax = 0x   rdx = 0x0006
> rcx = 0x   rbx = 0x0004
> rsi = 0x6698   rdi = 0x144f
> rbp = 0x7f89466b5220   rsp = 0x7f89466b4ea8
>  r8 = 0xr9 = 0x7f89466b4d20
> r10 = 0x0008   r11 = 0x0202
> r12 = 0x07ea0100   r13 = 0x005a
> r14 = 0x07ea0104   r15 = 0x07e98720
> rip = 0x7f8a65f091f7
> Found by: given as instruction pointer in context
>  1  impalad!google::LogMessage::Flush() + 0x1eb
> rbp = 0x7f89466b5350   rsp = 0x7f89466b5230
> rip = 0x056aa19b
> Found by: previous frame's frame pointer
>  2  impalad!google::LogMessageFatal::~LogMessageFatal() + 0x9
> rbx = 0x0001   rbp = 0x7f89466b53f0
> rsp = 0x7f89466b52d0   r12 = 0x07ea7638
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x056add99
> Found by: call frame info
>  3  impalad!impala::ClientRequestState::SetCreateTableAsSelectResultSet() 
> [client-request-state.cc : 1540 + 0xf]
> rbx = 0x0001   rbp = 0x7f89466b53f0
> rsp = 0x7f89466b52e0   r12 = 0x07ea7638
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0286fef1
> Found by: call frame info
>  4  impalad!impala::ClientRequestState::WaitInternal() 
> [client-request-state.cc : 1083 + 0xf]
> rbx = 0x0001   rbp = 0x7f89466b5550
> rsp = 0x7f89466b5400   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0286ab3b
> Found by: call frame info
>  5  impalad!impala::ClientRequestState::Wait() [client-request-state.cc : 
> 1010 + 0x19]
> rbx = 0x7f89466b5b28   rbp = 0x7f89466b5640
> rsp = 0x7f89466b5560   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0286a030
> Found by: call frame info
>  6  impalad!boost::_mfi::mf0 impala::ClientRequestState>::operator()(impala::ClientRequestState*) const 
> [mem_fn_template.hpp : 49 + 0x5]
> rbx = 0x7f89466b5b28   rbp = 0x7f89466b5660
> rsp = 0x7f89466b5650   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0287faed
> Found by: call frame info
>  7  impalad!void 
> boost::_bi::list1 
> >::operator(), 
> boost::_bi::list0>(boost::_bi::type, boost::_mfi::mf0 impala::ClientRequestState>&, boost::_bi::list0&, int) [bind.hpp : 259 + 0x35]
> rbx = 0x7f89466b5b28   rbp = 0x7f89466b56a0
> rsp = 0x7f89466b5670   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0287f259
> Found by: call frame info
>  8  impalad!boost::_bi::bind_t impala::ClientRequestState>, 
> boost::_bi::list1 > 
> >::operator()() [bind.hpp : 1222 + 0x22]
> rbx = 0x6698   rbp = 0x7f89466b56f0
> rsp = 0x7f89466b56b0   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0287e2f3
> Found by: call frame info
>  9  
> impalad!boost::detail::function::void_function_obj_invoker0  boost::_mfi::mf0, 
> boost::_bi::list1 > >, 
> void>::invoke(boost::detail::function::function_buffer&) 
> [function_template.hpp : 159 + 0xc]
> rbx = 0x6698   rbp = 0x7f89466b5720
> rsp = 0x7f89466b5700   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x0287d07e
> Found by: call frame info
> 10  impalad!boost::function0::operator()() const [function_template.hpp 
> : 770 + 0x1d]
> rbx = 0x6698   rbp = 0x7f89466b5760
> rsp = 0x7f89466b5730   r12 = 0x091cb7a0
> r13 = 0x7f89556db190   r14 = 0x16630d20
> r15 = 0x002c   rip = 0x022e55c0
> Found by: call 

[jira] [Resolved] (IMPALA-10967) Load data should handle AWS NLB-type timeout

2021-11-03 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen resolved IMPALA-10967.
-
Fix Version/s: Impala 4.1.0
   Resolution: Fixed

> Load data should handle AWS NLB-type timeout
> 
>
> Key: IMPALA-10967
> URL: https://issues.apache.org/jira/browse/IMPALA-10967
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Qifan Chen
>Assignee: Qifan Chen
>Priority: Major
> Fix For: Impala 4.1.0
>
>
> Currently, since Impala handles a LOAD DATA statement request in a single 
> thread, the client can experience an AWS NLB-type timeout (see IMPALA-10811) 
> if the data loading takes more than 350s to complete. 
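
A minimal sketch of the fix direction, assuming hypothetical names throughout: 
the long-running load runs on a worker thread while the RPC returns a handle 
immediately, so the client polls with short RPCs instead of holding one silent 
connection open past the NLB's 350s idle timeout. This illustrates the pattern 
only, not Impala's actual implementation.

{code:python}
import threading
import time
import uuid

# Hypothetical in-memory registry of query handle -> state.
queries = {}

def do_load_data(handle):
    """Stand-in for the long-running LOAD DATA work."""
    time.sleep(3)  # in practice this can exceed the NLB's 350s idle timeout
    queries[handle] = "FINISHED"

def submit_load_data():
    """Return a query handle right away; run the slow work asynchronously."""
    handle = str(uuid.uuid4())
    queries[handle] = "RUNNING"
    threading.Thread(target=do_load_data, args=(handle,), daemon=True).start()
    return handle

# Client side: each short poll is a fresh, active RPC, so the NLB never
# observes a long idle connection even when the load itself is slow.
h = submit_load_data()
while queries[h] == "RUNNING":
    time.sleep(1)
print("load finished")
{code}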



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (IMPALA-10992) Planner changes for estimate peak memory.

2021-10-28 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen reassigned IMPALA-10992:
---

Assignee: Qifan Chen

> Planner changes for estimate peak memory.
> -
>
> Key: IMPALA-10992
> URL: https://issues.apache.org/jira/browse/IMPALA-10992
> Project: IMPALA
>  Issue Type: Task
>Reporter: Amogh Margoor
>Assignee: Qifan Chen
>Priority: Major
>
> To run large queries on a larger executor group mapped to a different 
> resource group, we need to identify the large queries at compile time. In 
> the first phase, this identification can use the estimated peak memory to 
> classify large queries. This Jira tracks that support.
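
A minimal sketch of the first-phase classification described above, with 
made-up group names and thresholds: the planner's estimated peak memory per 
host is compared against each executor group's limit to pick where the query 
runs. The actual planner logic is more involved; this only illustrates the 
heuristic.

{code:python}
# Hypothetical executor groups ordered by the largest per-host memory
# estimate they are willing to accept.
EXECUTOR_GROUPS = [
    ("small", 2 * 1024**3),   # up to 2 GB estimated peak memory per host
    ("large", 64 * 1024**3),  # everything bigger goes here
]

def pick_executor_group(estimated_peak_mem_bytes):
    """Route a query to the first group whose limit covers the estimate."""
    for name, limit in EXECUTOR_GROUPS:
        if estimated_peak_mem_bytes <= limit:
            return name
    return EXECUTOR_GROUPS[-1][0]  # fall back to the largest group

print(pick_executor_group(512 * 1024**2))  # -> small
print(pick_executor_group(10 * 1024**3))   # -> large
{code}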



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Resolved] (IMPALA-10811) RPC to submit query getting stuck for AWS NLB forever.

2021-10-25 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen resolved IMPALA-10811.
-
Target Version: Impala 4.1.0
Resolution: Fixed

> RPC to submit query getting stuck for AWS NLB forever.
> --
>
> Key: IMPALA-10811
> URL: https://issues.apache.org/jira/browse/IMPALA-10811
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Amogh Margoor
>Assignee: Qifan Chen
>Priority: Major
> Attachments: profile+(13).txt
>
>
> The initial RPC that submits a query and fetches the query handle can take 
> quite a long time to return, since planning and submission can execute 
> catalog operations such as Rename or Alter Table Recover Partitions, which 
> can be slow on tables with many partitions 
> ([https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92]).
>  Attached is the profile of one such DDL query (with a few fields hidden).
> These RPCs are: 
> 1. Beeswax:
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57]
> 2. HS2:
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462]
>  
> One side effect of such a long-running RPC is that clients such as 
> impala-shell connecting through an AWS NLB can get stuck forever. The NLB 
> tracks connections and closes them after 350s, and this timeout cannot be 
> configured; after closing the connection it doesn't send a TCP RST to the 
> client. Only when the client next tries to send data does the NLB issue a 
> TCP RST to indicate the connection is no longer alive. Documentation is 
> here: 
> [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout].
>  Hence an impala-shell waiting for the RPC to return gets stuck indefinitely.
> Hence, we may need to evaluate techniques for these RPCs to return the query 
> handle after
>  # Creating the driver: 
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1150]
>  # Registering the query: 
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168]
>  and then execute the rest of the request asynchronously in a different 
> thread, without blocking the RPC. That way clients get the query handle 
> quickly and can poll it for state and results.
>  
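
A minimal sketch of the failure mode described above, using a hypothetical 
endpoint: a blocking recv() on a plain socket hangs forever once the NLB has 
silently dropped the connection, whereas a client-side timeout surfaces the 
hang as an error the caller can act on. This illustrates the TCP-level 
behavior only; it is not Impala client code.

{code:python}
import socket

HOST, PORT = "impala-nlb.example.com", 21050  # hypothetical endpoint

sock = socket.create_connection((HOST, PORT))

# Without a timeout, recv() blocks indefinitely after the NLB's 350s idle
# timeout fires: the NLB drops its connection state but sends no RST, so the
# client never learns the peer is gone until it tries to write.
#
# A client-side timeout turns the silent hang into a visible error that the
# caller can handle (e.g., by reconnecting and polling for query state).
sock.settimeout(300)  # stay under the NLB's 350s idle timeout
try:
    data = sock.recv(4096)
except socket.timeout:
    print("no response within 300s; reconnect rather than wait forever")
finally:
    sock.close()
{code}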



--
This message was sent by Atlassian Jira
(v8.3.4#803005)





[jira] [Commented] (IMPALA-10811) RPC to submit query getting stuck for AWS NLB forever.

2021-10-25 Thread Qifan Chen (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17433755#comment-17433755
 ] 

Qifan Chen commented on IMPALA-10811:
-

The major work was done in commit 975883c47035843398ee99a21fa132f67a0d4954.  

The remaining work on load data is separately tracked in IMPALA-10967.

> RPC to submit query getting stuck for AWS NLB forever.
> --
>
> Key: IMPALA-10811
> URL: https://issues.apache.org/jira/browse/IMPALA-10811
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Amogh Margoor
>Assignee: Qifan Chen
>Priority: Major
> Attachments: profile+(13).txt
>
>
> The initial RPC that submits a query and fetches the query handle can take 
> quite a long time to return, since planning and submission can execute 
> catalog operations such as Rename or Alter Table Recover Partitions, which 
> can be slow on tables with many partitions 
> ([https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92]).
>  Attached is the profile of one such DDL query (with a few fields hidden).
> These RPCs are: 
> 1. Beeswax:
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57]
> 2. HS2:
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462]
>  
> One side effect of such a long-running RPC is that clients such as 
> impala-shell connecting through an AWS NLB can get stuck forever. The NLB 
> tracks connections and closes them after 350s, and this timeout cannot be 
> configured; after closing the connection it doesn't send a TCP RST to the 
> client. Only when the client next tries to send data does the NLB issue a 
> TCP RST to indicate the connection is no longer alive. Documentation is 
> here: 
> [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout].
>  Hence an impala-shell waiting for the RPC to return gets stuck indefinitely.
> Hence, we may need to evaluate techniques for these RPCs to return the query 
> handle after
>  # Creating the driver: 
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1150]
>  # Registering the query: 
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168]
>  and then execute the rest of the request asynchronously in a different 
> thread, without blocking the RPC. That way clients get the query handle 
> quickly and can poll it for state and results.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Assigned] (IMPALA-10811) RPC to submit query getting stuck for AWS NLB forever.

2021-10-25 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen reassigned IMPALA-10811:
---

Assignee: Qifan Chen  (was: Joe McDonnell)

> RPC to submit query getting stuck for AWS NLB forever.
> --
>
> Key: IMPALA-10811
> URL: https://issues.apache.org/jira/browse/IMPALA-10811
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Amogh Margoor
>Assignee: Qifan Chen
>Priority: Major
> Attachments: profile+(13).txt
>
>
> The initial RPC that submits a query and fetches the query handle can take 
> quite a long time to return, since planning and submission can execute 
> catalog operations such as Rename or Alter Table Recover Partitions, which 
> can be slow on tables with many partitions 
> ([https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92]).
>  Attached is the profile of one such DDL query (with a few fields hidden).
> These RPCs are: 
> 1. Beeswax:
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57]
> 2. HS2:
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462]
>  
> One side effect of such a long-running RPC is that clients such as 
> impala-shell connecting through an AWS NLB can get stuck forever. The NLB 
> tracks connections and closes them after 350s, and this timeout cannot be 
> configured; after closing the connection it doesn't send a TCP RST to the 
> client. Only when the client next tries to send data does the NLB issue a 
> TCP RST to indicate the connection is no longer alive. Documentation is 
> here: 
> [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout].
>  Hence an impala-shell waiting for the RPC to return gets stuck indefinitely.
> Hence, we may need to evaluate techniques for these RPCs to return the query 
> handle after
>  # Creating the driver: 
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1150]
>  # Registering the query: 
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168]
>  and then execute the rest of the request asynchronously in a different 
> thread, without blocking the RPC. That way clients get the query handle 
> quickly and can poll it for state and results.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Assigned] (IMPALA-10967) Load data should handle AWS NLB-type timeout

2021-10-19 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen reassigned IMPALA-10967:
---

Assignee: Qifan Chen

> Load data should handle AWS NLB-type timeout
> 
>
> Key: IMPALA-10967
> URL: https://issues.apache.org/jira/browse/IMPALA-10967
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Qifan Chen
>Assignee: Qifan Chen
>Priority: Major
>
> Currently, since Impala handles a LOAD DATA statement request in a single 
> thread, the client can experience an AWS NLB-type timeout (see IMPALA-10811) 
> if the data loading takes more than 350s to complete. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Created] (IMPALA-10967) Load data should handle AWS NLB-type timeout

2021-10-12 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-10967:
---

 Summary: Load data should handle AWS NLB-type timeout
 Key: IMPALA-10967
 URL: https://issues.apache.org/jira/browse/IMPALA-10967
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Reporter: Qifan Chen


Currently, since Impala handles a LOAD DATA statement request in a single 
thread, the client can experience an AWS NLB-type timeout (see IMPALA-10811) 
if the data loading takes more than 350s to complete. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)





[jira] [Assigned] (IMPALA-10927) TestFetchAndSpooling.test_rows_sent_counters is flaky in core-s3 based test

2021-09-22 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen reassigned IMPALA-10927:
---

Assignee: Qifan Chen

> TestFetchAndSpooling.test_rows_sent_counters is flaky in core-s3 based test
> ---
>
> Key: IMPALA-10927
> URL: https://issues.apache.org/jira/browse/IMPALA-10927
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Qifan Chen
>Assignee: Qifan Chen
>Priority: Major
>
> The RowsSentRate counter is seen with a value of 0; the error report follows. 
> A fix in a similar area was described in IMPALA-8957, where a delay via 
> DEBUG_ACTION BPRS_BEFORE_ADD_BATCH was introduced.
> {code:java}
> query_test.test_fetch.TestFetchAndSpooling.test_rows_sent_counters[protocol: 
> beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> parquet/none] (from pytest)
> Failing for the past 1 build (Since Failed#206 )
> Took 0.47 sec.
> Error Message
> query_test/test_fetch.py:101: in test_rows_sent_counters assert 
> float(rows_sent_rate.group(1)) > 0 E   assert 0.0 > 0 E   +  where 0.0 = 
> float('0') E   +    where '0' = <built-in method group of _sre.SRE_Match 
> object at 0x7f9d22693030>(1) E   +      where <built-in method group of 
> _sre.SRE_Match object at 0x7f9d22693030> = <_sre.SRE_Match object at 
> 0x7f9d22693030>.group
> Stacktrace
> query_test/test_fetch.py:101: in test_rows_sent_counters
> assert float(rows_sent_rate.group(1)) > 0
> E   assert 0.0 > 0
> E   +  where 0.0 = float('0')
> E   +    where '0' = <built-in method group of _sre.SRE_Match object at 
> 0x7f9d22693030>(1)
> E   +      where <built-in method group of _sre.SRE_Match object at 
> 0x7f9d22693030> = <_sre.SRE_Match object at 0x7f9d22693030>.group
> Standard Error
> SET 
> client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table;
> -- connecting to: localhost:21000
> -- connecting to localhost:21050 with impyla
> -- 2021-09-10 04:41:56,852 INFO MainThread: Closing active operation
> -- connecting to localhost:28000 with impyla
> -- 2021-09-10 04:41:56,925 INFO MainThread: Closing active operation
> -- connecting to localhost:11050 with impyla
> -- 2021-09-10 04:41:56,939 INFO MainThread: Could not connect to ('::1', 
> 11050, 0, 0)
> -- 2021-09-10 04:41:56,939 ERROR    MainThread: Could not connect to any of 
> [('::1', 11050, 0, 0), ('127.0.0.1', 11050)]
> -- 2021-09-10 04:41:56,939 INFO MainThread: HS2 FENG connection setup 
> failed, continuing...: Could not connect to any of [('::1', 11050, 0, 0), 
> ('127.0.0.1', 11050)]
> SET 
> client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET debug_action=BPRS_BEFORE_ADD_ROWS:SLEEP@1000;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> select id from functional.alltypes limit 10;
> -- 2021-09-10 04:41:56,988 INFO MainThread: Started query 
> b04941a75e31:1da6c8eb
> Options
> {code}
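
One plausible reading of the failure, sketched below under the assumption (not 
confirmed by the report) that the rate counter divides rows sent by an elapsed 
time measured at coarse granularity: a query that finishes within one timer 
tick computes a rate of 0, and the injected 
debug_action=BPRS_BEFORE_ADD_ROWS:SLEEP@1000 delay stretches the elapsed time 
so the rate becomes non-zero.

{code:python}
def rows_sent_rate(rows_sent, elapsed_sec, resolution_sec=1.0):
    """Hypothetical rate computation with a coarse timer resolution."""
    # Elapsed time is quantized to the timer resolution, so a query that
    # completes in under one tick observes an elapsed time of 0.
    ticks = int(elapsed_sec / resolution_sec)
    return 0.0 if ticks == 0 else rows_sent / (ticks * resolution_sec)

print(rows_sent_rate(10, 0.47))  # -> 0.0: finished within one tick
print(rows_sent_rate(10, 1.47))  # -> 10.0: an injected sleep makes it visible
{code}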



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Updated] (IMPALA-10927) TestFetchAndSpooling.test_rows_sent_counters is flaky in core-s3 based test

2021-09-22 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen updated IMPALA-10927:

Description: 
The RowsSentRate counter is seen with a value of 0; the error report follows. 
A fix in a similar area was described in IMPALA-8957, where a delay via 
DEBUG_ACTION BPRS_BEFORE_ADD_BATCH was introduced.

{code:java}
query_test.test_fetch.TestFetchAndSpooling.test_rows_sent_counters[protocol: 
beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
parquet/none] (from pytest)

Failing for the past 1 build (Since Failed#206 )
Took 0.47 sec.
Error Message
query_test/test_fetch.py:101: in test_rows_sent_counters assert 
float(rows_sent_rate.group(1)) > 0 E   assert 0.0 > 0 E   +  where 0.0 = 
float('0') E   +    where '0' = <built-in method group of _sre.SRE_Match 
object at 0x7f9d22693030>(1) E   +      where <built-in method group of 
_sre.SRE_Match object at 0x7f9d22693030> = <_sre.SRE_Match object at 
0x7f9d22693030>.group
Stacktrace
query_test/test_fetch.py:101: in test_rows_sent_counters
assert float(rows_sent_rate.group(1)) > 0
E   assert 0.0 > 0
E   +  where 0.0 = float('0')
E   +    where '0' = <built-in method group of _sre.SRE_Match object at 
0x7f9d22693030>(1)
E   +      where <built-in method group of _sre.SRE_Match object at 
0x7f9d22693030> = <_sre.SRE_Match object at 0x7f9d22693030>.group
Standard Error
SET 
client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table;
-- connecting to: localhost:21000
-- connecting to localhost:21050 with impyla
-- 2021-09-10 04:41:56,852 INFO MainThread: Closing active operation
-- connecting to localhost:28000 with impyla
-- 2021-09-10 04:41:56,925 INFO MainThread: Closing active operation
-- connecting to localhost:11050 with impyla
-- 2021-09-10 04:41:56,939 INFO MainThread: Could not connect to ('::1', 
11050, 0, 0)

-- 2021-09-10 04:41:56,939 ERROR    MainThread: Could not connect to any of 
[('::1', 11050, 0, 0), ('127.0.0.1', 11050)]
-- 2021-09-10 04:41:56,939 INFO MainThread: HS2 FENG connection setup 
failed, continuing...: Could not connect to any of [('::1', 11050, 0, 0), 
('127.0.0.1', 11050)]
SET 
client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table;
SET batch_size=0;
SET num_nodes=0;
SET disable_codegen_rows_threshold=0;
SET disable_codegen=False;
SET abort_on_error=1;
SET debug_action=BPRS_BEFORE_ADD_ROWS:SLEEP@1000;
SET exec_single_node_rows_threshold=0;
-- executing against localhost:21000

select id from functional.alltypes limit 10;

-- 2021-09-10 04:41:56,988 INFO MainThread: Started query 
b04941a75e31:1da6c8eb
Options
{code}



  was:
The RowsSentRate counter is seen with a value of 0; the error report follows. 
A fix in a similar area was described in IMPALA-8957, where a delay via 
DEBUG_ACTION BPRS_BEFORE_ADD_BATCH was introduced.

{code:java}
query_test.test_fetch.TestFetchAndSpooling.test_rows_sent_counters[protocol: 
beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
parquet/none] (from pytest)

Failing for the past 1 build (Since Failed#206 )
Took 0.47 sec.
Error Message
query_test/test_fetch.py:101: in test_rows_sent_counters assert 
float(rows_sent_rate.group(1)) > 0 E   assert 0.0 > 0 E   +  where 0.0 = 
float('0') E   +    where '0' = <built-in method group of _sre.SRE_Match 
object at 0x7f9d22693030>(1) E   +      where <built-in method group of 
_sre.SRE_Match object at 0x7f9d22693030> = <_sre.SRE_Match object at 
0x7f9d22693030>.group
Stacktrace
query_test/test_fetch.py:101: in test_rows_sent_counters
assert float(rows_sent_rate.group(1)) > 0
E   assert 0.0 > 0
E   +  where 0.0 = float('0')
E   +    where '0' = <built-in method group of _sre.SRE_Match object at 
0x7f9d22693030>(1)
E   +      where <built-in method group of _sre.SRE_Match object at 
0x7f9d22693030> = <_sre.SRE_Match object at 0x7f9d22693030>.group
Standard Error
SET 
client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table;
-- connecting to: localhost:21000
-- connecting to localhost:21050 with impyla
-- 2021-09-10 04:41:56,852 INFO MainThread: Closing active operation
-- connecting to localhost:28000 with impyla
-- 2021-09-10 04:41:56,925 INFO MainThread: Closing active operation
-- connecting to localhost:11050 with impyla
-- 2021-09-10 04:41:56,939 INFO MainThread: Could not connect to ('::1', 
11050, 0, 0)
Traceback (most recent call last):
  File 

[jira] [Updated] (IMPALA-10927) TestFetchAndSpooling.test_rows_sent_counters is flaky in core-s3 based test

2021-09-22 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen updated IMPALA-10927:

Summary: TestFetchAndSpooling.test_rows_sent_counters is flaky in core-s3 
based test  (was: TestFetchAndSpooling.test_rows_sent_counters is flaky in 
impala-cdpd-master-staging-core-s3 based test)

> TestFetchAndSpooling.test_rows_sent_counters is flaky in core-s3 based test
> ---
>
> Key: IMPALA-10927
> URL: https://issues.apache.org/jira/browse/IMPALA-10927
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Qifan Chen
>Priority: Major
>
> The RowsSentRate counter is seen with a value of 0; the error report follows. 
> A fix in a similar area was described in IMPALA-8957, where a delay via 
> DEBUG_ACTION BPRS_BEFORE_ADD_BATCH was introduced.
> {code:java}
> query_test.test_fetch.TestFetchAndSpooling.test_rows_sent_counters[protocol: 
> beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> parquet/none] (from pytest)
> Failing for the past 1 build (Since Failed#206 )
> Took 0.47 sec.
> Error Message
> query_test/test_fetch.py:101: in test_rows_sent_counters assert 
> float(rows_sent_rate.group(1)) > 0 E   assert 0.0 > 0 E   +  where 0.0 = 
> float('0') E   +    where '0' = <built-in method group of _sre.SRE_Match 
> object at 0x7f9d22693030>(1) E   +      where <built-in method group of 
> _sre.SRE_Match object at 0x7f9d22693030> = <_sre.SRE_Match object at 
> 0x7f9d22693030>.group
> Stacktrace
> query_test/test_fetch.py:101: in test_rows_sent_counters
> assert float(rows_sent_rate.group(1)) > 0
> E   assert 0.0 > 0
> E   +  where 0.0 = float('0')
> E   +    where '0' = <built-in method group of _sre.SRE_Match object at 
> 0x7f9d22693030>(1)
> E   +      where <built-in method group of _sre.SRE_Match object at 
> 0x7f9d22693030> = <_sre.SRE_Match object at 0x7f9d22693030>.group
> Standard Error
> SET 
> client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table;
> -- connecting to: localhost:21000
> -- connecting to localhost:21050 with impyla
> -- 2021-09-10 04:41:56,852 INFO MainThread: Closing active operation
> -- connecting to localhost:28000 with impyla
> -- 2021-09-10 04:41:56,925 INFO MainThread: Closing active operation
> -- connecting to localhost:11050 with impyla
> -- 2021-09-10 04:41:56,939 INFO MainThread: Could not connect to ('::1', 
> 11050, 0, 0)
> Traceback (most recent call last):
>   File 
> "/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/Impala-Toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p4/python/lib/python2.7/site-packages/thrift/transport/TSocket.py",
>  line 104, in open
> handle.connect(sockaddr)
>   File 
> "/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/Impala-Toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/socket.py",
>  line 228, in meth
> return getattr(self._sock,name)(*args)
> error: [Errno 111] Connection refused
> -- 2021-09-10 04:41:56,939 INFO MainThread: Could not connect to 
> ('127.0.0.1', 11050)
> Traceback (most recent call last):
>   File 
> "/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/Impala-Toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p4/python/lib/python2.7/site-packages/thrift/transport/TSocket.py",
>  line 104, in open
> handle.connect(sockaddr)
>   File 
> "/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/Impala-Toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/socket.py",
>  line 228, in meth
> return getattr(self._sock,name)(*args)
> error: [Errno 111] Connection refused
> -- 2021-09-10 04:41:56,939 ERROR    MainThread: Could not connect to any of 
> [('::1', 11050, 0, 0), ('127.0.0.1', 11050)]
> -- 2021-09-10 04:41:56,939 INFO MainThread: HS2 FENG connection setup 
> failed, continuing...: Could not connect to any of [('::1', 11050, 0, 0), 
> ('127.0.0.1', 11050)]
> SET 
> client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET debug_action=BPRS_BEFORE_ADD_ROWS:SLEEP@1000;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> select id from functional.alltypes limit 10;
> -- 2021-09-10 04:41:56,988 INFO MainThread: Started query 
> b04941a75e31:1da6c8eb
> Options
> {code}

[jira] [Created] (IMPALA-10927) TestFetchAndSpooling.test_rows_sent_counters is flaky in impala-cdpd-master-staging-core-s3 based test

2021-09-22 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-10927:
---

 Summary: TestFetchAndSpooling.test_rows_sent_counters is flaky in 
impala-cdpd-master-staging-core-s3 based test
 Key: IMPALA-10927
 URL: https://issues.apache.org/jira/browse/IMPALA-10927
 Project: IMPALA
  Issue Type: Improvement
Reporter: Qifan Chen


The RowsSentRate counter is seen with a value of 0; the error report follows. 
A fix in a similar area was described in IMPALA-8957, where a delay via 
DEBUG_ACTION BPRS_BEFORE_ADD_BATCH was introduced.

{code:java}
query_test.test_fetch.TestFetchAndSpooling.test_rows_sent_counters[protocol: 
beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
parquet/none] (from pytest)

Failing for the past 1 build (Since Failed#206 )
Took 0.47 sec.
Error Message
query_test/test_fetch.py:101: in test_rows_sent_counters assert 
float(rows_sent_rate.group(1)) > 0 E   assert 0.0 > 0 E   +  where 0.0 = 
float('0') E   +    where '0' = <built-in method group of _sre.SRE_Match 
object at 0x7f9d22693030>(1) E   +      where <built-in method group of 
_sre.SRE_Match object at 0x7f9d22693030> = <_sre.SRE_Match object at 
0x7f9d22693030>.group
Stacktrace
query_test/test_fetch.py:101: in test_rows_sent_counters
assert float(rows_sent_rate.group(1)) > 0
E   assert 0.0 > 0
E   +  where 0.0 = float('0')
E   +    where '0' = <built-in method group of _sre.SRE_Match object at 
0x7f9d22693030>(1)
E   +      where <built-in method group of _sre.SRE_Match object at 
0x7f9d22693030> = <_sre.SRE_Match object at 0x7f9d22693030>.group
Standard Error
SET 
client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table;
-- connecting to: localhost:21000
-- connecting to localhost:21050 with impyla
-- 2021-09-10 04:41:56,852 INFO MainThread: Closing active operation
-- connecting to localhost:28000 with impyla
-- 2021-09-10 04:41:56,925 INFO MainThread: Closing active operation
-- connecting to localhost:11050 with impyla
-- 2021-09-10 04:41:56,939 INFO MainThread: Could not connect to ('::1', 
11050, 0, 0)
Traceback (most recent call last):
  File 
"/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/Impala-Toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p4/python/lib/python2.7/site-packages/thrift/transport/TSocket.py",
 line 104, in open
handle.connect(sockaddr)
  File 
"/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/Impala-Toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/socket.py",
 line 228, in meth
return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused
-- 2021-09-10 04:41:56,939 INFO MainThread: Could not connect to 
('127.0.0.1', 11050)
Traceback (most recent call last):
  File 
"/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/Impala-Toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p4/python/lib/python2.7/site-packages/thrift/transport/TSocket.py",
 line 104, in open
handle.connect(sockaddr)
  File 
"/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/Impala-Toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/socket.py",
 line 228, in meth
return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused
-- 2021-09-10 04:41:56,939 ERROR    MainThread: Could not connect to any of 
[('::1', 11050, 0, 0), ('127.0.0.1', 11050)]
-- 2021-09-10 04:41:56,939 INFO MainThread: HS2 FENG connection setup 
failed, continuing...: Could not connect to any of [('::1', 11050, 0, 0), 
('127.0.0.1', 11050)]
SET 
client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table;
SET batch_size=0;
SET num_nodes=0;
SET disable_codegen_rows_threshold=0;
SET disable_codegen=False;
SET abort_on_error=1;
SET debug_action=BPRS_BEFORE_ADD_ROWS:SLEEP@1000;
SET exec_single_node_rows_threshold=0;
-- executing against localhost:21000

select id from functional.alltypes limit 10;

-- 2021-09-10 04:41:56,988 INFO MainThread: Started query 
b04941a75e31:1da6c8eb
Options
{code}





--
This message was sent by Atlassian Jira
(v8.3.4#803005)




