[jira] [Updated] (IMPALA-7106) Log the original and rewritten SQL when SQL rewrite fails

2019-05-29 Thread Quanlong Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-7106:
---
Fix Version/s: Impala 2.13.0

> Log the original and rewritten SQL when SQL rewrite fails
> -
>
> Key: IMPALA-7106
> URL: https://issues.apache.org/jira/browse/IMPALA-7106
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> The toSql() prints the SQL that is close to the original SQL string, which 
> makes sense for the users. However, when debugging, i.e. with the log level set 
> to TRACE, it is useful to have the rewritten SQL printed out correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7132) run_clang_tidy.sh produces unrelated output

2019-05-29 Thread Quanlong Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-7132:
---
Fix Version/s: Impala 2.13.0

> run_clang_tidy.sh produces unrelated output
> ---
>
> Key: IMPALA-7132
> URL: https://issues.apache.org/jira/browse/IMPALA-7132
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> bin/run_clang_tidy.sh uses Clang's run-clang-tidy.py script, and this is 
> producing a large amount of useless output like this:
> {noformat}
> New replacement: /home/ubuntu/Impala/be/src/runtime/types.h: 7916:+0:"break; "
> Existing replacement: /home/ubuntu/Impala/be/src/runtime/types.h: 
> 7916:+0:"FALLTHROUGH_INTENDED; "
> Fix conflicts with existing fix! The new insertion has the same insert 
> location as an existing replacement.
> New replacement: /home/ubuntu/Impala/be/src/runtime/types.h: 7916:+0:"break; "
> Existing replacement: /home/ubuntu/Impala/be/src/runtime/types.h: 
> 7916:+0:"FALLTHROUGH_INTENDED; "
> 1453 warnings generated.
> Suppressed 1455 warnings (3 in non-user code, 2 NOLINT, 1450 with check 
> filters).
> Use -header-filter=.* to display errors from all non-system headers. Use 
> -system-headers to display errors from system headers as well.{noformat}
> This happens over and over with no diagnostic utility. It is being written to 
> stderr, and it seems that nothing diagnostically useful is written to stderr.
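
One possible way to cut this noise, sketched below, is to filter the known-useless lines out of the captured stderr before showing it. This is only an illustration: the pattern list is inferred from the sample output quoted above, the helper name drop_noise is hypothetical, and it is not necessarily how bin/run_clang_tidy.sh was actually changed.

```python
import re

# Patterns inferred from the sample output in this issue (an assumption,
# not an exhaustive list of what run-clang-tidy.py can print).
NOISE_PATTERNS = [
    re.compile(r"^New replacement: "),
    re.compile(r"^Existing replacement: "),
    re.compile(r"^Fix conflicts with existing fix!"),
    re.compile(r"^\d+ warnings? generated\.$"),
    re.compile(r"^Suppressed \d+ warnings"),
    re.compile(r"^Use -header-filter="),
]


def drop_noise(lines):
    """Return only the lines that match none of the known noise patterns."""
    return [line for line in lines
            if not any(p.search(line) for p in NOISE_PATTERNS)]
```

Since the report says nothing diagnostically useful reaches stderr, simply discarding stderr would be an even simpler alternative.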






[jira] [Updated] (IMPALA-7238) test_kudu.TestCreateExternalTable sees unique database already exists

2019-05-29 Thread Quanlong Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-7238:
---
Fix Version/s: Impala 2.13.0

> test_kudu.TestCreateExternalTable sees unique database already exists
> -
>
> Key: IMPALA-7238
> URL: https://issues.apache.org/jira/browse/IMPALA-7238
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Critical
>  Labels: broken-build, flaky
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> All of the tests from query_test.test_kudu.TestCreateExternalTable fail with 
> an error like:
> {noformat}
> /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py:704:
>  in err_if_rpc_not_ok
> raise HiveServer2Error(resp.status.errorMessage)
> E   HiveServer2Error: ImpalaRuntimeException: Error making 'createDatabase' 
> RPC to Hive Metastore: 
> E   CAUSED BY: AlreadyExistsException: Database 
> testcreateexternaltable_23808_vu8cqo already exists{noformat}
> It looks like the failures all happen at once in a single process. The first 
> test to fail is test_kudu.TestCreateExternalTable.test_col_types. It takes 52 
> seconds where all the other tests take no time. It also has an extra error on 
> stderr:
> {noformat}
> -- connecting to: localhost:21000
> MainThread: Failed to open transport (tries_left=3)
> Traceback (most recent call last):
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 940, in _execute
> return func(request)
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py",
>  line 265, in ExecuteStatement
> return self.recv_ExecuteStatement()
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py",
>  line 276, in recv_ExecuteStatement
> (fname, mtype, rseqid) = self._iprot.readMessageBegin()
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py",
>  line 126, in readMessageBegin
> sz = self.readI32()
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py",
>  line 206, in readI32
> buff = self.trans.readAll(4)
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py",
>  line 58, in readAll
> chunk = self.read(sz - have)
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py",
>  line 159, in read
> self.__rbuf = StringIO(self.__trans.read(max(sz, self.__rbuf_size)))
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSocket.py",
>  line 105, in read
> buff = self.handle.recv(sz)
> timeout: timed out
> MainThread: Error closing Impala cursor: Invalid session id: 
> f54064f9a4604f23:fb686144269fc8b1{noformat}
> The other failures don't have this.
> This happened only once, so it is definitely intermittent. This has some 
> similarity to IMPALA-6933, but this looks like a repeated failure in a single 
> process, not a concurrency issue.






[jira] [Updated] (IMPALA-7199) Need to have scripts to generate coverage

2019-05-29 Thread Quanlong Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-7199:
---
Fix Version/s: Impala 2.13.0

> Need to have scripts to generate coverage
> -
>
> Key: IMPALA-7199
> URL: https://issues.apache.org/jira/browse/IMPALA-7199
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
> Attachments: coverage_be_tests.tar.gz
>
>
> Code coverage can be a useful means to verify that tests are exercising the 
> code as expected. It would be useful to have a helper script to make this 
> process as simple as possible. Now that gcovr is pip installable, that is one 
> option for generating coverage reports.
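
As a rough idea of what such a helper could wrap, the sketch below only builds a gcovr command line for an HTML report. The function name and the example paths are hypothetical; --root, --html, --html-details and --output are standard gcovr options, but the script eventually added for this issue may look quite different.

```python
def gcovr_html_command(root_dir, output_html):
    """Build the argument list for a detailed gcovr HTML coverage report.

    root_dir: source root that gcovr should report against (hypothetical path).
    output_html: file the top-level HTML report is written to.
    """
    return [
        "gcovr",
        "--root", root_dir,
        "--html", "--html-details",
        "--output", output_html,
    ]
```

The resulting list could be passed to subprocess.check_call() after running the tests from a coverage-enabled build.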






[jira] [Created] (IMPALA-8594) Support drop table for external kudu tables that are dropped in kudu

2019-05-29 Thread Manish Maheshwari (JIRA)
Manish Maheshwari created IMPALA-8594:
-

 Summary: Support drop table for external kudu tables that are 
dropped in kudu
 Key: IMPALA-8594
 URL: https://issues.apache.org/jira/browse/IMPALA-8594
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Affects Versions: Impala 3.1.0
Reporter: Manish Maheshwari


External Kudu tables in Impala cannot be dropped from HMS if the Kudu table has 
already been dropped in Kudu. This causes HMS to be out of sync with the Kudu 
metadata.

Impala should clean up the HMS table info when a drop is executed for an external 
table that does not exist in Kudu.

 

cc - [~balazsj_impala_220b] [~tlipcon]






[jira] [Created] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775

2019-05-29 Thread Robbie Zhang (JIRA)
Robbie Zhang created IMPALA-8595:


 Summary: THRIFT-3505 breaks IMPALA-5775
 Key: IMPALA-8595
 URL: https://issues.apache.org/jira/browse/IMPALA-8595
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 3.1.0
Reporter: Robbie Zhang


IMPALA-5690 replaced thrift 0.9.0 with 0.9.3, in which THRIFT-3505 changed 
transport/TSSLSocket.py. 

In thrift 0.9.3, if the python version is lower than 2.7.9, TSSLSocket uses 
PROTOCOL_TLSv1 by default:
{code:java}
  # For python >= 2.7.9, use latest TLS that both client and server support.
  # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
  # For python < 2.7.9, use TLS 1.0 since neither TLSv1_X nor OP_NO_SSLvX is
  # available.
  _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else ssl.PROTOCOL_TLSv1
{code}
The SSL version should now be passed as an argument to TSSLSocket.__init__ 
instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__, 
so the fix for IMPALA-5775 doesn't work against thrift 0.9.3. As a result, if we 
use a python lower than 2.7.9 (for example, python 2.7.5 on Red Hat/CentOS 7.5) 
and set ssl_minimum_version to tlsv1.2, the impala-shell command can't connect 
to impalad:

{code:java}
# impala-shell -i impalad01.example.com -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
SSL is enabled
No handlers could be found for logger "thrift.transport.TSSLSocket"
Error connecting: TTransportException, Could not connect to impalad01.example.com:21000: EOF occurred in violation of protocol (_ssl.c:579)
{code}
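
The difference between overriding an attribute after construction and passing the version to the constructor can be modelled with stand-in classes. The sketch below is not the Thrift or Impala code: BaseSocket, OverridingSocket and PassingSocket are hypothetical names, and protocol versions are plain strings so that the example runs without any SSL setup. It assumes, as described above, that the base class fixes the protocol inside __init__.

```python
class BaseSocket(object):
    """Stand-in for a base class that captures the protocol in __init__."""
    DEFAULT_VERSION = "TLSv1"

    def __init__(self, host, ssl_version=None):
        self.host = host
        # The protocol is decided once, here; later attribute changes are ignored.
        self._protocol = ssl_version or self.DEFAULT_VERSION

    def protocol(self):
        return self._protocol


class OverridingSocket(BaseSocket):
    """Sets an attribute after the base constructor ran: has no effect."""

    def __init__(self, host, ssl_version):
        BaseSocket.__init__(self, host)
        self.SSL_VERSION = ssl_version  # too late, the base class never reads it


class PassingSocket(BaseSocket):
    """Hands the version to the base constructor: takes effect."""

    def __init__(self, host, ssl_version):
        BaseSocket.__init__(self, host, ssl_version=ssl_version)
```

In this model only PassingSocket ends up with the requested version, which mirrors why an attribute override can silently stop working when the base class starts taking the version as a constructor argument.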

  

 






[jira] [Assigned] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775

2019-05-29 Thread Robbie Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robbie Zhang reassigned IMPALA-8595:


Assignee: Robbie Zhang

> THRIFT-3505 breaks IMPALA-5775
> --
>
> Key: IMPALA-8595
> URL: https://issues.apache.org/jira/browse/IMPALA-8595
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.1.0
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
>
> IMPALA-5690 replaced thrift  0.9.0 with 0.9.3 in which THRIFT-3505 changed 
> transport/TSSLSocket.py. 
> In thrift 0.9.3, if the python version is lower than 2.7.9, TSSLSocket uses 
> PROTOCOL_TLSv1 by default:
> {code:java}
>   # For python >= 2.7.9, use latest TLS that both client and server support.
>   # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
>   # For python < 2.7.9, use TLS 1.0 since neither TLSv1_X nor OP_NO_SSLvX is
>   # available.
>   _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else ssl.PROTOCOL_TLSv1
> {code}
> And the SSL version should be passed as an argument to TSSLSocket.__init__ 
> instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. 
> The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use 
> python lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) 
> and set ssl_minimum_version to tlsv1.2, impala-shell command can't connect to 
> impalad:
>  
> {code:java}
> # impala-shell -i impalad01.example.com
>  -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
> SSL is enabled
> No handlers could be found for logger "thrift.transport.TSSLSocket"
> Error connecting: TTransportException, Could not connect to 
> impalad01.example.com:21000: EOF occurred in violation of protocol 
> (_ssl.c:579)
> {code}
>   
>  






[jira] [Resolved] (IMPALA-6903) Allow download of text profile via Impala WebUI

2019-05-29 Thread Yongzhi Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen resolved IMPALA-6903.
--
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Allow download of text profile via Impala WebUI
> ---
>
> Key: IMPALA-6903
> URL: https://issues.apache.org/jira/browse/IMPALA-6903
> Project: IMPALA
>  Issue Type: Improvement
>Affects Versions: Impala 2.11.0
>Reporter: Gabor Kaszab
>Assignee: Yongzhi Chen
>Priority: Minor
>  Labels: newbie
> Fix For: Impala 3.3.0
>
> Attachments: download-text-profile-link-screenshot.png
>
>
> In Impala WebUI it's already possible to download the query profile in thrift 
> format (https://issues.apache.org/jira/browse/IMPALA-2555). It would be nice 
> to have the same download option for text format as well.






[jira] [Updated] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775

2019-05-29 Thread Robbie Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robbie Zhang updated IMPALA-8595:
-
Description: 
IMPALA-5690 replaced thrift  0.9.0 with 0.9.3 in which THRIFT-3505 changed 
transport/TSSLSocket.py. 

In thrift 0.9.3, if the python version is lower than 2.9.7, TSSLSocket uses 
PROTOCOL_TLSv1 by default:
{code:java}
  # For python >= 2.7.9, use latest TLS that both client and server support.
  # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
  # For python < 2.7.9, use TLS 1.0 since neither TLSv1_X nor OP_NO_SSLvX is
  # available.
  _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else ssl.PROTOCOL_TLSv1
{code}
And the SSL version should be passed as an argument to TSSLSocket.__init__ 
instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. 
The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use python 
lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) and set 
ssl_minimum_version to tlsv1.2, impala-shell command can't connect to impalad:

 
{code:java}
# impala-shell -i impalad01.example.com
 -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
SSL is enabled
No handlers could be found for logger "thrift.transport.TSSLSocket"
Error connecting: TTransportException, Could not connect to 
impalad01.example.com:21000: EOF occurred in violation of protocol (_ssl.c:579)
{code}
 

 

  was:
IMPALA-5690 replaced thrift  0.9.0 with 0.9.3 in which THRIFT-3505 changed 
transport/TSSLSocket.py. 

In thrift 0.9.3, if the python version is lower than 2.9.7, TSSLSocket uses 
**PROTOCOL_TLSv1 by default:
{code:java}
  # For python >= 2.7.9, use latest TLS that both client and server support.
  # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
  # For python < 2.7.9, use TLS 1.0 since neither TLSv1_X nor OP_NO_SSLvX is
  # available.
  _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else ssl.PROTOCOL_TLSv1
{code}
And the SSL version should be passed as an argument to TSSLSocket.__init__ 
instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. 
The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use python 
lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) and set 
ssl_minimum_version to tlsv1.2, impala-shell command can't connect to impalad:

 
{code:java}
# impala-shell -i impalad01.example.com
 -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
SSL is enabled
No handlers could be found for logger "thrift.transport.TSSLSocket"
Error connecting: TTransportException, Could not connect to 
impalad01.example.com:21000: EOF occurred in violation of protocol (_ssl.c:579)
{code}

  

 


> THRIFT-3505 breaks IMPALA-5775
> --
>
> Key: IMPALA-8595
> URL: https://issues.apache.org/jira/browse/IMPALA-8595
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.1.0
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
>
> IMPALA-5690 replaced thrift  0.9.0 with 0.9.3 in which THRIFT-3505 changed 
> transport/TSSLSocket.py. 
> In thrift 0.9.3, if the python version is lower than 2.9.7, TSSLSocket uses 
> PROTOCOL_TLSv1 by default:
> {code:java}
>   # For python >= 2.7.9, use latest TLS that both client and server support.
>   # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
>   # For python < 2.7.9, use TLS 1.0 since neither TLSv1_X nor OP_NO_SSLvX is
>   # available.
>   _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else ssl.PROTOCOL_TLSv1
> {code}
> And the SSL version should be passed as an argument to TSSLSocket.__init__ 
> instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. 
> The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use 
> python lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) 
> and set ssl_minimum_version to tlsv1.2, impala-shell command can't connect to 
> impalad:
>  
> {code:java}
> # impala-shell -i impalad01.example.com
>  -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
> SSL is enabled
> No handlers could be found for logger "thrift.transport.TSSLSocket"
> Error connecting: TTransportException, Could not connect to 
> impalad01.example.com:21000: EOF occurred in violation of protocol 
> (_ssl.c:579)
> {code}
>  
>  






[jira] [Updated] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775

2019-05-29 Thread Robbie Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robbie Zhang updated IMPALA-8595:
-
Description: 
IMPALA-5690 replaced thrift  0.9.0 with 0.9.3 in which THRIFT-3505 changed 
transport/TSSLSocket.py. 

In thrift 0.9.3, if the python version is lower than 2.7.9, TSSLSocket uses 
PROTOCOL_TLSv1 by default:
{code:java}
  # For python >= 2.7.9, use latest TLS that both client and server support.
  # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
  # For python < 2.7.9, use TLS 1.0 since neither TLSv1_X nor OP_NO_SSLvX is
  # available.
  _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else ssl.PROTOCOL_TLSv1
{code}
And the SSL version should be passed as an argument to TSSLSocket.__init__ 
instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. 
The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use python 
lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) and set 
ssl_minimum_version to tlsv1.2, impala-shell command can't connect to impalad:

 
{code:java}
# impala-shell -i impalad01.example.com
 -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
SSL is enabled
No handlers could be found for logger "thrift.transport.TSSLSocket"
Error connecting: TTransportException, Could not connect to 
impalad01.example.com:21000: EOF occurred in violation of protocol (_ssl.c:579)
{code}
 

 

  was:
IMPALA-5690 replaced thrift  0.9.0 with 0.9.3 in which THRIFT-3505 changed 
transport/TSSLSocket.py. 

In thrift 0.9.3, if the python version is lower than 2.9.7, TSSLSocket uses 
PROTOCOL_TLSv1 by default:
{code:java}
  # For python >= 2.7.9, use latest TLS that both client and server support.
  # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
  # For python < 2.7.9, use TLS 1.0 since neither TLSv1_X nor OP_NO_SSLvX is
  # available.
  _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else ssl.PROTOCOL_TLSv1
{code}
And the SSL version should be passed as an argument to TSSLSocket.__init__ 
instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. 
The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use python 
lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) and set 
ssl_minimum_version to tlsv1.2, impala-shell command can't connect to impalad:

 
{code:java}
# impala-shell -i impalad01.example.com
 -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
SSL is enabled
No handlers could be found for logger "thrift.transport.TSSLSocket"
Error connecting: TTransportException, Could not connect to 
impalad01.example.com:21000: EOF occurred in violation of protocol (_ssl.c:579)
{code}
 

 


> THRIFT-3505 breaks IMPALA-5775
> --
>
> Key: IMPALA-8595
> URL: https://issues.apache.org/jira/browse/IMPALA-8595
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.1.0
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
>
> IMPALA-5690 replaced thrift  0.9.0 with 0.9.3 in which THRIFT-3505 changed 
> transport/TSSLSocket.py. 
> In thrift 0.9.3, if the python version is lower than 2.7.9, TSSLSocket uses 
> PROTOCOL_TLSv1 by default:
> {code:java}
>   # For python >= 2.7.9, use latest TLS that both client and server support.
>   # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
>   # For python < 2.7.9, use TLS 1.0 since neither TLSv1_X nor OP_NO_SSLvX is
>   # available.
>   _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else ssl.PROTOCOL_TLSv1
> {code}
> And the SSL version should be passed as an argument to TSSLSocket.__init__ 
> instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. 
> The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use 
> python lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) 
> and set ssl_minimum_version to tlsv1.2, impala-shell command can't connect to 
> impalad:
>  
> {code:java}
> # impala-shell -i impalad01.example.com
>  -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
> SSL is enabled
> No handlers could be found for logger "thrift.transport.TSSLSocket"
> Error connecting: TTransportException, Could not connect to 
> impalad01.example.com:21000: EOF occurred in violation of protocol 
> (_ssl.c:579)
> {code}
>  
>  






[jira] [Created] (IMPALA-8596) TestObservability.test_global_exchange_counters failed in ASAN

2019-05-29 Thread JIRA
Zoltán Borók-Nagy created IMPALA-8596:
-

 Summary: TestObservability.test_global_exchange_counters failed in 
ASAN
 Key: IMPALA-8596
 URL: https://issues.apache.org/jira/browse/IMPALA-8596
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 3.3.0
Reporter: Zoltán Borók-Nagy


Seen in an ASAN build:
{noformat}
Error Message
query_test/test_observability.py:415: in test_global_exchange_counters
    assert m
E   assert None

Stacktrace
query_test/test_observability.py:415: in test_global_exchange_counters
    assert m
E   assert None

Standard Error
-- executing against localhost:21000
select count(*) from tpch_parquet.orders o inner join tpch_parquet.lineitem l on o.o_orderkey = l.l_orderkey group by o.o_clerk limit 10;

-- 2019-05-28 05:24:17,072 INFO MainThread: Started query 664116fde66bdd8c:4ca951da
{noformat}
 






[jira] [Updated] (IMPALA-8596) TestObservability.test_global_exchange_counters failed in ASAN

2019-05-29 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/IMPALA-8596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy updated IMPALA-8596:
--
Description: 
Seen in an ASAN build: 
{noformat}
Error Message
query_test/test_observability.py:415: in test_global_exchange_counters assert m 
E assert None

Stacktrace
query_test/test_observability.py:415: in test_global_exchange_counters assert m 
E assert None

Standard Error
-- executing against localhost:21000 select count(*) from tpch_parquet.orders o 
inner join tpch_parquet.lineitem l on o.o_orderkey = l.l_orderkey group by 
o.o_clerk limit 10; -- 2019-05-28 05:24:17,072 INFO MainThread: Started query 
664116fde66bdd8c:4ca951da
{noformat}
 

  was:
Seen in an ASAN build:
h3.  
{noformat}
Error Message
query_test/test_observability.py:415: in test_global_exchange_counters assert m 
E assert None

Stacktrace
query_test/test_observability.py:415: in test_global_exchange_counters assert m 
E assert None

Standard Error
-- executing against localhost:21000 select count(*) from tpch_parquet.orders o 
inner join tpch_parquet.lineitem l on o.o_orderkey = l.l_orderkey group by 
o.o_clerk limit 10; -- 2019-05-28 05:24:17,072 INFO MainThread: Started query 
664116fde66bdd8c:4ca951da
{noformat}
 


> TestObservability.test_global_exchange_counters failed in ASAN
> --
>
> Key: IMPALA-8596
> URL: https://issues.apache.org/jira/browse/IMPALA-8596
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.3.0
>Reporter: Zoltán Borók-Nagy
>Priority: Blocker
>
> Seen in an ASAN build: 
> {noformat}
> Error Message
> query_test/test_observability.py:415: in test_global_exchange_counters
>     assert m
> E   assert None
>
> Stacktrace
> query_test/test_observability.py:415: in test_global_exchange_counters
>     assert m
> E   assert None
>
> Standard Error
> -- executing against localhost:21000
> select count(*) from tpch_parquet.orders o inner join tpch_parquet.lineitem l on o.o_orderkey = l.l_orderkey group by o.o_clerk limit 10;
>
> -- 2019-05-28 05:24:17,072 INFO MainThread: Started query 664116fde66bdd8c:4ca951da
> {noformat}
>  






[jira] [Resolved] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775

2019-05-29 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-8595.
---
Resolution: Duplicate

> THRIFT-3505 breaks IMPALA-5775
> --
>
> Key: IMPALA-8595
> URL: https://issues.apache.org/jira/browse/IMPALA-8595
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.1.0
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
>
> IMPALA-5690 replaced thrift  0.9.0 with 0.9.3 in which THRIFT-3505 changed 
> transport/TSSLSocket.py. 
> In thrift 0.9.3, if the python version is lower than 2.7.9, TSSLSocket uses 
> PROTOCOL_TLSv1 by default:
> {code:java}
>   # For python >= 2.7.9, use latest TLS that both client and server support.
>   # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
>   # For python < 2.7.9, use TLS 1.0 since neither TLSv1_X nor OP_NO_SSLvX is
>   # available.
>   _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else ssl.PROTOCOL_TLSv1
> {code}
> And the SSL version should be passed as an argument to TSSLSocket.__init__ 
> instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. 
> The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use 
> python lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) 
> and set ssl_minimum_version to tlsv1.2, impala-shell command can't connect to 
> impalad:
>  
> {code:java}
> # impala-shell -i impalad01.example.com
>  -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
> SSL is enabled
> No handlers could be found for logger "thrift.transport.TSSLSocket"
> Error connecting: TTransportException, Could not connect to 
> impalad01.example.com:21000: EOF occurred in violation of protocol 
> (_ssl.c:579)
> {code}
>  
>  






[jira] [Commented] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775

2019-05-29 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850969#comment-16850969
 ] 

Tim Armstrong commented on IMPALA-8595:
---

See IMPALA-6990. The problem is that Thrift support for specifying TLSv1.2 
depends on a Python feature that was only added in python 2.7.9 - 
https://docs.python.org/2/library/ssl.html#ssl-contexts. I don't think there's 
much we can reasonably do to support TLS1.2 in older versions of Python.
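
The version dependency described in this comment can be expressed as a small check. This is a sketch under stated assumptions: the helper names are hypothetical, the returned values are just the names of the ssl constants from the snippet quoted in the issue, and only the 2.7.x line and current 3.x releases are considered.

```python
def has_ssl_context(version):
    """Return True if an interpreter of this version should provide ssl.SSLContext.

    Assumption: only 2.7.x and current 3.x releases matter here, so a plain
    tuple comparison against (2, 7, 9) is enough.
    """
    return tuple(version[:3]) >= (2, 7, 9)


def default_protocol_name(version):
    """Name of the default protocol constant, following the quoted selection."""
    return "PROTOCOL_SSLv23" if has_ssl_context(version) else "PROTOCOL_TLSv1"


def can_require_tls12(version):
    """Whether a TLSv1.2 minimum could be honoured, per the comment above."""
    return has_ssl_context(version)
```

For example, default_protocol_name(sys.version_info) on the python 2.7.5 shipped with Red Hat/CentOS 7.5 would yield "PROTOCOL_TLSv1", which matches the failure reported in the issue.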

> THRIFT-3505 breaks IMPALA-5775
> --
>
> Key: IMPALA-8595
> URL: https://issues.apache.org/jira/browse/IMPALA-8595
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.1.0
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
>
> IMPALA-5690 replaced thrift  0.9.0 with 0.9.3 in which THRIFT-3505 changed 
> transport/TSSLSocket.py. 
> In thrift 0.9.3, if the python version is lower than 2.7.9, TSSLSocket uses 
> PROTOCOL_TLSv1 by default:
> {code:java}
>   # For python >= 2.7.9, use latest TLS that both client and server support.
>   # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
>   # For python < 2.7.9, use TLS 1.0 since neither TLSv1_X nor OP_NO_SSLvX is
>   # available.
>   _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else ssl.PROTOCOL_TLSv1
> {code}
> And the SSL version should be passed as an argument to TSSLSocket.__init__ 
> instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. 
> The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use 
> python lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) 
> and set ssl_minimum_version to tlsv1.2, impala-shell command can't connect to 
> impalad:
>  
> {code:java}
> # impala-shell -i impalad01.example.com
>  -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
> SSL is enabled
> No handlers could be found for logger "thrift.transport.TSSLSocket"
> Error connecting: TTransportException, Could not connect to 
> impalad01.example.com:21000: EOF occurred in violation of protocol 
> (_ssl.c:579)
> {code}
>  
>  






[jira] [Commented] (IMPALA-7106) Log the original and rewritten SQL when SQL rewrite fails

2019-05-29 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850974#comment-16850974
 ] 

ASF subversion and git services commented on IMPALA-7106:
-

Commit 1ee17ee89a579e65c2fbc55a613ac368d01161a0 in impala's branch 
refs/heads/2.x from Fredy Wijaya
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=1ee17ee ]

IMPALA-7106: Log the original and rewritten SQL when SQL rewrite fails

The toSql() method is used to print a SQL string that is close to the
original SQL string when errors arise or as the result of "SHOW CREATE".
When debugging issues related to SQL rewrites, it can be very useful to be
able to get the SQL string that is being rewritten. This patch adds a
new method toSql(boolean rewritten) to get the rewritten SQL string. This
patch also logs the original and rewritten SQL when SQL rewrite fails.

Testing:
- Added FE test for the rewritten SQL string
- Ran all FE tests

Change-Id: Iab58b0cc865135d261dd4a7f72be130f2e7bde53
Reviewed-on: http://gerrit.cloudera.org:8080/10571
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
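
A rough sketch of the idea in the commit message, written in Python rather than the frontend's Java. SelectStmt, to_sql and analyze_rewritten are hypothetical names invented for illustration; only the notion of a toSql(boolean rewritten) variant and of logging both strings on failure comes from the commit message.

```python
import logging

LOG = logging.getLogger("rewrite_sketch")


class SelectStmt(object):
    """Hypothetical statement node that keeps both SQL forms."""

    def __init__(self, original_sql, rewritten_sql=None):
        self._original_sql = original_sql
        self._rewritten_sql = rewritten_sql

    def to_sql(self, rewritten=False):
        # The default matches the user-facing behaviour: print the original SQL.
        if rewritten and self._rewritten_sql is not None:
            return self._rewritten_sql
        return self._original_sql


def analyze_rewritten(stmt, analyze):
    """Run `analyze` on the statement; log both SQL forms if it fails."""
    try:
        return analyze(stmt)
    except Exception:
        LOG.error("Original SQL: %s", stmt.to_sql())
        LOG.error("Rewritten SQL: %s", stmt.to_sql(rewritten=True))
        raise
```

Keeping the default at rewritten=False means existing callers that show SQL to users are unaffected, while debugging paths can opt in to the rewritten form.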


> Log the original and rewritten SQL when SQL rewrite fails
> -
>
> Key: IMPALA-7106
> URL: https://issues.apache.org/jira/browse/IMPALA-7106
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> toSql() prints SQL that is close to the original SQL string, which 
> makes sense for users. However, when debugging (i.e. with the log level set 
> to TRACE), it is useful to have the rewritten SQL printed out correctly.



[jira] [Commented] (IMPALA-7132) run_clang_tidy.sh produces unrelated output

2019-05-29 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850975#comment-16850975
 ] 

ASF subversion and git services commented on IMPALA-7132:
-

Commit b41a780065b60e1e807099937f06dd328652cc87 in impala's branch 
refs/heads/2.x from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b41a780 ]

IMPALA-7132: Filter out useless output from run_clang_tidy.sh

Clang's run-clang-tidy.py script produces a lot of
output even when there are no warnings or errors.
None of this output is useful.

This patch has two parts:
1. Bump LLVM to 5.0.1-p1, which has patched run-clang-tidy.py
   to make it reduce its own output when passed -quiet
   (along with other enhancements).
2. Pass -quiet to run-clang-tidy.py and pipe the stderr output
   to a temporary file. Display this output only if
   run-clang-tidy.py hits an error, as this output is not
   useful otherwise.

Testing with a known clang tidy issue shows that warnings
and errors are still in the output, and the output is
clean on a clean Impala checkout.

Change-Id: I63c46a7d57295eba38fac8ab49c7a15d2802df1d
Reviewed-on: http://gerrit.cloudera.org:8080/10615
Reviewed-by: Jim Apple 
Tested-by: Impala Public Jenkins 
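
The second part of the patch (capture stderr in a temporary file and show it only on failure) can be sketched as follows. This is an illustration of the pattern in Python, not the contents of run_clang_tidy.sh; run_quietly is a hypothetical helper and the sketch does not invoke run-clang-tidy.py itself.

```python
import subprocess
import sys
import tempfile


def run_quietly(cmd):
    """Run cmd with stderr captured in a temporary file.

    Returns (returncode, stderr_text); stderr_text is empty unless cmd failed.
    """
    with tempfile.TemporaryFile(mode="w+") as err_file:
        returncode = subprocess.call(cmd, stderr=err_file)
        if returncode == 0:
            # Success: the captured stderr is discarded along with the file.
            return returncode, ""
        err_file.seek(0)
        return returncode, err_file.read()


if __name__ == "__main__":
    # Stand-in command; a real caller would pass the tidy invocation here.
    code, err = run_quietly([sys.executable, "-c", "print('ok')"])
    if code != 0:
        sys.stderr.write(err)
```

Standard output is left untouched, so anything the tool reports there still reaches the caller.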


> run_clang_tidy.sh produces unrelated output
> ---
>
> Key: IMPALA-7132
> URL: https://issues.apache.org/jira/browse/IMPALA-7132
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> bin/run_clang_tidy.sh uses Clang's run-clang-tidy.py script, and this is 
> producing a large amount of useless output like this:
> {noformat}
> New replacement: /home/ubuntu/Impala/be/src/runtime/types.h: 7916:+0:"break; "
> Existing replacement: /home/ubuntu/Impala/be/src/runtime/types.h: 
> 7916:+0:"FALLTHROUGH_INTENDED; "
> Fix conflicts with existing fix! The new insertion has the same insert 
> location as an existing replacement.
> New replacement: /home/ubuntu/Impala/be/src/runtime/types.h: 7916:+0:"break; "
> Existing replacement: /home/ubuntu/Impala/be/src/runtime/types.h: 
> 7916:+0:"FALLTHROUGH_INTENDED; "
> 1453 warnings generated.
> Suppressed 1455 warnings (3 in non-user code, 2 NOLINT, 1450 with check 
> filters).
> Use -header-filter=.* to display errors from all non-system headers. Use 
> -system-headers to display errors from system headers as well.{noformat}
> This happens over and over with no diagnostic utility. It is being written to 
> stderr, and it seems that nothing diagnostically useful is written to stderr.



[jira] [Commented] (IMPALA-8560) Prometheus metrics support in Impala

2019-05-29 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850979#comment-16850979
 ] 

ASF subversion and git services commented on IMPALA-8560:
-

Commit c2aeb93c4f5269e2a0ad2f027ef239767abd32dd in impala's branch 
refs/heads/master from Harshil
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c2aeb93 ]

IMPALA-8560: Prometheus metrics support in Impala

-- This change adds Prometheus text exposition format metric
   generation.
-- More details can be found below:
-- https://prometheus.io/docs/instrumenting/exposition_formats
-- Added unit test to test this change

Tests:
-- Fed all these metrics to a Prometheus instance running on localhost
-- Also ran the output through "./promtool" to check for any errors in
   the Prometheus metrics format.
Change-Id: I5349085a2007b568cb97f9b8130804ea64d7bb08
Reviewed-on: http://gerrit.cloudera.org:8080/13345
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
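
For readers unfamiliar with the text exposition format, here is a minimal Python sketch of rendering one unlabelled sample. It is not the code from this change: the helper names, the example metric name and the choice to replace invalid characters with underscores are all assumptions made for illustration. It also does not escape backslashes or newlines in the help text, which the format requires.

```python
import re


def to_prometheus_name(name):
    """Replace characters outside [a-zA-Z0-9_:] with underscores."""
    sanitized = re.sub(r"[^a-zA-Z0-9_:]", "_", name)
    # A metric name may not start with a digit.
    if re.match(r"[0-9]", sanitized):
        sanitized = "_" + sanitized
    return sanitized


def render_metric(name, kind, help_text, value):
    """Render one unlabelled sample in the Prometheus text exposition format."""
    prom_name = to_prometheus_name(name)
    return "# HELP {0} {1}\n# TYPE {0} {2}\n{0} {3}\n".format(
        prom_name, help_text, kind, value)
```

Each metric is introduced by a HELP line and a TYPE line, followed by the sample line itself, which is the shape promtool checks for.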


> Prometheus metrics support in Impala
> 
>
> Key: IMPALA-8560
> URL: https://issues.apache.org/jira/browse/IMPALA-8560
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Harshil Shah
>Priority: Major
>
> -- This change adds Prometheus text exposition format metric generation in 
> Impala.
> -- More details about the text exposition format can be found here: 
> [https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format]



[jira] [Commented] (IMPALA-8585) Impala ACID tests

2019-05-29 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850980#comment-16850980
 ] 

ASF subversion and git services commented on IMPALA-8585:
-

Commit 396d57709e147fd5ae9b692a225cc2174d59df6f in impala's branch 
refs/heads/master from Zoltan Borok-Nagy
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=396d577 ]

IMPALA-8585: Add tests for partitioned ACID tables

Added e2e tests for partitioned ACID tables.

Added some unit tests for file filtering with open and
aborted write ids.

Increased 'hive.compactor.worker.threads' to 4 to make compactions
faster because they are terribly slow with only one thread.

Change-Id: I6732db306459621a11f67a7263e9e06748fa35a8
Reviewed-on: http://gerrit.cloudera.org:8080/13428
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
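
The file filtering mentioned above can be pictured with a small sketch. It assumes the usual rule that a write is visible only if its write id is at or below a high watermark and is neither open nor aborted; filter_files and its (write_id, path) input shape are hypothetical, and the sketch ignores base directories and compaction entirely.

```python
def filter_files(files, high_watermark, open_ids, aborted_ids):
    """Keep the paths whose write id is committed and visible.

    files: iterable of (write_id, path) pairs (a shape assumed for this sketch).
    """
    hidden = set(open_ids) | set(aborted_ids)
    return [path for write_id, path in files
            if write_id <= high_watermark and write_id not in hidden]
```

For example, with a high watermark of 4, write id 3 still open and write id 2 aborted, only the files written by ids 1 and 4 remain.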


> Impala ACID tests
> -
>
> Key: IMPALA-8585
> URL: https://issues.apache.org/jira/browse/IMPALA-8585
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Zoltán Borók-Nagy
>Priority: Critical
>  Labels: impala-acid
>
> Umbrella Jira for adding tests about ACID functionality, e.g.:
>  * Ordinary table that was upgraded to ACID table
>  * Inserting data in hive and querying it in Impala concurrently
>  * Compute stats interoperability between Hive and Impala
>  * Partitioned tables, dynamic partitioning



[jira] [Commented] (IMPALA-7186) Docs for kudu_read_mode

2019-05-29 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850978#comment-16850978
 ] 

ASF subversion and git services commented on IMPALA-7186:
-

Commit 2413c6c0b5a78ca4e8d918319f9986fa75eec66b in impala's branch 
refs/heads/2.x from Alex Rodoni
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=2413c6c ]

IMPALA-7186: [DOCS] Documented the KUDU_READ_MODE query option

Change-Id: I49b4ec29ae8cdbee8b3d38bdf2e678b4e9560952
Reviewed-on: http://gerrit.cloudera.org:8080/10897
Reviewed-by: Alex Rodoni 
Tested-by: Impala Public Jenkins 
Reviewed-on: http://gerrit.cloudera.org:8080/13455


> Docs for kudu_read_mode
> ---
>
> Key: IMPALA-7186
> URL: https://issues.apache.org/jira/browse/IMPALA-7186
> Project: IMPALA
>  Issue Type: Task
>  Components: Docs
>Affects Versions: Impala 3.1.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Alex Rodoni
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> IMPALA-6812 added a new query option, KUDU_READ_MODE, which should be 
> documented with something like:
> KUDU_READ_MODE Query Option
> This query option allows users to set a desired consistency level for scans 
> of Kudu tables. Possible values are DEFAULT, READ_LATEST, and 
> READ_AT_SNAPSHOT. If DEFAULT is specified, the value of the startup flag 
> '--kudu_read_mode' will be used.
> READ_LATEST
> Kudu provides no consistency guarantees for this mode, except that all 
> returned rows were committed at some point, sometimes known as 'Read 
> Committed' isolation.
> READ_AT_SNAPSHOT
> Kudu will take a snapshot of the current state of the data and perform the 
> scan over the snapshot, possibly after briefly waiting for ongoing writes to 
> complete. This provides "Read Your Writes" consistency within a single Impala 
> session, except in the case of a Kudu leader change. See the Kudu 
> documentation for more details.
> Type: string
> Default: DEFAULT
> Added in: Impala 3.1
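
The DEFAULT fallback described above can be sketched as a small resolution function. This is illustrative Python, not Impala code: effective_read_mode is a hypothetical name, and the case-insensitive matching and the assumption that the startup flag already holds a concrete mode are choices made for the sketch.

```python
VALID_MODES = ("DEFAULT", "READ_LATEST", "READ_AT_SNAPSHOT")


def effective_read_mode(query_option, startup_flag):
    """Resolve the read mode for a scan.

    query_option: value of KUDU_READ_MODE for the query.
    startup_flag: value of --kudu_read_mode (assumed to be a concrete mode).
    """
    mode = query_option.upper()
    if mode not in VALID_MODES:
        raise ValueError("Invalid KUDU_READ_MODE: %s" % query_option)
    if mode == "DEFAULT":
        # DEFAULT defers to whatever the daemon was started with.
        return startup_flag.upper()
    return mode
```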



[jira] [Commented] (IMPALA-7199) Need to have scripts to generate coverage

2019-05-29 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850977#comment-16850977
 ] 

ASF subversion and git services commented on IMPALA-7199:
-

Commit 1e946fbbea0d15e4022088e57f162cc81015ac15 in impala's branch 
refs/heads/2.x from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=1e946fb ]

IMPALA-7199: Add scripts to create code coverage reports

gcovr is a python library that uses gcov to generate
code coverage reports. This adds gcovr to the python
dependencies and adds bin/impala-gcovr to provide
easy access to gcovr's command line. gcovr 3.4
supports python 2.6+.

This also adds bin/coverage_helper.sh to provide a
simplified interface to generate reports and zero
coverage counters.

Code coverage data is written out when a program
exits, so it is important to avoid hard kills
to shut down the impalads when generating coverage.
This modifies testdata/bin/kill-all.sh to call
start-impala-cluster.py --kill when shutting down
the minicluster to try to avoid doing a hard kill.
It will still do a hard kill if impala is still
running after the softer kill.

Change-Id: I5b2e0b794c64f9343ec976de7a3f235e54d2badd
Reviewed-on: http://gerrit.cloudera.org:8080/10791
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
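
The "soft kill first, hard kill only if needed" escalation can be sketched in Python as below. stop_process is a hypothetical helper and is not what kill-all.sh or start-impala-cluster.py actually run; whether a given signal lets a process flush its coverage counters depends on how that process handles the signal, which this sketch does not address.

```python
import subprocess
import sys


def stop_process(proc, grace_seconds=10.0):
    """Ask proc to exit; force-kill only if it is still running afterwards.

    Returns "soft" if the polite request was enough, "hard" otherwise.
    """
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
        return "soft"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX
        proc.wait()
        return "hard"


if __name__ == "__main__":
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(stop_process(child))
```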


> Need to have scripts to generate coverage
> -
>
> Key: IMPALA-7199
> URL: https://issues.apache.org/jira/browse/IMPALA-7199
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
> Attachments: coverage_be_tests.tar.gz
>
>
> Code coverage can be a useful means to verify that tests are exercising the 
> code as expected. It would be useful to have a helper script to make this 
> process as simple as possible. Now that gcovr is pip installable, that is one 
> option for generating coverage reports.



[jira] [Commented] (IMPALA-8546) Collect logs from docker containers in tests

2019-05-29 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850981#comment-16850981
 ] 

ASF subversion and git services commented on IMPALA-8546:
-

Commit 7ea9a949259a7ab600eb9d7de888d22ef4ecd6b9 in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=7ea9a94 ]

IMPALA-8546: collect logs from docker containers

This modifies containers to put logs in /opt/impala/logs,
then mounts that directory to
$IMPALA_HOME/logs/.../ so that logs will
be collected on the host and scooped up by jenkins jobs.

The layout of the log directory is a little different to
the non-dockerised containers because I wanted to avoid
sharing log directories between containers.

Change-Id: I24bcaa521882d450d43d1f2ca34767e7ce36bbd2
Reviewed-on: http://gerrit.cloudera.org:8080/13393
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
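
One way to picture the per-container mount is a helper that builds the bind-mount argument for docker run. This is a sketch under assumptions: docker_log_mount_args is a hypothetical function, and the per-container subdirectory used here is a placeholder, since the commit message does not spell out the exact host layout. Only the container path /opt/impala/logs comes from the text above.

```python
import posixpath


def docker_log_mount_args(host_log_root, container_name,
                          container_log_dir="/opt/impala/logs"):
    """Return `docker run` arguments that bind-mount a per-container log dir.

    The per-container subdirectory name is a placeholder chosen for this sketch.
    """
    host_dir = posixpath.join(host_log_root, container_name)
    return ["-v", "{0}:{1}".format(host_dir, container_log_dir)]
```

Giving each container its own host directory keeps log files from different containers from colliding, which is the stated reason for the different layout.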


> Collect logs from docker containers in tests
> 
>
> Key: IMPALA-8546
> URL: https://issues.apache.org/jira/browse/IMPALA-8546
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Infrastructure
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> We should collect the logs from the cluster processes into the logs/ 
> subdirectory for debugging purposes.



[jira] [Commented] (IMPALA-7238) test_kudu.TestCreateExternalTable sees unique database already exists

2019-05-29 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850976#comment-16850976
 ] 

ASF subversion and git services commented on IMPALA-7238:
-

Commit 68caba1313a1380d9d99acf7aed209b62e43dbb6 in impala's branch 
refs/heads/2.x from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=68caba1 ]

IMPALA-7238: Use custom timeout for create unique database

test_kudu.TestCreateExternalTable saw a timeout when
creating the unique database for its tests.

__unique_conn() opens a connection, creates a unique database,
then returns another connection in that database. It takes
a custom timeout argument, but the timeout is only for the
returned connection. The first connection to create the
unique database uses the default timeout of 45 seconds.

This patch changes the first connection to use the custom
timeout. For Kudu tests, this is 5 minutes rather than 45
seconds.

Change-Id: I4f2beb5bc027a4bb44e854bf1dd8919807a92ea0
Reviewed-on: http://gerrit.cloudera.org:8080/10862
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
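
The shape of the fix, as described above, is that the custom timeout applies to both connections. The sketch below is not the test framework's __unique_conn(); unique_conn and the `connect` factory signature are hypothetical and exist only to show the timeout being threaded through both calls.

```python
def unique_conn(connect, db_name, timeout_sec):
    """Create db_name, then return a connection that uses it.

    `connect` is a hypothetical factory: connect(database=None, timeout_sec=...).
    """
    # Setup connection: in the buggy version this one used the default timeout.
    setup = connect(timeout_sec=timeout_sec)
    try:
        setup.execute("CREATE DATABASE " + db_name)
    finally:
        setup.close()
    # Returned connection: this one already honoured the custom timeout.
    return connect(database=db_name, timeout_sec=timeout_sec)
```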


> test_kudu.TestCreateExternalTable sees unique database already exists
> -
>
> Key: IMPALA-7238
> URL: https://issues.apache.org/jira/browse/IMPALA-7238
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Critical
>  Labels: broken-build, flaky
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> All of the tests from query_test.test_kudu.TestCreateExternalTable fail with 
> an error like:
> {noformat}
> /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py:704:
>  in err_if_rpc_not_ok
> raise HiveServer2Error(resp.status.errorMessage)
> E   HiveServer2Error: ImpalaRuntimeException: Error making 'createDatabase' 
> RPC to Hive Metastore: 
> E   CAUSED BY: AlreadyExistsException: Database 
> testcreateexternaltable_23808_vu8cqo already exists{noformat}
> It looks like the failures all happen at once in a single process. The first 
> test to fail is test_kudu.TestCreateExternalTable.test_col_types. It takes 52 
> seconds where all the other tests take no time. It also has an extra error on 
> stderr:
> {noformat}
> -- connecting to: localhost:21000
> MainThread: Failed to open transport (tries_left=3)
> Traceback (most recent call last):
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 940, in _execute
> return func(request)
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py",
>  line 265, in ExecuteStatement
> return self.recv_ExecuteStatement()
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py",
>  line 276, in recv_ExecuteStatement
> (fname, mtype, rseqid) = self._iprot.readMessageBegin()
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py",
>  line 126, in readMessageBegin
> sz = self.readI32()
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py",
>  line 206, in readI32
> buff = self.trans.readAll(4)
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py",
>  line 58, in readAll
> chunk = self.read(sz - have)
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py",
>  line 159, in read
> self.__rbuf = StringIO(self.__trans.read(max(sz, self.__rbuf_size)))
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSocket.py",
>  line 105, in read
> buff = self.handle.recv(sz)
> timeout: timed out
> MainThread: Error closing Impala cursor: Invalid session id: 
> f54064f9a4604f23:fb686144269fc8b1{noformat}
> The other failures don't have this.
> This happened only once, so it is definitely intermittent. This has some 
> similarity to IMPALA-6933, but this looks like a repeated failure in a single 
> process, not a concurrency issue.




[jira] [Resolved] (IMPALA-8546) Collect logs from docker containers in tests

2019-05-29 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-8546.
---
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Collect logs from docker containers in tests
> 
>
> Key: IMPALA-8546
> URL: https://issues.apache.org/jira/browse/IMPALA-8546
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Infrastructure
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> We should collect the logs from the cluster processes into the logs/ 
> subdirectory for debugging purposes.



[jira] [Reopened] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775

2019-05-29 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reopened IMPALA-8595:
---

Sorry just saw you had a patch. Will leave it open in case there's a workaround.


> THRIFT-3505 breaks IMPALA-5775
> --
>
> Key: IMPALA-8595
> URL: https://issues.apache.org/jira/browse/IMPALA-8595
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.1.0
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
>
> IMPALA-5690 replaced thrift  0.9.0 with 0.9.3 in which THRIFT-3505 changed 
> transport/TSSLSocket.py. 
> In thrift 0.9.3, if the python version is lower than 2.7.9, TSSLSocket uses 
> PROTOCOL_TLSv1 by default:
> {code:java}
>   # For python >= 2.7.9, use latest TLS that both client and server supports.
>   # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
>   # For python < 2.7.9, use TLS 1.0 since TLSv1_X and OP_NO_SSLvX are
>   # unavailable.
>   _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else \
>       ssl.PROTOCOL_TLSv1
> {code}
> And the SSL version should be passed as an argument to TSSLSocket.__init__ 
> instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. 
> The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use 
> python lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) 
> and set ssl_minimum_version to tlsv1.2, impala-shell command can't connect to 
> impalad:
>  
> {code:java}
> # impala-shell -i impalad01.example.com
>  -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
> SSL is enabled
> No handlers could be found for logger "thrift.transport.TSSLSocket"
> Error connecting: TTransportException, Could not connect to 
> impalad01.example.com:21000: EOF occurred in violation of protocol 
> (_ssl.c:579)
> {code}
>  
>  



[jira] [Commented] (IMPALA-8594) Support drop table for external kudu tables that are dropped in kudu

2019-05-29 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851017#comment-16851017
 ] 

Todd Lipcon commented on IMPALA-8594:
-

Did you hit this with LocalCatalog enabled or without? IMPALA-8459 is a known 
issue with LocalCatalog that's on my todo list to address, but I didn't think 
this was an issue for Catalog V1.

> Support drop table for external kudu tables that are dropped in kudu
> 
>
> Key: IMPALA-8594
> URL: https://issues.apache.org/jira/browse/IMPALA-8594
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: Manish Maheshwari
>Priority: Critical
>
> External kudu tables in Impala cannot be dropped from HMS if the kudu table 
> is already dropped in kudu. This causes HMS to be out of sync with kudu 
> metadata.
> Impala should clean up HMS table info when a drop is executed for an external 
> table that does not exist in kudu.
>  
> cc - [~balazsj_impala_220b] [~tlipcon]



[jira] [Commented] (IMPALA-8577) Crash during OpenSSLSocket.read

2019-05-29 Thread Sahil Takiar (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851058#comment-16851058
 ] 

Sahil Takiar commented on IMPALA-8577:
--

I can actually reproduce this pretty easily: a single-threaded TPC-DS 
execution on an ASAN build hits it on the first iteration. I'm going to play 
around with netty-tcnative and see if it produces the error as well. It's 
possible there is a bug in the AWS SDK that is causing this; if that is the 
case, then the crash should reproduce with netty-tcnative as well.

> Crash during OpenSSLSocket.read
> ---
>
> Key: IMPALA-8577
> URL: https://issues.apache.org/jira/browse/IMPALA-8577
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.3.0
>Reporter: David Rorke
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: 5ca78771-ad78-4a29-31f88aa6-9bfac38c.dmp, 
> hs_err_pid6313.log, 
> impalad.drorke-impala-r5d2xl2-30w-17.vpc.cloudera.com.impala.log.ERROR.20190521-103105.6313,
>  
> impalad.drorke-impala-r5d2xl2-30w-17.vpc.cloudera.com.impala.log.INFO.20190521-103105.6313
>
>
> Impalad crashed while running a TPC-DS 10 TB run against S3.   Excerpt from 
> the stack trace (hs_err log file attached with more complete stack):
> {noformat}
> Stack: [0x7f3d095bc000,0x7f3d09dbc000],  sp=0x7f3d09db9050,  free 
> space=8180k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
> code)
> C  [impalad+0x2528a33]  
> tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*,
>  unsigned long, int)+0x133
> C  [impalad+0x2528e0f]  tcmalloc::ThreadCache::Scavenge()+0x3f
> C  [impalad+0x266468a]  operator delete(void*)+0x32a
> C  [libcrypto.so.10+0x6e70d]  CRYPTO_free+0x1d
> J 5709  org.wildfly.openssl.SSLImpl.freeBIO0(J)V (0 bytes) @ 
> 0x7f3d4dadf9f9 [0x7f3d4dadf940+0xb9]
> J 5708 C1 org.wildfly.openssl.SSLImpl.freeBIO(J)V (5 bytes) @ 
> 0x7f3d4dfd0dfc [0x7f3d4dfd0d80+0x7c]
> J 5158 C1 org.wildfly.openssl.OpenSSLEngine.shutdown()V (78 bytes) @ 
> 0x7f3d4de4fe2c [0x7f3d4de4f720+0x70c]
> J 5758 C1 org.wildfly.openssl.OpenSSLEngine.closeInbound()V (51 bytes) @ 
> 0x7f3d4de419cc [0x7f3d4de417c0+0x20c]
> J 2994 C2 
> org.wildfly.openssl.OpenSSLEngine.unwrap(Ljava/nio/ByteBuffer;[Ljava/nio/ByteBuffer;II)Ljavax/net/ssl/SSLEngineResult;
>  (892 bytes) @ 0x7f3d4db8da34 [0x7f3d4db8c900+0x1134]
> J 3161 C2 org.wildfly.openssl.OpenSSLSocket.read([BII)I (810 bytes) @ 
> 0x7f3d4dd64cb0 [0x7f3d4dd646c0+0x5f0]
> J 5090 C2 
> com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.fillBuffer()I
>  (97 bytes) @ 0x7f3d4ddd9ee0 [0x7f3d4ddd9e40+0xa0]
> J 5846 C1 
> com.amazonaws.thirdparty.apache.http.impl.BHttpConnectionBase.fillInputBuffer(I)I
>  (48 bytes) @ 0x7f3d4d7acb24 [0x7f3d4d7ac7a0+0x384]
> J 5845 C1 
> com.amazonaws.thirdparty.apache.http.impl.BHttpConnectionBase.isStale()Z (31 
> bytes) @ 0x7f3d4d7ad49c [0x7f3d4d7ad220+0x27c]
> {noformat}
> The crash may not be easy to reproduce.  I've run this test multiple times 
> and only crashed once.   I have a core file if needed.



[jira] [Resolved] (IMPALA-8435) Prohibit unsupported operations on transactional tables

2019-05-29 Thread Csaba Ringhofer (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer resolved IMPALA-8435.
-
   Resolution: Done
Fix Version/s: Impala 3.3.0

> Prohibit unsupported operations on transactional tables
> ---
>
> Key: IMPALA-8435
> URL: https://issues.apache.org/jira/browse/IMPALA-8435
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Sudhanshu Arora
>Assignee: Csaba Ringhofer
>Priority: Critical
>  Labels: impala-acid
> Fix For: Impala 3.3.0
>
>
> For transactional tables, prohibit unsupported statements such as any access 
> to a full-ACID table, or compute stats, alter, or write on an insert-only 
> transactional table.



[jira] [Created] (IMPALA-8597) Improve session maintenance timing logic

2019-05-29 Thread Thomas Tauber-Marshall (JIRA)
Thomas Tauber-Marshall created IMPALA-8597:
--

 Summary: Improve session maintenance timing logic
 Key: IMPALA-8597
 URL: https://issues.apache.org/jira/browse/IMPALA-8597
 Project: IMPALA
  Issue Type: Improvement
Reporter: Thomas Tauber-Marshall


Currently, the coordinator maintains a list of the timeout lengths for all 
sessions that have an idle_session_timeout set. The original intention of this 
was to have the thread that checks for timeouts wake up at an interval of  / 2, but this resulted in IMPALA-5108

The fix for that bug changed the session maintenance thread to wake up every 1 
second if any timeout is registered, but we still maintain the list of timeout 
values even though only the length of the list is ever used.

Given that the default config is for there to be no session timeouts and that 
the maintenance thread is somewhat inefficient in holding the 
session_state_map_ lock for almost its entire execution, we may want to keep 
the behavior of only waking up once per second if there are any registered 
timeouts, in which case it would be more efficient to just maintain a count of 
timeouts instead of the list.

Or, we may want to just simplify the logic and have the thread always wake up 
once per second, without tracking the registered timeouts at all (especially 
with the new work in IMPALA-1653, which adds closing of disconnected sessions 
to the maintenance thread), in which case we might want to consider ways to 
avoid holding the session_state_map_ lock for so long, e.g. by sharding it the 
way we did with the client_request_state_map_.
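
The first option (keep a count of registered timeouts rather than a list) might look roughly like this. It is a Python sketch of the idea only, not the coordinator's C++ code; SessionTimeoutTracker and its method names are hypothetical, and the one-second interval is taken from the description above.

```python
import threading


class SessionTimeoutTracker(object):
    """Hypothetical tracker that keeps a count of registered timeouts, not a list."""

    def __init__(self):
        self._lock = threading.Lock()
        self._num_timeouts = 0

    def register(self, timeout_sec):
        # Only sessions with a non-zero idle timeout are counted.
        if timeout_sec > 0:
            with self._lock:
                self._num_timeouts += 1

    def unregister(self, timeout_sec):
        if timeout_sec > 0:
            with self._lock:
                self._num_timeouts -= 1

    def wake_interval_sec(self):
        """Return 1 if any timeout is registered, else None (no periodic wake-up)."""
        with self._lock:
            return 1 if self._num_timeouts > 0 else None
```

Since only the emptiness of the list is ever consulted, a counter carries the same information with constant memory and no per-value bookkeeping.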



[jira] [Commented] (IMPALA-8577) Crash during OpenSSLSocket.read

2019-05-29 Thread Da Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851329#comment-16851329
 ] 

Da Zhou commented on IMPALA-8577:
-

No, I didn't observe this issue on the ABFS side.

The only issue I have found so far is that Wildfly-OpenSSL currently doesn't 
support Server Name Indication (SNI), but this should be fixed soon: 
[https://github.com/wildfly/wildfly-openssl/issues/59]

> Crash during OpenSSLSocket.read
> ---
>
> Key: IMPALA-8577
> URL: https://issues.apache.org/jira/browse/IMPALA-8577
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.3.0
>Reporter: David Rorke
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: 5ca78771-ad78-4a29-31f88aa6-9bfac38c.dmp, 
> hs_err_pid6313.log, 
> impalad.drorke-impala-r5d2xl2-30w-17.vpc.cloudera.com.impala.log.ERROR.20190521-103105.6313,
>  
> impalad.drorke-impala-r5d2xl2-30w-17.vpc.cloudera.com.impala.log.INFO.20190521-103105.6313
>
>
> Impalad crashed while running a TPC-DS 10 TB run against S3.   Excerpt from 
> the stack trace (hs_err log file attached with more complete stack):
> {noformat}
> Stack: [0x7f3d095bc000,0x7f3d09dbc000],  sp=0x7f3d09db9050,  free 
> space=8180k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
> code)
> C  [impalad+0x2528a33]  
> tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*,
>  unsigned long, int)+0x133
> C  [impalad+0x2528e0f]  tcmalloc::ThreadCache::Scavenge()+0x3f
> C  [impalad+0x266468a]  operator delete(void*)+0x32a
> C  [libcrypto.so.10+0x6e70d]  CRYPTO_free+0x1d
> J 5709  org.wildfly.openssl.SSLImpl.freeBIO0(J)V (0 bytes) @ 
> 0x7f3d4dadf9f9 [0x7f3d4dadf940+0xb9]
> J 5708 C1 org.wildfly.openssl.SSLImpl.freeBIO(J)V (5 bytes) @ 
> 0x7f3d4dfd0dfc [0x7f3d4dfd0d80+0x7c]
> J 5158 C1 org.wildfly.openssl.OpenSSLEngine.shutdown()V (78 bytes) @ 
> 0x7f3d4de4fe2c [0x7f3d4de4f720+0x70c]
> J 5758 C1 org.wildfly.openssl.OpenSSLEngine.closeInbound()V (51 bytes) @ 
> 0x7f3d4de419cc [0x7f3d4de417c0+0x20c]
> J 2994 C2 
> org.wildfly.openssl.OpenSSLEngine.unwrap(Ljava/nio/ByteBuffer;[Ljava/nio/ByteBuffer;II)Ljavax/net/ssl/SSLEngineResult;
>  (892 bytes) @ 0x7f3d4db8da34 [0x7f3d4db8c900+0x1134]
> J 3161 C2 org.wildfly.openssl.OpenSSLSocket.read([BII)I (810 bytes) @ 
> 0x7f3d4dd64cb0 [0x7f3d4dd646c0+0x5f0]
> J 5090 C2 
> com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.fillBuffer()I
>  (97 bytes) @ 0x7f3d4ddd9ee0 [0x7f3d4ddd9e40+0xa0]
> J 5846 C1 
> com.amazonaws.thirdparty.apache.http.impl.BHttpConnectionBase.fillInputBuffer(I)I
>  (48 bytes) @ 0x7f3d4d7acb24 [0x7f3d4d7ac7a0+0x384]
> J 5845 C1 
> com.amazonaws.thirdparty.apache.http.impl.BHttpConnectionBase.isStale()Z (31 
> bytes) @ 0x7f3d4d7ad49c [0x7f3d4d7ad220+0x27c]
> {noformat}
> The crash may not be easy to reproduce.  I've run this test multiple times 
> and only crashed once.   I have a core file if needed.






[jira] [Created] (IMPALA-8598) Impala/Atlas integration

2019-05-29 Thread Fredy Wijaya (JIRA)
Fredy Wijaya created IMPALA-8598:


 Summary: Impala/Atlas integration
 Key: IMPALA-8598
 URL: https://issues.apache.org/jira/browse/IMPALA-8598
 Project: IMPALA
  Issue Type: Epic
  Components: Backend, Catalog, Frontend
Reporter: Fredy Wijaya


Impala needs to be able to provide an Atlas integration.






[jira] [Updated] (IMPALA-8564) Add table's createTime information in the column lineage information

2019-05-29 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya updated IMPALA-8564:
-
Issue Type: Sub-task  (was: Improvement)
Parent: IMPALA-8598

> Add table's createTime information in the column lineage information
> 
>
> Key: IMPALA-8564
> URL: https://issues.apache.org/jira/browse/IMPALA-8564
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Affects Versions: Impala 3.2.0
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Major
>
> This is needed for https://issues.apache.org/jira/browse/ATLAS-3080






[jira] [Updated] (IMPALA-8574) Review test coverage for query hook feature

2019-05-29 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya updated IMPALA-8574:
-
Issue Type: Sub-task  (was: Improvement)
Parent: IMPALA-8598

> Review test coverage for query hook feature
> ---
>
> Key: IMPALA-8574
> URL: https://issues.apache.org/jira/browse/IMPALA-8574
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: radford nguyen
>Assignee: radford nguyen
>Priority: Major
>
> Placeholder: description coming soon






[jira] [Updated] (IMPALA-8573) Implement timeout for query hook execution

2019-05-29 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya updated IMPALA-8573:
-
Issue Type: Sub-task  (was: Improvement)
Parent: IMPALA-8598

> Implement timeout for query hook execution
> --
>
> Key: IMPALA-8573
> URL: https://issues.apache.org/jira/browse/IMPALA-8573
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: radford nguyen
>Priority: Major
>
> Placeholder: description coming soon






[jira] [Updated] (IMPALA-8473) Refactor lineage publication mechanism to allow for different consumers

2019-05-29 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya updated IMPALA-8473:
-
Issue Type: Sub-task  (was: Improvement)
Parent: IMPALA-8598

> Refactor lineage publication mechanism to allow for different consumers
> ---
>
> Key: IMPALA-8473
> URL: https://issues.apache.org/jira/browse/IMPALA-8473
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend, Frontend
>Reporter: radford nguyen
>Assignee: radford nguyen
>Priority: Critical
> Attachments: ImpalaPostExecHook-infra.patch
>
>
> Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
> h3. Design Proposal
> Implement a plugin approach (similar to {{authorization_provider}}) for 
> consuming query event hooks, where downstream users can provide their own 
> hook implementations as runtime dependencies.
> Keep but deprecate existing lineage event file writing.
> [~mad...@apache.org] has provided a fe patch (attached) with suggested 
> mechanism for allowing multiple hooks to be registered with the fe.  Hooks 
> would be invoked from the be at appropriate places, e.g. 
> [https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
>   The hooks should all be executed asynchronously, so the current thinking is 
> that this execution should happen in the fe, since the be does not know about 
> what hooks are registered.  IOW, the 
> {{ImpalaPostExecHookFactory.executeHooks}} method (see patch) should probably 
> make use of a thread-pool executor service (or something similar) in order to 
> execute all hooks in parallel and in a non-blocking manner, returning to the 
> be asap.
>  
> h3. Code Review
> [https://gerrit.cloudera.org/#/c/13352/]
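
A minimal sketch of the non-blocking hook execution described in the design proposal, written in Python for brevity rather than the Java of the fe. HookRunner, execute_hooks, and the worker count are illustrative names and are not taken from the attached patch. It assumes each hook is a callable that accepts one event object.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

log = logging.getLogger("query_hooks")


class HookRunner:
    """Runs registered hooks on a thread pool so the caller is not blocked."""

    def __init__(self, hooks, max_workers=4):
        self._hooks = list(hooks)
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def execute_hooks(self, event):
        # Submit every hook and return right away; the futures let a caller
        # observe the outcome later if it wants to.
        return [self._pool.submit(self._run_one, hook, event) for hook in self._hooks]

    @staticmethod
    def _run_one(hook, event):
        # A failing hook is logged and reported as False; it does not
        # affect the other hooks or the submitting thread.
        try:
            hook(event)
            return True
        except Exception:
            log.exception("query event hook failed")
            return False

    def shutdown(self):
        # Wait for in-flight hooks before releasing the worker threads.
        self._pool.shutdown(wait=True)
```

Isolating each hook inside its own task is one way to keep a slow or failing consumer from delaying the return to the be. Whether failures should be surfaced to the query is a separate design decision that this sketch does not address.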






[jira] [Updated] (IMPALA-8589) Fix flaky query event hook tests

2019-05-29 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya updated IMPALA-8589:
-
Issue Type: Sub-task  (was: Bug)
Parent: IMPALA-8598

> Fix flaky query event hook tests
> 
>
> Key: IMPALA-8589
> URL: https://issues.apache.org/jira/browse/IMPALA-8589
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Infrastructure
>Reporter: Fredy Wijaya
>Assignee: radford nguyen
>Priority: Major
>
> The test_query_event_hooks.py tests are flaky. We need to figure out a way to 
> deflake them.






[jira] [Updated] (IMPALA-8572) Move query hook execution to before query unregistration

2019-05-29 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya updated IMPALA-8572:
-
Issue Type: Sub-task  (was: Improvement)
Parent: IMPALA-8598

> Move query hook execution to before query unregistration
> 
>
> Key: IMPALA-8572
> URL: https://issues.apache.org/jira/browse/IMPALA-8572
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: radford nguyen
>Priority: Major
>
> Placeholder: description coming soon






[jira] [Updated] (IMPALA-8571) Make query-hook-execution more robust and observable

2019-05-29 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya updated IMPALA-8571:
-
Issue Type: Sub-task  (was: Improvement)
Parent: IMPALA-8598

> Make query-hook-execution more robust and observable
> 
>
> Key: IMPALA-8571
> URL: https://issues.apache.org/jira/browse/IMPALA-8571
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: radford nguyen
>Assignee: radford nguyen
>Priority: Major
>
> Placeholder: description coming soon






[jira] [Resolved] (IMPALA-8473) Refactor lineage publication mechanism to allow for different consumers

2019-05-29 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya resolved IMPALA-8473.
--
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Refactor lineage publication mechanism to allow for different consumers
> ---
>
> Key: IMPALA-8473
> URL: https://issues.apache.org/jira/browse/IMPALA-8473
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend, Frontend
>Reporter: radford nguyen
>Assignee: radford nguyen
>Priority: Critical
> Fix For: Impala 3.3.0
>
> Attachments: ImpalaPostExecHook-infra.patch
>
>
> Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
> h3. Design Proposal
> Implement a plugin approach (similar to {{authorization_provider}}) for 
> consuming query event hooks, where downstream users can provide their own 
> hook implementations as runtime dependencies.
> Keep but deprecate existing lineage event file writing.
> [~mad...@apache.org] has provided a fe patch (attached) with suggested 
> mechanism for allowing multiple hooks to be registered with the fe.  Hooks 
> would be invoked from the be at appropriate places, e.g. 
> [https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
>   The hooks should all be executed asynchronously, so the current thinking is 
> that this execution should happen in the fe, since the be does not know about 
> what hooks are registered.  IOW, the 
> {{ImpalaPostExecHookFactory.executeHooks}} method (see patch) should probably 
> make use of a thread-pool executor service (or something similar) in order to 
> execute all hooks in parallel and in a non-blocking manner, returning to the 
> be asap.
>  
> h3. Code Review
> [https://gerrit.cloudera.org/#/c/13352/]






[jira] [Created] (IMPALA-8599) Create a separate Maven module for query event hook API

2019-05-29 Thread Fredy Wijaya (JIRA)
Fredy Wijaya created IMPALA-8599:


 Summary: Create a separate Maven module for query event hook API
 Key: IMPALA-8599
 URL: https://issues.apache.org/jira/browse/IMPALA-8599
 Project: IMPALA
  Issue Type: Sub-task
  Components: Frontend
Reporter: Fredy Wijaya









[jira] [Updated] (IMPALA-8599) Create a separate Maven module for query event hook API

2019-05-29 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya updated IMPALA-8599:
-
Description: Impala needs to publish this API into Maven Central so that 
Atlas can consume it.

> Create a separate Maven module for query event hook API
> ---
>
> Key: IMPALA-8599
> URL: https://issues.apache.org/jira/browse/IMPALA-8599
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Fredy Wijaya
>Priority: Major
>
> Impala needs to publish this API into Maven Central so that Atlas can consume 
> it.






[jira] [Updated] (IMPALA-8576) Pass lineage object instead of string to query hook

2019-05-29 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya updated IMPALA-8576:
-
Issue Type: Sub-task  (was: Improvement)
Parent: IMPALA-8598

> Pass lineage object instead of string to query hook
> ---
>
> Key: IMPALA-8576
> URL: https://issues.apache.org/jira/browse/IMPALA-8576
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend, Frontend
>Reporter: radford nguyen
>Priority: Major
>
> Placeholder: description coming soon






[jira] [Resolved] (IMPALA-5555) Add timeline event to query profile to indicate that it is finished

2019-05-29 Thread Ethan (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan resolved IMPALA-.
---
Resolution: Fixed

> Add timeline event to query profile to indicate that it is finished
> ---
>
> Key: IMPALA-
> URL: https://issues.apache.org/jira/browse/IMPALA-
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.8.0
>Reporter: Jenny Kim
>Assignee: Ethan
>Priority: Major
>  Labels: observability, supportability
>
> Hue added a feature where after a user runs a query in Impala, we check the 
> Query Profile (from the ImpalaD Web UI) for the RowsProduced statistic (from 
> the Coordinator Fragment) and report that back as the total rows returned.
> We're noticing that for some long running queries, the RowsProduced will be 
> incorrect (reporting 4 despite getting 198 rows) *right* after the query is 
> complete, but will be correct a few seconds later (validated by checking the 
> query profile manually). We discovered that by adding a latency of a few 
> seconds, we can usually get the correct RowsProduced.
> But I was wondering if there's something smarter we can do, by checking 
> either a value in the query profile itself, or somewhere else. We tried 
> checking the hasResults value on the Thrift result handle as well as the 
> status of the operation handle, but unfortunately these don't seem to have 
> any effect (i.e. - they can be True or SUCCESSFUL even though the query 
> profile doesn't have the right RowsProduced number).
> Can something be added to the Query Profile itself to indicate that the 
> RowsProduced is correct?
> EDIT: The original intent, guaranteeing that the value for RowsProduced is 
> final by relying on profile finalization, is not the right way to go, as 
> documented in the discussion below. It still makes sense to add a profile 
> finalization counter to indicate that the final update has been received 
> from the last fragment.






[jira] [Commented] (IMPALA-8489) TestRecoverPartitions.test_post_invalidate fails with IllegalStateException with local catalog

2019-05-29 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851376#comment-16851376
 ] 

Todd Lipcon commented on IMPALA-8489:
-

Ah, it seems this is due to --hms_event_polling_interval_s=1 rather than local 
catalog (I can repro with polling enabled, but if I turn off polling and keep 
localcatalog, it passes). Taking a look.

> TestRecoverPartitions.test_post_invalidate fails with IllegalStateException 
> with local catalog
> --
>
> Key: IMPALA-8489
> URL: https://issues.apache.org/jira/browse/IMPALA-8489
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Todd Lipcon
>Priority: Critical
>
> {noformat}
> metadata/test_recover_partitions.py:279: in test_post_invalidate
> "INSERT INTO TABLE %s PARTITION(i=002, p='p2') VALUES(4)" % FQ_TBL_NAME)
> common/impala_test_suite.py:620: in wrapper
> return function(*args, **kwargs)
> common/impala_test_suite.py:628: in execute_query_expect_success
> result = cls.__execute_query(impalad_client, query, query_options, user)
> common/impala_test_suite.py:722: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:180: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:187: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:364: in __execute_query
> self.wait_for_finished(handle)
> beeswax/impala_beeswax.py:385: in wait_for_finished
> raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> E    Query aborted:IllegalArgumentException: no such partition id 6244
> {noformat}
> The failure is reproducible for me locally with catalog v2.






[jira] [Commented] (IMPALA-7294) TABLESAMPLE clause allocates arrays based on total file count instead of selected partitions

2019-05-29 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851392#comment-16851392
 ] 

ASF subversion and git services commented on IMPALA-7294:
-

Commit 9f372f0d5df734c682c4c4cb237fc17e3e3f7bb0 in impala's branch 
refs/heads/2.x from Todd Lipcon
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=9f372f0 ]

IMPALA-7294. TABLESAMPLE should not allocate array based on total table file 
count

This changes HdfsTable.getFilesSample() to allocate its intermediate
sampling array based on the number of files in the selected
(post-pruning) partitions, rather than the total number of files in the
table. While the former behavior was correct (the total file count is of
course an upper bound on the pruned file count), it was an unnecessarily
large allocation, which has some downsides around garbage collection.

In addition, this is important for the LocalCatalog implementation of
table sampling, since we do not want to have to load all partition file
lists in order to compute a sample over a pruned subset of partitions.

The original code indicated that this was an optimization to avoid
looping over the partition list an extra time. However, typical
partition lists are relatively small even in the worst case (order of
100k) and looping over 100k in-memory Java objects is not likely to be
the bottleneck in planning any query. This is especially true
considering that we loop over that same list later in the function
anyway, so we probably aren't saving page faults or LLC cache misses
either.

In testing this change I noticed that the existing test for TABLESAMPLE
didn't test TABLESAMPLE when applied in conjunction with a predicate.
I added a new dimension to the test which employs a predicate which
prunes some partitions to ensure that the code works in that case.
I also added coverage of the "100%" sampling parameter as a sanity check
that it returns the same results as a non-sampled query.

Change-Id: I0248d89bcd9dd4ff8b4b85fef282c19e3fe9bdd5
Reviewed-on: http://gerrit.cloudera.org:8080/10936
Reviewed-by: Philip Zeyliger 
Reviewed-by: Vuk Ercegovac 
Tested-by: Impala Public Jenkins 


> TABLESAMPLE clause allocates arrays based on total file count instead of 
> selected partitions
> 
>
> Key: IMPALA-7294
> URL: https://issues.apache.org/jira/browse/IMPALA-7294
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Fix For: Impala 3.1.0
>
>
> The HdfsTable.getFilesSample function takes a list of input partitions to 
> sample files from, but then, when allocating an array to sample into, sizes 
> that array based on the total file count across all partitions. This is an 
> unnecessarily large array, which is expensive to allocate (may cause full GC 
> when the heap is fragmented). The code claims this to be an optimization:
> {code}
> // Use max size to avoid looping over inputParts for the exact size.
> {code}
> ...but I think the loop over inputParts is likely to be trivial here since 
> we'll loop over them anyway later in the function and thus will already be 
> pulled into CPU cache, etc. This is also necessary for fine-grained metadata 
> loading in the impalad -- for a large table with many partitions, we don't 
> want to load the file lists of all partitions just to tablesample from one 
> partition.
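
The sizing change described above can be sketched as follows. This is a simplified Python illustration, not the Java code of HdfsTable.getFilesSample: it samples by file count rather than by bytes, and the function name, parameters, and seed handling are assumptions made for the example. The point it shows is that the intermediate buffer is sized from the selected partitions only, at the cost of one extra pass over that list.

```python
import random


def sample_files(partitions, selected, percent, seed=0):
    """Return roughly `percent` of the files in the selected partitions.

    partitions: dict mapping partition name -> list of file names
    selected:   names of the partitions that survived pruning
    """
    # Extra pass over the selected partitions to get the exact size,
    # instead of using the table-wide file count as an upper bound.
    num_files = sum(len(partitions[name]) for name in selected)

    # Buffer sized for the pruned subset only.
    candidates = [None] * num_files
    i = 0
    for name in selected:
        for file_name in partitions[name]:
            candidates[i] = file_name
            i += 1

    # Deterministic shuffle so the same seed yields the same sample.
    random.Random(seed).shuffle(candidates)
    keep = max(1, num_files * percent // 100) if num_files else 0
    return candidates[:keep]
```

Because only the selected partitions are read, a caller never needs the file lists of partitions that were pruned away, which is the property the LocalCatalog case depends on.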






[jira] [Commented] (IMPALA-7014) Disable stacktrace symbolisation by default

2019-05-29 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851393#comment-16851393
 ] 

ASF subversion and git services commented on IMPALA-7014:
-

Commit 10a65efd5d14b3368718bc9ff394833921d91f9b in impala's branch 
refs/heads/2.x from Zoram Thanga
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=10a65ef ]

IMPALA-7014: Disable stacktrace symbolisation by default

Stacktrace symbolization has been shown to be 2500x slower
compared to just printing the un-symbolized one.

This has burned us a few times now, so let's disable it by
default.

Change-Id: If3af209890ccc242beb742145c63eb6836d4bfbb
Reviewed-on: http://gerrit.cloudera.org:8080/10964
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Disable stacktrace symbolisation by default
> ---
>
> Key: IMPALA-7014
> URL: https://issues.apache.org/jira/browse/IMPALA-7014
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Not Applicable
>Reporter: Tim Armstrong
>Assignee: Zoram Thanga
>Priority: Critical
> Fix For: Impala 3.1.0
>
>
> We got burned by the cost of producing a stacktrace again with IMPALA-6996. I 
> did a quick investigation into this, based on the hypothesis that the 
> symbolisation was the expensive part, rather than getting the addresses. I 
> added a stopwatch to GetStackTrace() to measure the time in nanoseconds and 
> ran a test that produces a backtrace.
> The first experiment was 
> {noformat}
> $ start-impala-cluster.py --impalad_args='--symbolize_stacktrace=true' && 
> impala-py.test tests/query_test/test_scanners.py -k codec
> I0511 09:45:11.897944 30904 debug-util.cc:283] stacktrace time: 75175573
> I0511 09:45:11.897956 30904 status.cc:125] File 
> 'hdfs://localhost:20500/test-warehouse/test_bad_compression_codec_308108.db/bad_codec/bad_codec.parquet'
>  uses an unsupported compression: 5000 for column 'id'.
> @  0x18782ef  impala::Status::Status()
> @  0x2cbe96f  
> impala::ParquetMetadataUtils::ValidateRowGroupColumn()
> @  0x205f597  impala::BaseScalarColumnReader::Reset()
> @  0x1feebe6  impala::HdfsParquetScanner::InitScalarColumns()
> @  0x1fe6ff3  impala::HdfsParquetScanner::NextRowGroup()
> @  0x1fe58d8  impala::HdfsParquetScanner::GetNextInternal()
> @  0x1fe3eea  impala::HdfsParquetScanner::ProcessSplit()
> @  0x1f6ba36  impala::HdfsScanNode::ProcessSplit()
> @  0x1f6adc4  impala::HdfsScanNode::ScannerThread()
> @  0x1f6a1c4  
> _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv
> @  0x1f6c2a6  
> _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
> @  0x1bd3b1a  boost::function0<>::operator()()
> @  0x1ebecd5  impala::Thread::SuperviseThread()
> @  0x1ec6e71  boost::_bi::list5<>::operator()<>()
> @  0x1ec6d95  boost::_bi::bind_t<>::operator()()
> @  0x1ec6d58  boost::detail::thread_data<>::run()
> @  0x31b3ada  thread_proxy
> @ 0x7f9be67d36ba  start_thread
> @ 0x7f9be650941d  clone
> {noformat}
> The stacktrace took 75ms, which is pretty bad! It would be worse on a 
> production system with more memory maps.
> The next experiment was to disable it:
> {noformat}
> start-impala-cluster.py --impalad_args='--symbolize_stacktrace=false' && 
> impala-py.test tests/query_test/test_scanners.py -k codec
> I0511 09:43:47.574185 29514 debug-util.cc:283] stacktrace time: 29528
> I0511 09:43:47.574193 29514 status.cc:125] File 
> 'hdfs://localhost:20500/test-warehouse/test_bad_compression_codec_cb5d0225.db/bad_codec/bad_codec.parquet'
>  uses an unsupported compression: 5000 for column 'id'.
> @  0x18782ef
> @  0x2cbe96f
> @  0x205f597
> @  0x1feebe6
> @  0x1fe6ff3
> @  0x1fe58d8
> @  0x1fe3eea
> @  0x1f6ba36
> @  0x1f6adc4
> @  0x1f6a1c4
> @  0x1f6c2a6
> @  0x1bd3b1a
> @  0x1ebecd5
> @  0x1ec6e71
> @  0x1ec6d95
> @  0x1ec6d58
> @  0x31b3ada
> @ 0x7fbdcbdef6ba
> @ 0x7fbdcbb2541d
> {noformat}
> That's 2545x faster! If the addresses are in the statically linked binary, we 
> can use addr2line to get back the line numbers:
> {noformat}
> $ addr2line -e be/build/latest/service/impalad 0x2cbe96f
> /home/tarmstrong/Impala/incubator-impala/be/src/exec/parquet-metadata-utils.cc:166
> {noformat}
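
For reference, the speedup quoted above follows directly from the two "stacktrace time" values in the logs (both in nanoseconds). The short Python check below reproduces the arithmetic; the function and constant names are only for illustration.

```python
def speedup(slow_ns, fast_ns):
    """Ratio of the slow measurement to the fast one."""
    return slow_ns / fast_ns


# Values copied from the two log excerpts above, in nanoseconds.
SYMBOLIZED_NS = 75175573
UNSYMBOLIZED_NS = 29528

ratio = speedup(SYMBOLIZED_NS, UNSYMBOLIZED_NS)
print(f"symbolized:   {SYMBOLIZED_NS / 1e6:.1f} ms")
print(f"unsymbolized: {UNSYMBOLIZED_NS / 1e3:.1f} us")
print(f"ratio:        about {int(ratio)}x")
```

The ratio truncates to 2545, which matches the figure in the description. The commit message rounds it to 2500x.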




[jira] [Commented] (IMPALA-7059) Inconsistent privilege model between DESCRIBE and DESCRIBE DATABASE

2019-05-29 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851394#comment-16851394
 ] 

ASF subversion and git services commented on IMPALA-7059:
-

Commit ac13cb667edff8d9b91d8bf48940dbe5b2eb19db in impala's branch 
refs/heads/2.x from Fredy Wijaya
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=ac13cb6 ]

IMPALA-7059: Inconsistent privilege between DESCRIBE and DESCRIBE DATABASE

In DESCRIBE DATABASE, having VIEW_METADATA privilege allows seeing the
metadata information on the target database. Similarly, other SQL show
commands require VIEW_METADATA privilege on the target database/table.
In the prior code, DESCRIBE requires SELECT privilege on the target table
and is inconsistent with the rest of other SQL metadata commands. The
patch fixes the inconsistency by requiring DESCRIBE to use VIEW_METADATA
privilege.

Testing:
- Updated authorization tests
- Ran all FE tests

Change-Id: I37d1610a922741a6c95059c3beb7d04eb507783f
Reviewed-on: http://gerrit.cloudera.org:8080/10923
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Inconsistent privilege model between DESCRIBE and DESCRIBE DATABASE
> ---
>
> Key: IMPALA-7059
> URL: https://issues.apache.org/jira/browse/IMPALA-7059
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Major
>  Labels: security
> Fix For: Impala 3.1.0
>
>
> {noformat}
> [localhost:21000] default> grant insert on database functional to role foo_role;
> Query: grant insert on database functional to role foo_role
> Query submitted at: 2018-07-11 11:05:51 (Coordinator: 
> http://fwijaya-impala-dev3.vpc.cloudera.com:25000)
> Query progress can be monitored at: 
> http://fwijaya-impala-dev3.vpc.cloudera.com:25000/query_plan?query_id=4a45c33de497d745:8242abc4
> +---------------------------------+
> | summary                         |
> +---------------------------------+
> | Privilege(s) have been granted. |
> +---------------------------------+
> Fetched 1 row(s) in 0.13s
> [localhost:21000] default> describe functional.alltypes;
> Query: describe functional.alltypes
> Fetched 0 row(s) in 5.45s
> [localhost:21000] default> describe database functional;
> Query: describe database functional
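> +------------+-----------------------------------------------------+---------+
> | name       | location                                            | comment |
> +------------+-----------------------------------------------------+---------+
> | functional | hdfs://localhost:20500/test-warehouse/functional.db |         |
> +------------+-----------------------------------------------------+---------+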
> {noformat}
> For consistency with privileges on other metadata SQL statements, such as 
> DESCRIBE DATABASE, we should register VIEW_METADATA privilege instead of just 
> SELECT.
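
The inconsistency can be modelled with a small Python sketch. It is a toy model, not Impala's authorization code. The set of grants that satisfy VIEW_METADATA is an assumption made for the example; the transcript above only shows that an INSERT grant was enough for DESCRIBE DATABASE. All names here are hypothetical.

```python
# ASSUMPTION: grants that are treated as satisfying VIEW_METADATA.
VIEW_METADATA_ANY_OF = {"SELECT", "INSERT", "REFRESH", "ALL"}

# Privilege each statement asks for, before and after the change.
REQUIRED_BEFORE = {"DESCRIBE": "SELECT", "DESCRIBE DATABASE": "VIEW_METADATA"}
REQUIRED_AFTER = {"DESCRIBE": "VIEW_METADATA", "DESCRIBE DATABASE": "VIEW_METADATA"}


def is_authorized(statement, grants, required):
    """Check whether a role's grants satisfy what the statement requires."""
    needed = required[statement]
    if needed == "VIEW_METADATA":
        # Any one of the listed grants is enough to view metadata.
        return bool(set(grants) & VIEW_METADATA_ANY_OF)
    return needed in grants or "ALL" in grants


grants = {"INSERT"}  # mirrors "grant insert on database functional to role foo_role"
print(is_authorized("DESCRIBE", grants, REQUIRED_BEFORE))           # False
print(is_authorized("DESCRIBE DATABASE", grants, REQUIRED_BEFORE))  # True
print(is_authorized("DESCRIBE", grants, REQUIRED_AFTER))            # True
```

Under this model, a role holding only INSERT sees database metadata but not table metadata before the change, and both after it.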






[jira] [Commented] (IMPALA-7096) Ensure no memory limit exceeded regressions from IMPALA-4835 because of non-reserved memory

2019-05-29 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851391#comment-16851391
 ] 

ASF subversion and git services commented on IMPALA-7096:
-

Commit 537a4646dc1bf0a204354f23ed0905674c71eae5 in impala's branch 
refs/heads/2.x from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=537a464 ]

cleanup: extract RowBatchQueue into its own file

While looking at IMPALA-7096, I noticed that RowBatchQueue was
implemented in a strange place.

Change-Id: I3577c1c6920b8cf858c8d49f8812ccc305d833f6
Reviewed-on: http://gerrit.cloudera.org:8080/10943
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Ensure no memory limit exceeded regressions from IMPALA-4835 because of 
> non-reserved memory
> ---
>
> Key: IMPALA-7096
> URL: https://issues.apache.org/jira/browse/IMPALA-7096
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: resource-management
> Fix For: Impala 3.1.0
>
> Attachments: ScanConsumingMostMemory.txt
>
>
> IMPALA-7078 showed some cases where non-buffer memory could accumulate in the 
> row batch queue and cause memory consumption problems.
> The decision for whether to spin up a scanner thread in IMPALA-4835 
> implicitly assumes that buffer memory is the bulk of memory consumed by a 
> scan, but there may be cases where that is not true and the previous 
> heuristic would be more conservative about starting a scanner thread.
> We should investigate this further and figure out how to avoid it if there's 
> an issue.






[jira] [Commented] (IMPALA-8049) Impala Doc: Document Apache Ranger authorization provider

2019-05-29 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851396#comment-16851396
 ] 

ASF subversion and git services commented on IMPALA-8049:
-

Commit a7b8c1e9574afba4385d4518713e412bdeaedb8c in impala's branch 
refs/heads/master from Alex Rodoni
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=a7b8c1e ]

IMPALA-8049: [DOCS] Ranger authz support in impala

Change-Id: I4858bc49c1ed6d5e65ddbaebc96e56427446bad6
Reviewed-on: http://gerrit.cloudera.org:8080/13368
Reviewed-by: Fredy Wijaya 
Tested-by: Impala Public Jenkins 


> Impala Doc: Document Apache Ranger authorization provider
> -
>
> Key: IMPALA-8049
> URL: https://issues.apache.org/jira/browse/IMPALA-8049
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Critical
>  Labels: future_release_doc, in_33
>
> https://gerrit.cloudera.org/#/c/13368/






[jira] [Commented] (IMPALA-8447) Impala Doc: Document the feature to detect insert events from Impala

2019-05-29 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851395#comment-16851395
 ] 

ASF subversion and git services commented on IMPALA-8447:
-

Commit 879357c8a2924a22ca695c7ad609c11d48bc717e in impala's branch 
refs/heads/master from Alex Rodoni
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=879357c ]

IMPALA-8447: [DOCS] INSERT event is supported in automatic invalidation

- Added the INSERT events to the supported events.
- Noted the limitation with inserts from SparkSQL.

Change-Id: I68133b0beeb15cacc73829b8a8b0838fc7f4b7d8
Reviewed-on: http://gerrit.cloudera.org:8080/13300
Tested-by: Impala Public Jenkins 
Reviewed-by: Vihang Karajgaonkar 


> Impala Doc: Document the feature to detect insert events from Impala
> 
>
> Key: IMPALA-8447
> URL: https://issues.apache.org/jira/browse/IMPALA-8447
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc, in_33
>
> https://gerrit.cloudera.org/#/c/13300/






[jira] [Updated] (IMPALA-8489) TestRecoverPartitions.test_post_invalidate fails with IllegalStateException when HMS polling is enabled

2019-05-29 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated IMPALA-8489:

Summary: TestRecoverPartitions.test_post_invalidate fails with 
IllegalStateException when HMS polling is enabled  (was: 
TestRecoverPartitions.test_post_invalidate fails with IllegalStateException 
with local catalog)

I can reproduce this by just enabling polling (and not LocalCatalog). Updated 
the title appropriately.

The issue seems to be in CatalogOpExecutor.updateCatalog handling of partitions 
that were touched by an insert. It comes up with a list of partition IDs that 
were modified by the insert, then calls loadTableMetadata() which refreshes 
those partitions. Because the partition was added by ALTER TABLE RECOVER 
PARTITIONS, it got marked as "dirty" which means that the refresh ends up 
dropping and reloading it with a new partition ID. Then, createInsertEvents 
looks for the partitions by ID, but they've since been assigned new IDs, so 
they aren't found.
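
The failure mode described above can be reproduced in a toy model. This is only an illustrative sketch, not Impala's catalog code: `ToyCatalogTable`, `add_partition`, and `refresh` are hypothetical names, and the `remap` return value is an assumption added here to show what information a caller would need to follow a partition across a refresh.

```python
import itertools


class ToyCatalogTable:
    """Toy model of a table whose partitions are addressed by numeric ID."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.partitions = {}  # partition id -> partition name
        self.dirty = set()    # ids flagged for drop-and-reload

    def add_partition(self, name, dirty=False):
        pid = next(self._next_id)
        self.partitions[pid] = name
        if dirty:
            self.dirty.add(pid)
        return pid

    def refresh(self, ids):
        """Reload the given partitions; dirty ones come back under a new ID."""
        remap = {}
        for pid in ids:
            if pid in self.dirty:
                name = self.partitions.pop(pid)
                self.dirty.discard(pid)
                remap[pid] = self.add_partition(name)
            else:
                remap[pid] = pid
        return remap


table = ToyCatalogTable()
# A partition that was marked dirty, e.g. after being recovered.
touched = [table.add_partition("i=2/p=p2", dirty=True)]
remap = table.refresh(touched)

# Looking up by the IDs collected before the refresh finds nothing...
stale_hits = [pid for pid in touched if pid in table.partitions]
# ...while following the old-to-new mapping finds the reloaded partition.
fresh_hits = [remap[pid] for pid in touched if remap[pid] in table.partitions]
```

In this model the lookup by pre-refresh ID comes back empty, which mirrors the "no such partition id" error in the quoted test output.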

I'm digging into this further to understand why it affects this code path but 
not others that also use the "dirty partition" hack.

> TestRecoverPartitions.test_post_invalidate fails with IllegalStateException 
> when HMS polling is enabled
> ---
>
> Key: IMPALA-8489
> URL: https://issues.apache.org/jira/browse/IMPALA-8489
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Todd Lipcon
>Priority: Critical
>
> {noformat}
> metadata/test_recover_partitions.py:279: in test_post_invalidate
> "INSERT INTO TABLE %s PARTITION(i=002, p='p2') VALUES(4)" % FQ_TBL_NAME)
> common/impala_test_suite.py:620: in wrapper
> return function(*args, **kwargs)
> common/impala_test_suite.py:628: in execute_query_expect_success
> result = cls.__execute_query(impalad_client, query, query_options, user)
> common/impala_test_suite.py:722: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:180: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:187: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:364: in __execute_query
> self.wait_for_finished(handle)
> beeswax/impala_beeswax.py:385: in wait_for_finished
> raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> E   Query aborted:IllegalArgumentException: no such partition id 6244
> {noformat}
> The failure is reproducible for me locally with catalog v2.






[jira] [Commented] (IMPALA-8577) Crash during OpenSSLSocket.read

2019-05-29 Thread Sahil Takiar (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851434#comment-16851434
 ] 

Sahil Takiar commented on IMPALA-8577:
--

Hmm, it's suspicious that this hasn't happened for ABFS, which makes me think 
there may be a bug in the AWS SDK that is causing this.

However, [https://github.com/wildfly/wildfly-openssl/issues/36] looks pretty 
suspicious (it reports a double-free error under high concurrency) and it was 
fixed in [https://github.com/wildfly/wildfly-openssl/pull/38] which has not 
made its way into a release yet. Going to try running against wildfly-openssl 
master and see what happens.

I'm also trying to confirm whether this is concurrency-related (by varying the 
value of {{fs.s3a.connection.maximum}}).

> Crash during OpenSSLSocket.read
> ---
>
> Key: IMPALA-8577
> URL: https://issues.apache.org/jira/browse/IMPALA-8577
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.3.0
>Reporter: David Rorke
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: 5ca78771-ad78-4a29-31f88aa6-9bfac38c.dmp, 
> hs_err_pid6313.log, 
> impalad.drorke-impala-r5d2xl2-30w-17.vpc.cloudera.com.impala.log.ERROR.20190521-103105.6313,
>  
> impalad.drorke-impala-r5d2xl2-30w-17.vpc.cloudera.com.impala.log.INFO.20190521-103105.6313
>
>
> Impalad crashed while running a TPC-DS 10 TB run against S3.   Excerpt from 
> the stack trace (hs_err log file attached with more complete stack):
> {noformat}
> Stack: [0x7f3d095bc000,0x7f3d09dbc000],  sp=0x7f3d09db9050,  free 
> space=8180k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
> code)
> C  [impalad+0x2528a33]  
> tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*,
>  unsigned long, int)+0x133
> C  [impalad+0x2528e0f]  tcmalloc::ThreadCache::Scavenge()+0x3f
> C  [impalad+0x266468a]  operator delete(void*)+0x32a
> C  [libcrypto.so.10+0x6e70d]  CRYPTO_free+0x1d
> J 5709  org.wildfly.openssl.SSLImpl.freeBIO0(J)V (0 bytes) @ 
> 0x7f3d4dadf9f9 [0x7f3d4dadf940+0xb9]
> J 5708 C1 org.wildfly.openssl.SSLImpl.freeBIO(J)V (5 bytes) @ 
> 0x7f3d4dfd0dfc [0x7f3d4dfd0d80+0x7c]
> J 5158 C1 org.wildfly.openssl.OpenSSLEngine.shutdown()V (78 bytes) @ 
> 0x7f3d4de4fe2c [0x7f3d4de4f720+0x70c]
> J 5758 C1 org.wildfly.openssl.OpenSSLEngine.closeInbound()V (51 bytes) @ 
> 0x7f3d4de419cc [0x7f3d4de417c0+0x20c]
> J 2994 C2 
> org.wildfly.openssl.OpenSSLEngine.unwrap(Ljava/nio/ByteBuffer;[Ljava/nio/ByteBuffer;II)Ljavax/net/ssl/SSLEngineResult;
>  (892 bytes) @ 0x7f3d4db8da34 [0x7f3d4db8c900+0x1134]
> J 3161 C2 org.wildfly.openssl.OpenSSLSocket.read([BII)I (810 bytes) @ 
> 0x7f3d4dd64cb0 [0x7f3d4dd646c0+0x5f0]
> J 5090 C2 
> com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.fillBuffer()I
>  (97 bytes) @ 0x7f3d4ddd9ee0 [0x7f3d4ddd9e40+0xa0]
> J 5846 C1 
> com.amazonaws.thirdparty.apache.http.impl.BHttpConnectionBase.fillInputBuffer(I)I
>  (48 bytes) @ 0x7f3d4d7acb24 [0x7f3d4d7ac7a0+0x384]
> J 5845 C1 
> com.amazonaws.thirdparty.apache.http.impl.BHttpConnectionBase.isStale()Z (31 
> bytes) @ 0x7f3d4d7ad49c [0x7f3d4d7ad220+0x27c]
> {noformat}
> The crash may not be easy to reproduce.  I've run this test multiple times 
> and only crashed once.   I have a core file if needed.






[jira] [Updated] (IMPALA-8571) Make query-hook-execution more robust and observable

2019-05-29 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radford nguyen updated IMPALA-8571:
---
Description: 
The execution of {{QueryEventHook}}s currently has some drawbacks in the design:
 * exceptions thrown from hooks are simply logged by the Frontend and ignored
 * hooks that hang will forever block an executor thread (currently fixed-size 
threadpool)
 * an unbounded queue is used for scheduled hook tasks, which means that 
slow/hanging hooks may cause the queue to grow and grow until an 
{{OutOfMemoryError}} is thrown
 * metrics around hook execution are not captured (which would really help in 
debugging/diagnosis)

These are all points that should be addressed/improved in this ticket

  was:Placeholder: description coming soon


> Make query-hook-execution more robust and observable
> 
>
> Key: IMPALA-8571
> URL: https://issues.apache.org/jira/browse/IMPALA-8571
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: radford nguyen
>Assignee: radford nguyen
>Priority: Major
>
> The execution of {{QueryEventHook}}s currently has some drawbacks in the 
> design:
>  * exceptions thrown from hooks are simply logged by the Frontend and ignored
>  * hooks that hang will forever block an executor thread (currently 
> fixed-size threadpool)
>  * an unbounded queue is used for scheduled hook tasks, which means that 
> slow/hanging hooks may cause the queue to grow and grow until an 
> {{OutOfMemoryError}} is thrown
>  * metrics around hook execution are not captured (which would really help in 
> debugging/diagnosis)
> These are all points that should be addressed/improved in this ticket
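
The four drawbacks listed in the description suggest a runner that bounds pending work, applies a timeout, surfaces failures, and counts outcomes. The following is a minimal sketch under stated assumptions, not Impala's implementation: `HookRunner`, its parameters, and the metric names are all hypothetical, and it uses Python's `concurrent.futures` rather than the Java executor the Frontend actually uses.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


class HookRunner:
    """Hypothetical runner: bounded pending work, per-hook timeout, metrics."""

    def __init__(self, workers=2, max_pending=4, timeout_s=0.1):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        # One slot per running or queued hook; extra submissions are rejected
        # instead of growing an unbounded queue.
        self._slots = threading.BoundedSemaphore(workers + max_pending)
        self._timeout_s = timeout_s
        self._lock = threading.Lock()
        self.metrics = {"ok": 0, "failed": 0, "timed_out": 0, "rejected": 0}

    def _bump(self, key):
        with self._lock:
            self.metrics[key] += 1

    def run(self, hook, *args):
        """Run one hook and return its outcome as a string."""
        if not self._slots.acquire(blocking=False):
            self._bump("rejected")
            return "rejected"

        def wrapped():
            try:
                return hook(*args)
            finally:
                self._slots.release()  # free the slot when the hook returns

        future = self._pool.submit(wrapped)
        try:
            future.result(timeout=self._timeout_s)
        except FutureTimeout:
            # The caller stops waiting, but Python cannot interrupt the worker
            # thread, so the slot stays taken until the hook itself returns.
            self._bump("timed_out")
            return "timed_out"
        except Exception:
            self._bump("failed")
            return "failed"
        self._bump("ok")
        return "ok"


def failing_hook():
    raise ValueError("hook failed")


runner = HookRunner(workers=1, max_pending=0, timeout_s=0.05)
outcomes = [runner.run(lambda: None), runner.run(failing_hook)]
```

Rejecting new work when all slots are taken is one possible policy; blocking the submitter or dropping the oldest task would be alternatives with different trade-offs.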






[jira] [Updated] (IMPALA-8572) Move query hook execution to before query unregistration

2019-05-29 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radford nguyen updated IMPALA-8572:
---
Description: The backend currently executes query event hooks   (was: 
Placeholder: description coming soon)

> Move query hook execution to before query unregistration
> 
>
> Key: IMPALA-8572
> URL: https://issues.apache.org/jira/browse/IMPALA-8572
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: radford nguyen
>Priority: Major
>
> The backend currently executes query event hooks 






[jira] [Updated] (IMPALA-8572) Move query hook execution to before query unregistration

2019-05-29 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radford nguyen updated IMPALA-8572:
---
Description: 
The backend currently executes query event hooks during 
{{ImpalaServer::UnregisterQuery}}, which may happen only long after the query 
has finished executing. Unregistration depends on the client closing the 
query/session, the client's connection dropping, or an idle session timing 
out.

For example, the following sequence is possible:
 # User executes query from Hue.
 # User goes home for weekend, leaving Hue tab open in browser
 # If we're lucky, the session timeout expires after some amount of idle time.
 # The query gets unregistered, hooks get executed

It would generally be desirable to move the lineage logger earlier in the query 
lifecycle, so it occurs as soon as all of the required data is available.

  was:The backend currently executes query event hooks 


> Move query hook execution to before query unregistration
> 
>
> Key: IMPALA-8572
> URL: https://issues.apache.org/jira/browse/IMPALA-8572
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: radford nguyen
>Priority: Major
>
> The backend currently executes query event hooks during 
> {{ImpalaServer::UnregisterQuery}}, which may happen only long after the 
> query has finished executing. Unregistration depends on the client closing 
> the query/session, the client's connection dropping, or an idle session 
> timing out.
> For example, the following sequence is possible:
>  # User executes query from Hue.
>  # User goes home for weekend, leaving Hue tab open in browser
>  # If we're lucky, the session timeout expires after some amount of idle time.
>  # The query gets unregistered, hooks get executed
> It would generally be desirable to move the lineage logger earlier in the 
> query lifecycle, so it occurs as soon as all of the required data is 
> available.
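
The lifecycle change proposed above can be sketched as firing hooks once when the query finishes, with unregistration kept only as a fallback. This is a toy model under assumptions: `ToyQuery`, `finish`, and `unregister` are hypothetical names and do not correspond to the real `ImpalaServer` code paths.

```python
class ToyQuery:
    """Toy query that fires its hooks at most once."""

    def __init__(self, hooks):
        self._hooks = hooks
        self._hooks_fired = False
        self.state = "RUNNING"

    def _fire_hooks_once(self):
        if self._hooks_fired:
            return
        self._hooks_fired = True
        for hook in self._hooks:
            hook(self)

    def finish(self):
        """Called as soon as the data the hooks need is available."""
        self.state = "FINISHED"
        self._fire_hooks_once()

    def unregister(self):
        """May run much later; only fires hooks if finish() never did."""
        self._fire_hooks_once()
        self.state = "UNREGISTERED"


seen = []
query = ToyQuery([lambda q: seen.append(q.state)])
query.finish()      # hooks observe the query right away
query.unregister()  # e.g. after an idle session timeout; no second firing
```

The guard flag keeps the hook from running twice when both events happen, which is the common case.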






[jira] [Resolved] (IMPALA-8573) Implement timeout for query hook execution

2019-05-29 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radford nguyen resolved IMPALA-8573.

Resolution: Duplicate

> Implement timeout for query hook execution
> --
>
> Key: IMPALA-8573
> URL: https://issues.apache.org/jira/browse/IMPALA-8573
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: radford nguyen
>Priority: Major
>
> Placeholder: description coming soon






[jira] [Updated] (IMPALA-8576) Pass lineage object instead of string to query hook

2019-05-29 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radford nguyen updated IMPALA-8576:
---
Description: 
The {{QueryEventHook}} interface currently takes a {{String}} for the 
{{onQueryComplete}} hook.  This string is the JSON representation of the 
lineage graph written to the legacy lineage file.

It would be better to pass the serialized {{byte[]}} of the lineage thrift 
object itself, so that we can decouple ourselves from any lineage file 
format(s).

Additionally, hook implementations should use their own version of Thrift to 
deserialize the object so that they are not tied to Impala's Thrift version.

  was:Placeholder: description coming soon


> Pass lineage object instead of string to query hook
> ---
>
> Key: IMPALA-8576
> URL: https://issues.apache.org/jira/browse/IMPALA-8576
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend, Frontend
>Reporter: radford nguyen
>Priority: Major
>
> The {{QueryEventHook}} interface currently takes a {{String}} for the 
> {{onQueryComplete}} hook.  This string is the JSON representation of the 
> lineage graph written to the legacy lineage file.
> It would be better to pass the serialized {{byte[]}} of the lineage thrift 
> object itself, so that we can decouple ourselves from any lineage file 
> format(s).
> Additionally, hook implementations should use their own version of Thrift to 
> deserialize the object so that they are not tied to Impala's Thrift version.
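
To illustrate the decoupling described above, the sketch below hands a hook an opaque byte payload that the hook decodes with its own codec. Everything here is an assumption for illustration: the length-prefixed encoding is a stand-in for Thrift serialization, and `encode_edges`, `decode_edges`, and `RecordingHook` are hypothetical names, not part of the `QueryEventHook` interface.

```python
import struct


def _put_str(out: bytearray, text: str) -> None:
    data = text.encode("utf-8")
    out += struct.pack(">I", len(data)) + data


def _get_str(buf: bytes, pos: int):
    (length,) = struct.unpack_from(">I", buf, pos)
    start = pos + 4
    return buf[start:start + length].decode("utf-8"), start + length


def encode_edges(edges) -> bytes:
    """Producer side: stand-in for serializing the lineage object."""
    out = bytearray(struct.pack(">I", len(edges)))
    for src, dst in edges:
        _put_str(out, src)
        _put_str(out, dst)
    return bytes(out)


def decode_edges(payload: bytes):
    """Consumer side: the hook owns its own copy of the decoder."""
    (count,) = struct.unpack_from(">I", payload, 0)
    pos, edges = 4, []
    for _ in range(count):
        src, pos = _get_str(payload, pos)
        dst, pos = _get_str(payload, pos)
        edges.append((src, dst))
    return edges


class RecordingHook:
    """Hypothetical hook that receives bytes rather than a JSON string."""

    def __init__(self):
        self.seen = []

    def on_query_complete(self, lineage_bytes: bytes) -> None:
        self.seen.append(decode_edges(lineage_bytes))


hook = RecordingHook()
hook.on_query_complete(encode_edges([("db.t.a", "out.a"), ("db.t.b", "out.b")]))
```

Because the hook only sees bytes, the producer can change or retire its file format without the hook interface changing.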






[jira] [Commented] (IMPALA-8594) Support drop table for external kudu tables that are dropped in kudu

2019-05-29 Thread Manish Maheshwari (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851510#comment-16851510
 ] 

Manish Maheshwari commented on IMPALA-8594:
---

Yes, it's with LocalCatalog enabled. I am closing this one as a duplicate of 
IMPALA-8459.

> Support drop table for external kudu tables that are dropped in kudu
> 
>
> Key: IMPALA-8594
> URL: https://issues.apache.org/jira/browse/IMPALA-8594
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: Manish Maheshwari
>Priority: Critical
>
> External kudu tables in Impala cannot be dropped from HMS if the kudu table 
> is already dropped in kudu. This causes HMS to be out of sync with kudu 
> metadata.
> Impala should clean up HMS table info when a drop is executed for an external 
> table that does not exist in kudu
>  
> cc - [~balazsj_impala_220b] [~tlipcon]






[jira] [Closed] (IMPALA-8594) Support drop table for external kudu tables that are dropped in kudu

2019-05-29 Thread Manish Maheshwari (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Maheshwari closed IMPALA-8594.
-
Resolution: Duplicate

closed

> Support drop table for external kudu tables that are dropped in kudu
> 
>
> Key: IMPALA-8594
> URL: https://issues.apache.org/jira/browse/IMPALA-8594
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: Manish Maheshwari
>Priority: Critical
>
> External kudu tables in Impala cannot be dropped from HMS if the kudu table 
> is already dropped in kudu. This causes HMS to be out of sync with kudu 
> metadata.
> Impala should clean up HMS table info when a drop is executed for an external 
> table that does not exist in kudu
>  
> cc - [~balazsj_impala_220b] [~tlipcon]






[jira] [Commented] (IMPALA-8459) Cannot delete impala/kudu table if backing kudu table dropped with local catalog

2019-05-29 Thread Manish Maheshwari (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851514#comment-16851514
 ] 

Manish Maheshwari commented on IMPALA-8459:
---

Is there a temporary workaround for this other than disabling local catalog?

> Cannot delete impala/kudu table if backing kudu table dropped with local 
> catalog
> 
>
> Key: IMPALA-8459
> URL: https://issues.apache.org/jira/browse/IMPALA-8459
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Todd Lipcon
>Priority: Critical
>  Labels: kudu
>
> test_delete_external_kudu_table and test_delete_managed_kudu_table fail with 
> local catalog, e.g. with:
> {noformat}
> E   HiveServer2Error: LocalCatalogException: Error opening Kudu table 
> 'testimpalakuduintegration_1715_p3r46w.ogslbjblgv', Kudu error: the table 
> does not exist: table_name: "testimpalakuduintegration_1715_p3r46w.ogslbjblgv"
> {noformat}






[jira] [Created] (IMPALA-8600) Reload partition does not work for transactional tables

2019-05-29 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created IMPALA-8600:
---

 Summary: Reload partition does not work for transactional tables
 Key: IMPALA-8600
 URL: https://issues.apache.org/jira/browse/IMPALA-8600
 Project: IMPALA
  Issue Type: Bug
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


If a table is transactional, a reload partition call should fetch the valid 
writeIds. Without doing this, the reload will skip adding all the newly created 
delta files of the transactional table pertaining to the new writeIds.
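
A simplified sketch of why stale writeIds hide new files: if the reload filters delta directories against a cached high watermark, anything written under a later writeId is skipped. This assumes the Hive ACID directory naming `delta_<minWriteId>_<maxWriteId>[_<statementId>]` and deliberately ignores open and aborted writeIds and base directories; `visible_deltas` is a hypothetical helper, not an Impala or Hive function.

```python
import re

# Assumed Hive ACID naming: delta_<minWriteId>_<maxWriteId>[_<statementId>]
_DELTA = re.compile(r"^delta_(\d+)_(\d+)(?:_\d+)?$")


def visible_deltas(dir_names, high_watermark):
    """Return delta directories whose max writeId is within the watermark."""
    visible = []
    for name in dir_names:
        match = _DELTA.match(name)
        if match and int(match.group(2)) <= high_watermark:
            visible.append(name)
    return visible


dirs = [
    "delta_0000001_0000001_0000",
    "delta_0000002_0000002_0000",
    "delta_0000003_0000003_0000",
]

# Reload using the watermark cached when the table was first loaded.
with_stale_ids = visible_deltas(dirs, high_watermark=1)
# Reload after fetching the current valid writeIds.
with_fresh_ids = visible_deltas(dirs, high_watermark=3)
```

With the stale watermark only the first delta directory is picked up; after refreshing the watermark all three are.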






[jira] [Work started] (IMPALA-8600) Reload partition does not work for transactional tables

2019-05-29 Thread Vihang Karajgaonkar (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8600 started by Vihang Karajgaonkar.
---
> Reload partition does not work for transactional tables
> ---
>
> Key: IMPALA-8600
> URL: https://issues.apache.org/jira/browse/IMPALA-8600
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
>
> If a table is transactional, a reload partition call should fetch the valid 
> writeIds. Without doing this, the reload will skip adding all the newly 
> created delta files of the transactional table pertaining to the new writeIds.






[jira] [Work started] (IMPALA-8506) Support RenameTable DDL with Kudu/HMS integration in Catalogd mode

2019-05-29 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8506 started by Hao Hao.
---
> Support RenameTable DDL with Kudu/HMS integration in Catalogd mode
> --
>
> Key: IMPALA-8506
> URL: https://issues.apache.org/jira/browse/IMPALA-8506
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Hao Hao
>Assignee: Hao Hao
>Priority: Major
>
> Similar to IMPALA-8504, but for RenameTable DDL.






[jira] [Work started] (IMPALA-8505) Support AlterTable DDL with Kudu/HMS integration in Catalogd mode

2019-05-29 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8505 started by Hao Hao.
---
> Support AlterTable DDL with Kudu/HMS integration in Catalogd mode
> -
>
> Key: IMPALA-8505
> URL: https://issues.apache.org/jira/browse/IMPALA-8505
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Hao Hao
>Assignee: Hao Hao
>Priority: Major
>
> Similar to IMPALA-8504, but for AlterTable DDL.






[jira] [Work started] (IMPALA-8507) Support DropTable DDL with Kudu/HMS integration in Catalogd mode

2019-05-29 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8507 started by Hao Hao.
---
> Support DropTable DDL with Kudu/HMS integration in Catalogd mode
> 
>
> Key: IMPALA-8507
> URL: https://issues.apache.org/jira/browse/IMPALA-8507
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Hao Hao
>Assignee: Hao Hao
>Priority: Major
>
> Similar to IMPALA-8504, but for DropTable DDL.


