[jira] [Resolved] (IMPALA-5816) ssl-related custom cluster tests failing during setup on exhaustive RHEL7
[ https://issues.apache.org/jira/browse/IMPALA-5816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5816. Resolution: Fixed Fix Version/s: Impala 2.11.0 https://github.com/apache/incubator-impala/commit/c163ac1468e4d878c3516ec933c69fb66851af01 > ssl-related custom cluster tests failing during setup on exhaustive RHEL7 > - > > Key: IMPALA-5816 > URL: https://issues.apache.org/jira/browse/IMPALA-5816 > Project: IMPALA > Issue Type: Bug > Components: Security >Affects Versions: Impala 2.10.0 >Reporter: David Knupp >Assignee: Henry Robinson >Priority: Critical > Fix For: Impala 2.11.0 > > > Tests that were seen to fail include: > * TestClientSsl.test_tls_v12 > * TestClientSsl.test_wildcard_ssl > * TestClientSsl.test_wildcard_san_ssl > Example stack trace: > {noformat} > self = > method = > > def setup_method(self, method): > cluster_args = list() > for arg in [IMPALAD_ARGS, STATESTORED_ARGS, CATALOGD_ARGS]: > if arg in method.func_dict: > cluster_args.append("--%s=\"%s\" " % (arg, method.func_dict[arg])) > # Start a clean new cluster before each test > > self._start_impala_cluster(cluster_args) > common/custom_cluster_test_suite.py:103: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > common/custom_cluster_test_suite.py:129: in _start_impala_cluster > check_call(cmd + options, close_fds=True) > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > popenargs = > (['/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/bin/start-impala-cluster.py', > > '--cluster_si...cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcardCA.pem > --ssl_cipher_list=AES128-GCM-SHA256 " ', ...],) > kwargs = {'close_fds': True}, retcode = 1 > cmd = > ['/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/bin/start-impala-cluster.py', > > '--cluster_siz...a-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcardCA.pem > 
--ssl_cipher_list=AES128-GCM-SHA256 " ', ...] > def check_call(*popenargs, **kwargs): > """Run command with arguments. Wait for command to complete. If > the exit code was zero then return, otherwise raise > CalledProcessError. The CalledProcessError object will have the > return code in the returncode attribute. > > The arguments are the same as for the Popen constructor. Example: > > check_call(["ls", "-l"]) > """ > retcode = call(*popenargs, **kwargs) > if retcode: > cmd = kwargs.get("args") > if cmd is None: > cmd = popenargs[0] > > raise CalledProcessError(retcode, cmd) > E CalledProcessError: Command > '['/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/bin/start-impala-cluster.py', > '--cluster_size=3', '--num_coordinators=3', > '--log_dir=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/logs/custom_cluster_tests', > '--log_level=1', > '--impalad_args="--ssl_server_certificate=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcard-cert.pem > > --ssl_private_key=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcard-cert.key > --ssl_minimum_version=tlsv1.2 > --ssl_client_ca_certificate=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcardCA.pem > --ssl_cipher_list=AES128-GCM-SHA256 " ', > '--state_store_args="--ssl_server_certificate=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcard-cert.pem > > --ssl_private_key=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcard-cert.key > --ssl_minimum_version=tlsv1.2 > --ssl_client_ca_certificate=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcardCA.pem > --ssl_cipher_list=AES128-GCM-SHA256 " ', > 
'--catalogd_args="--ssl_server_certificate=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcard-cert.pem > > --ssl_private_key=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcard-cert.key > --ssl_minimum_version=tlsv1.2 > --ssl_client_ca_certificate=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcardCA.pem > --ssl_cipher_list=AES128-GCM-SHA256 " ']' returned non-zero exit status 1 > {noformat} > Standard error output: > {noformat} > MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) > Mai
[jira] [Created] (IMPALA-5887) Hung union query
Henry Robinson created IMPALA-5887: -- Summary: Hung union query Key: IMPALA-5887 URL: https://issues.apache.org/jira/browse/IMPALA-5887 Project: IMPALA Issue Type: Bug Affects Versions: Impala 2.10.0 Reporter: Henry Robinson Priority: Critical During an exhaustive test run on CentOS 7.0, I noticed the following query hung for 2.5 hours: {code} select count(c) from ( select bigint_col + 1 as c from functional.alltypes limit 15 union all select bigint_col as c from functional.alltypes limit 15 union all select bigint_col + 1 as c from functional.alltypes limit 15 union all (select bigint_col as c from functional.alltypes limit 15)) t {code} There was one fragment instance still running which was waiting on some hdfs scanner threads to complete. Unfortunately taking the {{pstack}} caused the threads to unblock themselves, and the query completed. {code} Thread 1 (process 18704): #0 0x7f47592a9705 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x013648e5 in boost::condition_variable::wait (this=0x1b303c858, m=...) 
at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/thread/pthread/condition_variable.hpp:73 #2 0x01d89d8c in boost::thread::join_noexcept() () #3 0x015148ab in boost::thread::join (this=0x181d458f0) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/thread/detail/thread.hpp:767 #4 0x01514f3e in impala::Thread::Join (this=0x1a983220) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/util/thread.h:106 #5 0x017fa67e in impala::ThreadGroup::JoinAll (this=0x181d856c8) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/util/thread.cc:338 #6 0x0189be87 in impala::HdfsScanNode::Close (this=0x181d85000, state=0xbf38800) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-scan-node.cc:236 #7 0x01a515e1 in impala::UnionNode::GetNextMaterialized (this=0x171f5680, state=0xbf38800, row_batch=0x7f469d2561f0) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/exec/union-node.cc:242 #8 0x01a5272b in impala::UnionNode::GetNext (this=0x171f5680, state=0xbf38800, row_batch=0x7f469d2561f0, eos=0x7f469d25639f) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/exec/union-node.cc:297 #9 0x019e4f09 in impala::PartitionedAggregationNode::Open (this=0x25251900, state=0xbf38800) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/exec/partitioned-aggregation-node.cc:302 #10 0x015f74ab in impala::FragmentInstanceState::Open (this=0x150b3480) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/runtime/fragment-instance-state.cc:256 #11 0x015f4fd7 in impala::FragmentInstanceState::Exec (this=0x150b3480) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/runtime/fragment-instance-state.cc:80 #12 0x015bb5d2 in 
impala::QueryState::ExecFInstance (this=0x16cad600, fis=0x150b3480) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/runtime/query-state.cc:351 #13 0x015ba102 in impala::QueryStateoperator()(void) const (__closure=0x7f469d256c28) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/runtime/query-state.cc:319 #14 0x015bc283 in boost::detail::function::void_function_obj_invoker0, void>::invoke(boost::detail::function::function_buffer &) (function_obj_ptr=...) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153 #15 0x0152d542 in boost::function0::operator() (this=0x7f469d256c20) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:767 #16 0x017fa4eb in impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function, impala::Promise*) (name=..., category=..., functor=..., thread_started=0x7f468003ec40) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/util/thread.cc:329 #17 0x01802e26 in boost::_bi::list4, boost::_bi::value, boost::_bi::value >, boost::_bi::value*> >::operator(), impala::Promise*), boost::_bi::list0>(boost::_bi::type, void (*&)(std::string const&, std::string const&, boost::function, impala::Promise*), boost::_bi::list0&, int) (this=0x13f7cdfc0, f=@0x13f7cdfb8: 0x17fa1cc , impala::Promise*)>, a=...) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boo
[jira] [Resolved] (IMPALA-5846) Kudu libraries are written to be/src/.., not be/build/...
[ https://issues.apache.org/jira/browse/IMPALA-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5846. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/f20b1626b8bdf2a87e089cb18f82cd80a7cc981c > Kudu libraries are written to be/src/.., not be/build/... > - > > Key: IMPALA-5846 > URL: https://issues.apache.org/jira/browse/IMPALA-5846 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.10.0 >Reporter: Henry Robinson >Assignee: Henry Robinson > Fix For: Impala 2.10.0 > > > Any library built using {{ADD_EXPORTABLE_LIBRARY}} puts the library or > archive in the source directory it's built from, not in > {{be/build//...}}. This isn't great for isolating building different > targets, nor is it consistent with the rest of the build. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-4669) Add Kudu's RPC, util and security libraries
[ https://issues.apache.org/jira/browse/IMPALA-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-4669. Resolution: Fixed Fix Version/s: Impala 2.10.0 Finally, here's the RPC library: https://github.com/apache/incubator-impala/commit/c7db60aa46565c19634e8a791df3af8d116b9017 https://github.com/apache/incubator-impala/commit/113526198051f6810c84df513d507074856f5e4c > Add Kudu's RPC, util and security libraries > --- > > Key: IMPALA-4669 > URL: https://issues.apache.org/jira/browse/IMPALA-4669 > Project: IMPALA > Issue Type: Sub-task > Components: Distributed Exec >Affects Versions: Impala 2.8.0 >Reporter: Henry Robinson >Assignee: Henry Robinson > Fix For: Impala 2.10.0 > > > To enable KRPC in Impala, we need to link against Kudu's {{rpc}}, > {{security}} and {{util}} libraries. The easiest way for now is to pull them > into trunk. > Doing this also requires upgrading our {{gutil}} version. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5849) Don't disable TLS configuration at compile-time even with OpenSSL 1.0.0
Henry Robinson created IMPALA-5849: -- Summary: Don't disable TLS configuration at compile-time even with OpenSSL 1.0.0 Key: IMPALA-5849 URL: https://issues.apache.org/jira/browse/IMPALA-5849 Project: IMPALA Issue Type: Improvement Components: Backend Affects Versions: Impala 2.10.0 Reporter: Henry Robinson Assignee: Henry Robinson IMPALA-5800, IMPALA-5775 and IMPALA-5743 added TLS configuration to Impala and Squeasel. Since Impala is often built against different versions of OpenSSL (with different TLS capabilities), we used compile-time definitions to avoid using symbols from OpenSSL 1.0.1 that weren't available. This works great if we can ensure that the machine on which Impala is built is the same environment as the one on which it executes, but we have discovered that the installed version of OpenSSL can vary between minor releases of Linux distributions. It appears possible to write the support for TLS1.1+ in terms of symbols that are available in OpenSSL 1.0.0 only. The only downside is that Impala can't then tell whether or not the runtime supports TLS 1.2, and so the error messages won't be quite as clear. However, the benefit of a single binary and Thrift toolchain dependency for all supported versions of OpenSSL is well worth it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5846) Kudu libraries are written to be/src/.., not be/build/...
Henry Robinson created IMPALA-5846: -- Summary: Kudu libraries are written to be/src/.., not be/build/... Key: IMPALA-5846 URL: https://issues.apache.org/jira/browse/IMPALA-5846 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 2.10.0 Reporter: Henry Robinson Assignee: Henry Robinson Any library built using {{ADD_EXPORTABLE_LIBRARY}} puts the library or archive in the source directory it's built from, not in {{be/build//...}}. This isn't great for isolating building different targets, nor is it consistent with the rest of the build. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5825) TSSLSocket factory may throw uncaught exception
[ https://issues.apache.org/jira/browse/IMPALA-5825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5825. Resolution: Fixed Fix Version/s: Impala 2.10.0 Fixed in https://github.com/apache/incubator-impala/commit/f9b222e9229ef3830f00b0e47073d7a8880e2bfb > TSSLSocket factory may throw uncaught exception > --- > > Key: IMPALA-5825 > URL: https://issues.apache.org/jira/browse/IMPALA-5825 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0 >Reporter: Henry Robinson >Assignee: Henry Robinson > Fix For: Impala 2.10.0 > > > If using TLS, Thrift's {{TSSLSocketFactory}} constructor might throw an > exception if there was an error initializing the SSL context. > We don't currently catch that error, meaning that a misconfiguration leads to > an unexpected process death. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
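The failure mode described in IMPALA-5825 can be illustrated outside of Thrift. What follows is a minimal Python sketch, not the C++ fix that was committed: it assumes a hypothetical helper, `create_ssl_context`, built on the standard library's `ssl` module, and shows an SSL context initialization error being turned into a returned error message rather than an exception that escapes and kills the process.

```python
import ssl


def create_ssl_context(cert_path, key_path):
    """Return (context, error); exactly one of the two is None.

    Hypothetical helper for illustration only. It is not part of Impala
    or Thrift.
    """
    try:
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
        # Raises FileNotFoundError for a missing file and ssl.SSLError for
        # a malformed certificate or key. Both are subclasses of OSError.
        ctx.load_cert_chain(certfile=cert_path, keyfile=key_path)
    except OSError as e:
        return None, "SSL context initialization failed: %s" % e
    return ctx, None
```

A caller can then log the error and shut down cleanly, so a misconfigured certificate path becomes a readable startup failure instead of an unexpected process death.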
[jira] [Created] (IMPALA-5825) TSSLSocket factory may throw uncaught exception
Henry Robinson created IMPALA-5825: -- Summary: TSSLSocket factory may throw uncaught exception Key: IMPALA-5825 URL: https://issues.apache.org/jira/browse/IMPALA-5825 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 2.8.0, Impala 2.7.0, Impala 2.9.0, Impala 2.10.0 Reporter: Henry Robinson Assignee: Henry Robinson If using TLS, Thrift's {{TSSLSocketFactory}} constructor might throw an exception if there was an error initializing the SSL context. We don't currently catch that error, meaning that a misconfiguration leads to an unexpected process death. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5811) Add per-query backends page
Henry Robinson created IMPALA-5811: -- Summary: Add per-query backends page Key: IMPALA-5811 URL: https://issues.apache.org/jira/browse/IMPALA-5811 Project: IMPALA Issue Type: Improvement Components: Backend Affects Versions: Impala 2.10.0 Reporter: Henry Robinson Assignee: Henry Robinson It's useful when diagnosing hangs, etc, to see a quick overview of which backends have fragment instances that are still running, and whether they're reporting to the coordinator in a timely manner. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-4666) Remove thirdparty from search dir for toolchain deps
[ https://issues.apache.org/jira/browse/IMPALA-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-4666. Resolution: Fixed Not sure exactly what I meant here, but I think it's been fixed with the recent shared linking improvements. > Remove thirdparty from search dir for toolchain deps > > > Key: IMPALA-4666 > URL: https://issues.apache.org/jira/browse/IMPALA-4666 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 2.8.0 >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Minor > > Lots of the {{Find*.cmake}} modules look in {{thirdparty/}} still, and > shouldn't. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5800) Configure Squeasel's TLS version / ciphers
[ https://issues.apache.org/jira/browse/IMPALA-5800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5800. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/51ec60713980bd6e64e626f4476e843c49f5ea48 > Configure Squeasel's TLS version / ciphers > -- > > Key: IMPALA-5800 > URL: https://issues.apache.org/jira/browse/IMPALA-5800 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.10.0 >Reporter: Henry Robinson >Assignee: Henry Robinson > Fix For: Impala 2.10.0 > > > Squeasel will be getting TLS cipher suite and version configuration after > this [pull request|https://github.com/cloudera/squeasel/pull/6] is merged. > We should import that change, then plumb through the relevant configuration > options to the Squeasel instance from Impala. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5775) Impala shell only supports TLSv1
[ https://issues.apache.org/jira/browse/IMPALA-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5775. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/e4a0e2f391bce3b8411ce7e5010856a54dc52991 > Impala shell only supports TLSv1 > > > Key: IMPALA-5775 > URL: https://issues.apache.org/jira/browse/IMPALA-5775 > Project: IMPALA > Issue Type: Bug > Components: Clients >Reporter: Henry Robinson >Assignee: Henry Robinson > Fix For: Impala 2.10.0 > > > Per https://docs.python.org/2/library/ssl.html, we have Impala shell's SSL > client configured only to connect using TLSv1. That is, if after IMPALA-5743, > it tries to connect to a TLSv1_2 server, it won't work. > We should change the client protocol to {{SSLv23}} (I think this is > acceptable for a client - the server won't negotiate an SSL connection), > which can connect to all flavours of TLS. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
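As a rough illustration of the proposed change, the sketch below builds a client-side context with `ssl.PROTOCOL_SSLv23` using Python 3's standard `ssl` module. It is an assumption-laden example, not the shell's actual code: the function name is hypothetical, certificate verification is left out, and the real Impala shell configures its socket through Thrift rather than through `SSLContext` directly. Recent Python versions deprecate `PROTOCOL_SSLv23` (it is an alias for `PROTOCOL_TLS`), so the deprecation warnings are silenced here.

```python
import ssl
import warnings


def make_negotiating_client_context():
    """Build a client context that negotiates the protocol version.

    Hypothetical helper for illustration only.
    """
    with warnings.catch_warnings():
        # PROTOCOL_SSLv23 and the OP_NO_* flags are deprecated in newer
        # Python releases but still behave as described.
        warnings.simplefilter("ignore", DeprecationWarning)
        ctx = ssl.SSLContext(ssl.PROTOCOL_SSLv23)
        # Despite its name, SSLv23 means "negotiate"; explicitly rule out
        # the legacy SSL protocols so only TLS versions remain.
        ctx.options |= ssl.OP_NO_SSLv2 | ssl.OP_NO_SSLv3
    return ctx
```

Unlike a context pinned to `PROTOCOL_TLSv1`, a context built this way lets the client and server settle on the highest TLS version both sides support.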
[jira] [Resolved] (IMPALA-5109) Increase plan fragment startup histogram max latency to > 20000ms
[ https://issues.apache.org/jira/browse/IMPALA-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5109. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/6a606ed459c173b50af7b1bd922970ac57fd17fc > Increase plan fragment startup histogram max latency to > 20000ms > - > > Key: IMPALA-5109 > URL: https://issues.apache.org/jira/browse/IMPALA-5109 > Project: IMPALA > Issue Type: Improvement > Components: Distributed Exec >Affects Versions: Impala 2.8.0 >Reporter: Henry Robinson >Assignee: Henry Robinson > Fix For: Impala 2.10.0 > > > We track plan fragment start latencies, but max out at 20s in the histogram. > We should probably set that to 30 minutes or so to capture really long RPC > delays. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5743) Allow for configuration of TLS / SSL versions
[ https://issues.apache.org/jira/browse/IMPALA-5743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5743. Resolution: Fixed Fix Version/s: Impala 2.10.0 Fixed in https://github.com/apache/incubator-impala/commit/16ce201f5250451cb55e2cb9821b5d628d777160. Note that this requires [this toolchain commit|https://github.com/cloudera/native-toolchain/commit/fc9954b4fab21d31d5c4b99b1f64545d2c70f65b] to add TLS configuration to Thrift 0.9.0. > Allow for configuration of TLS / SSL versions > - > > Key: IMPALA-5743 > URL: https://issues.apache.org/jira/browse/IMPALA-5743 > Project: IMPALA > Issue Type: Improvement > Components: Security >Affects Versions: Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0 >Reporter: Henry Robinson >Assignee: Henry Robinson > Fix For: Impala 2.10.0 > > > It would be good for users to be able, via the command line, to specify > acceptable TLS protocols. > Users will typically want to specify a minimum protocol version (i.e. TLS1.0, > 1.1 or 1.2), rather than a specific protocol version. Kudu has > {{--rpc_tls_minimum_version}}, and we can follow their lead. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
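One way to think about a minimum-version setting is as a mask of protocol versions to switch off. The Python sketch below shows that translation using the standard `ssl` module's `OP_NO_*` constants. It is illustrative only: the function name is hypothetical, and of the version strings used, only `tlsv1.2` appears in this archive (as a value of `--ssl_minimum_version`); `tlsv1` and `tlsv1.1` are assumed by analogy.

```python
import ssl

# Ordered oldest to newest. Each entry pairs an assumed flag value with
# the option bit that disables that protocol version.
_TLS_VERSIONS = [
    ("tlsv1", ssl.OP_NO_TLSv1),
    ("tlsv1.1", ssl.OP_NO_TLSv1_1),
    ("tlsv1.2", ssl.OP_NO_TLSv1_2),
]


def options_for_minimum_version(minimum):
    """Return the option mask that disables every version below `minimum`.

    Hypothetical helper for illustration only.
    """
    # The legacy SSL protocols are always disabled.
    mask = ssl.OP_NO_SSLv2 | ssl.OP_NO_SSLv3
    for name, disable_flag in _TLS_VERSIONS:
        if name == minimum:
            return mask
        # This version is older than the requested minimum: disable it.
        mask |= disable_flag
    raise ValueError("unknown TLS version: %r" % minimum)
```

Expressing the setting as a minimum rather than a single exact version means newer protocol versions stay enabled automatically when the underlying library supports them.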
[jira] [Resolved] (IMPALA-5526) Add krb5 to toolchain
[ https://issues.apache.org/jira/browse/IMPALA-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5526. Resolution: Won't Fix We eventually decided against using krb5 in the toolchain, vs making it a system-level pre-requisite. > Add krb5 to toolchain > - > > Key: IMPALA-5526 > URL: https://issues.apache.org/jira/browse/IMPALA-5526 > Project: IMPALA > Issue Type: Sub-task > Components: Backend >Affects Versions: Impala 2.10.0 >Reporter: Henry Robinson >Assignee: Henry Robinson > > KRPC adds a compile-time dependency on libkrb5's headers. To guarantee that > they're available in all build environments, we should add krb5 (from > http://web.mit.edu/kerberos/dist/index.html) to the toolchain. > libkrb5.so should be dynamically linked by default, to avoid creating a > binary that has statically linked security dependencies (this is an issue for > us at Cloudera as a vendor, but also a general antipattern). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5800) Configure Squeasel's TLS version / ciphers
Henry Robinson created IMPALA-5800: -- Summary: Configure Squeasel's TLS version / ciphers Key: IMPALA-5800 URL: https://issues.apache.org/jira/browse/IMPALA-5800 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 2.10.0 Reporter: Henry Robinson Assignee: Henry Robinson Squeasel will be getting TLS cipher suite and version configuration after this [pull request|https://github.com/cloudera/squeasel/pull/6] is merged. We should import that change, then plumb through the relevant configuration options to the Squeasel instance from Impala. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5666) Use manual poisoning for ASAN with new buffer pool
[ https://issues.apache.org/jira/browse/IMPALA-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5666. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/a99114283b371852254fe05eb24ac0e339cf777b > Use manual poisoning for ASAN with new buffer pool > -- > > Key: IMPALA-5666 > URL: https://issues.apache.org/jira/browse/IMPALA-5666 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.10.0 >Reporter: Tim Armstrong >Assignee: Henry Robinson >Priority: Minor > Fix For: Impala 2.10.0 > > > We should use > https://github.com/google/sanitizers/wiki/AddressSanitizerManualPoisoning > to catch bugs where memory buffers are accessed after they are freed. We > should do this for MemPools and the BufferPool to start off with and maybe > for DiskIoMgr buffers and FreePool allocations. > We can already catch this with --disable_mem_pools but it would be good to > have stricter checks enabled by default. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5773) Memory limit exceeded on test_spilling.py
[ https://issues.apache.org/jira/browse/IMPALA-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5773. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/f2f52a8e1ce9560329566ee71945b3901a1ef958 > Memory limit exceeded on test_spilling.py > - > > Key: IMPALA-5773 > URL: https://issues.apache.org/jira/browse/IMPALA-5773 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0 >Reporter: Michael Ho >Assignee: Henry Robinson >Priority: Blocker > Labels: broken-build > Fix For: Impala 2.10.0 > > > {noformat} > 03:55:47 FAIL > query_test/test_spilling.py::TestSpilling::()::test_spilling[exec_option: > {'default_spillable_buffer_size': '256k'} | table_format: parquet/none] > 03:55:47 === FAILURES > === > 03:55:47 TestSpilling.test_spilling[exec_option: > {'default_spillable_buffer_size': '256k'} | table_format: parquet/none] > 03:55:47 [gw1] linux2 -- Python 2.6.6 > /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/../infra/python/env/bin/python > 03:55:47 query_test/test_spilling.py:39: in test_spilling > 03:55:47 self.run_test_case('QueryTest/spilling', vector) > 03:55:47 common/impala_test_suite.py:390: in run_test_case > 03:55:47 result = self.__execute_query(target_impalad_client, query, > user=user) > 03:55:47 common/impala_test_suite.py:598: in __execute_query > 03:55:47 return impalad_client.execute(query, user=user) > 03:55:47 common/impala_connection.py:160: in execute > 03:55:47 return self.__beeswax_client.execute(sql_stmt, user=user) > 03:55:47 beeswax/impala_beeswax.py:173: in execute > 03:55:47 handle = self.__execute_query(query_string.strip(), user=user) > 03:55:47 beeswax/impala_beeswax.py:339: in __execute_query > 03:55:47 self.wait_for_completion(handle) > 03:55:47 beeswax/impala_beeswax.py:359: in wait_for_completion > 03:55:47 raise 
ImpalaBeeswaxException("Query aborted:" + error_log, None) > 03:55:47 E ImpalaBeeswaxException: ImpalaBeeswaxException: > 03:55:47 EQuery aborted:Memory limit exceeded: Error occurred on backend > impala-boost-static-burst-slave-1c37.vpc.cloudera.com:22000 by fragment > be415e2081bde55d:ce5cb0b40001 > 03:55:47 E Memory left in process limit: 49.35 GB > 03:55:47 E Memory left in query limit: -36546.00 B > 03:55:47 E Query(be415e2081bde55d:ce5cb0b4): memory limit exceeded. > Limit=800.00 MB Reservation=640.00 MB ReservationLimit=640.00 MB > OtherMemory=160.03 MB Total=800.03 MB Peak=800.03 MB > 03:55:47 E Fragment be415e2081bde55d:ce5cb0b4: Reservation=0 > OtherMemory=12.24 KB Total=12.24 KB Peak=63.50 KB > 03:55:47 E AGGREGATION_NODE (id=6): Total=4.00 KB Peak=4.00 KB > 03:55:47 E Exprs: Total=4.00 KB Peak=4.00 KB > 03:55:47 E EXCHANGE_NODE (id=5): Total=0 Peak=0 > 03:55:47 E DataStreamRecvr: Total=0 Peak=0 > 03:55:47 E PLAN_ROOT_SINK: Total=0 Peak=0 > 03:55:47 E CodeGen: Total=247.00 B Peak=51.50 KB > 03:55:47 E Fragment be415e2081bde55d:ce5cb0b40002: Reservation=24.50 > MB OtherMemory=157.98 MB Total=182.48 MB Peak=182.48 MB > 03:55:47 E AGGREGATION_NODE (id=2): Total=4.00 KB Peak=4.00 KB > 03:55:47 E Exprs: Total=4.00 KB Peak=4.00 KB > 03:55:47 E AGGREGATION_NODE (id=4): Reservation=24.50 MB > OtherMemory=29.12 KB Total=24.53 MB Peak=25.80 MB > 03:55:47 E Exprs: Total=21.12 KB Peak=21.12 KB > 03:55:47 E EXCHANGE_NODE (id=3): Total=0 Peak=0 > 03:55:47 E DataStreamRecvr: Total=157.20 MB Peak=157.20 MB > 03:55:47 E DataStreamSender (dst_id=5): Total=16.00 KB Peak=16.00 KB > 03:55:47 E CodeGen: Total=5.09 KB Peak=373.50 KB > 03:55:47 E Fragment be415e2081bde55d:ce5cb0b40001: Reservation=615.50 > MB OtherMemory=2.04 MB Total=617.54 MB Peak=651.88 MB > 03:55:47 E AGGREGATION_NODE (id=1): Reservation=615.50 MB > OtherMemory=2.02 MB Total=617.52 MB Peak=618.70 MB > 03:55:47 E Exprs: Total=2.02 MB Peak=2.02 MB > 03:55:47 E HDFS_SCAN_NODE (id=0): Total=0 Peak=33.32 MB > 
03:55:47 E DataStreamSender (dst_id=3): Total=7.52 KB Peak=7.52 KB > 03:55:47 E CodeGen: Total=6.84 KB Peak=522.50 KB > 03:55:47 E > 03:55:47 E Memory limit exceeded: Error occurred on backend > impala-boost-static-burst-slave-1c37.vpc.cloudera.com:22000 by fragment > be415e2081bde55d:ce5cb0b40001 > 03:55:47 E Memory left in process limit: 49.35 GB > 03:55:47 E Memory left in query limit: -36546.00 B > 03:55:47 E Query(be415e2081bde55d:ce5cb0b4): memory limit exceeded. > Limit=800.00 M
[jira] [Resolved] (IMPALA-5781) thrift-server-test failed
[ https://issues.apache.org/jira/browse/IMPALA-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5781. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/cfcbfab4ff6df0092e68b169c46958467fc0ec14 > thrift-server-test failed > - > > Key: IMPALA-5781 > URL: https://issues.apache.org/jira/browse/IMPALA-5781 > Project: IMPALA > Issue Type: Bug > Components: Distributed Exec >Affects Versions: Impala 2.10.0 >Reporter: Michael Ho >Assignee: Henry Robinson >Priority: Blocker > Fix For: Impala 2.10.0 > > > [~henryr], can you please take a first look ? This test was touched by a > recent > [commit|https://github.com/apache/incubator-impala/commit/68df21b426feca8e7a458152d8dca1b7e1335bcb] > of yours > {noformat} > 15:39:18 49/86 Test #49: thrift-server-test ...***Exception: > SegFault 1.15 sec > 15:39:18 Turning perftools heap leak checking off > 15:39:18 [==] Running 12 tests from 4 test cases. > 15:39:18 [--] Global test environment set-up. 
> 15:39:18 [--] 1 test from ThriftServer > 15:39:18 [ RUN ] ThriftServer.Connectivity > 15:39:18 [ OK ] ThriftServer.Connectivity (43 ms) > 15:39:18 [--] 1 test from ThriftServer (43 ms total) > 15:39:18 > 15:39:18 [--] 7 tests from SslTest > 15:39:18 [ RUN ] SslTest.Connectivity > 15:39:18 [ OK ] SslTest.Connectivity (13 ms) > 15:39:18 [ RUN ] SslTest.BadCertificate > 15:39:18 [ OK ] SslTest.BadCertificate (3 ms) > 15:39:18 [ RUN ] SslTest.ClientBeforeServer > 15:39:18 [ OK ] SslTest.ClientBeforeServer (7 ms) > 15:39:18 [ RUN ] SslTest.BadCiphers > 15:39:18 [ OK ] SslTest.BadCiphers (5 ms) > 15:39:18 [ RUN ] SslTest.MismatchedCiphers > 15:39:18 > /data/jenkins/workspace/impala-cdh5-trunk-core-data-load/repos/Impala/be/src/rpc/thrift-server-test.cc:238: > Failure > 15:39:18 Value of: status_.ok() > 15:39:18 Actual: false > 15:39:18 Expected: true > 15:39:18 Error: SSL socket creation failed: SSL_CTX_set_cipher_list: no > cipher match > 15:39:18 > 15:39:18 > /data/jenkins/workspace/impala-cdh5-trunk-core-data-load/repos/Impala/be/src/rpc/thrift-server-test.cc:246: > Failure > 15:39:18 Value of: status_.ok() > 15:39:18 Actual: false > 15:39:18 Expected: true > 15:39:18 Error: Couldn't open transport for localhost:57370 (connect() > failed: Connection refused) > 15:39:18 > 15:39:18 [ FAILED ] SslTest.MismatchedCiphers (12 ms) > 15:39:18 [ RUN ] SslTest.MatchedCiphers > 15:39:18 > /data/jenkins/workspace/impala-cdh5-trunk-core-data-load/repos/Impala/be/src/rpc/thrift-server-test.cc:263: > Failure > 15:39:18 Value of: status_.ok() > 15:39:18 Actual: false > 15:39:18 Expected: true > 15:39:18 Error: SSL socket creation failed: SSL_CTX_set_cipher_list: no > cipher match > 15:39:18 > 15:39:18 > /data/jenkins/workspace/impala-cdh5-trunk-core-data-load/repos/Impala/be/src/rpc/thrift-server-test.cc:270: > Failure > 15:39:18 Value of: status_.ok() > 15:39:18 Actual: false > 15:39:18 Expected: true > 15:39:18 Error: SSL socket creation failed: SSL_CTX_set_cipher_list: no > 
cipher match > 15:39:18 > 15:39:18 Wrote minidump to > /data/jenkins/workspace/impala-cdh5-trunk-core-data-load/repos/Impala/logs/be_tests/minidumps/thrift-server-test/7d6ed95d-b688-43c7-b5af0284-7431e2f5.dmp > 15:39:18 Wrote minidump to > /data/jenkins/workspace/impala-cdh5-trunk-core-data-load/repos/Impala/logs/be_tests/minidumps/thrift-server-test/7d6ed95d-b688-43c7-b5af0284-7431e2f5.dmp > 15:39:18 > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5785) Purge local connection pool if node statestore marks node offline
[ https://issues.apache.org/jira/browse/IMPALA-5785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5785. Resolution: Fixed Fix Version/s: (was: Impala 2.10.0) Impala 2.0 This already happens for nodes that have ever run a query. See https://github.com/apache/incubator-impala/blob/master/be/src/service/impala-server.cc#L1573. My understanding is that this is sufficient - if you've seen a bug that's attributable to this not working, please attach some more information! We don't have a lot of testing for this path. > Purge local connection pool if node statestore marks node offline > - > > Key: IMPALA-5785 > URL: https://issues.apache.org/jira/browse/IMPALA-5785 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.9.0 >Reporter: Lars Volker >Priority: Critical > Fix For: Impala 2.0 > > > From time to time there seem to be issues with stale connection pool entries > when nodes restart. In cases where the backend receives an update from the > statestore that a node has gone offline, we should remove connections to that > node from the connection pool. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5696) Enable cipher configuration when using TLS w/Thrift
[ https://issues.apache.org/jira/browse/IMPALA-5696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5696. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/68df21b426feca8e7a458152d8dca1b7e1335bcb IMPALA-5696: Enable cipher configuration when using TLS / Thrift The 'cipher suite' is a description of the set of algorithms used by SSL and TLS to execute key exchange, encryption, message authentication, and random number generation functions. SSL implementations allow the cipher suite to be configured so that ciphers may be removed from the whitelist if they are shown to be weak. * Add a flag --ssl_cipher_list which controls cipher selection for both thrift servers and clients. Default is blank, which means use all available cipher suites. * Add ThriftServerBuilder to simplify construction of ThriftServers (whose constructors were otherwise getting very long). Testing: new tests added to thrift-server-test. Test cases added follow: * A client cannot connect to a server which does not have any ciphers in common with it. * If ciphers are identical on clients and servers, SSL connections can be made. * Bad cipher strings lead to errors on both client and server. > Enable cipher configuration when using TLS w/Thrift > --- > > Key: IMPALA-5696 > URL: https://issues.apache.org/jira/browse/IMPALA-5696 > Project: IMPALA > Issue Type: Improvement > Components: Distributed Exec >Affects Versions: Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, Impala 2.9.0 >Reporter: Henry Robinson >Assignee: Henry Robinson > Fix For: Impala 2.10.0 > > > Thrift's {{TSSLSocketFactory}} has a {{cipher()}} method that we can use to > configure the ciphers used by OpenSSL. We just need to connect it up to a > flag that the user provides. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5774) StringFunctions::FindInSet() may read one byte beyond a string's extent
[ https://issues.apache.org/jira/browse/IMPALA-5774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5774. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/5caadbbedd1917019937290e9427fd6f798f0cd8 > StringFunctions::FindInSet() may read one byte beyond a string's extent > --- > > Key: IMPALA-5774 > URL: https://issues.apache.org/jira/browse/IMPALA-5774 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.10.0 >Reporter: Henry Robinson >Assignee: Henry Robinson > Fix For: Impala 2.10.0 > > > The following may read {{str_set.ptr[str_set.len]}} if no ',' is found. > {code} > while(str_set.ptr[end] != ',' && end < str_set.len) ++end; > {code} > (This was discovered by poisoning mempool data from IMPALA-5666). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
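The over-read reported in IMPALA-5774 comes from evaluating {{str_set.ptr[end]}} before checking {{end < str_set.len}}. A minimal sketch of the reordered condition follows. It is not taken from the actual commit: the `StrView` struct and the `FindDelimiter` helper are hypothetical stand-ins introduced only for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the string value used by FindInSet();
// only the two fields that appear in the ticket's snippet are modelled.
struct StrView {
  const char* ptr;
  int len;
};

// Hypothetical helper: returns the index of the first ',' at or after
// 'start', or s.len if there is none. The bounds check is evaluated
// first, so s.ptr[s.len] is never dereferenced.
int FindDelimiter(const StrView& s, int start) {
  assert(start >= 0 && start <= s.len);
  int end = start;
  while (end < s.len && s.ptr[end] != ',') ++end;
  return end;
}
```

Because `&&` short-circuits, swapping the two operands is enough to keep every read inside the string's extent.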
[jira] [Created] (IMPALA-5775) Impala shell only supports TLSv1
Henry Robinson created IMPALA-5775: -- Summary: Impala shell only supports TLSv1 Key: IMPALA-5775 URL: https://issues.apache.org/jira/browse/IMPALA-5775 Project: IMPALA Issue Type: Bug Components: Clients Reporter: Henry Robinson Assignee: Henry Robinson Per https://docs.python.org/2/library/ssl.html, Impala shell's SSL client is configured to connect using TLSv1 only. As a result, if after IMPALA-5743 it tries to connect to a TLSv1_2 server, the connection will fail. We should change the client protocol to {{SSLv23}} (I think this is acceptable for a client - the server won't negotiate an SSL connection), which can connect to all flavours of TLS. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5774) StringFunctions::FindInSet() may read one byte beyond a string's extent
Henry Robinson created IMPALA-5774: -- Summary: StringFunctions::FindInSet() may read one byte beyond a string's extent Key: IMPALA-5774 URL: https://issues.apache.org/jira/browse/IMPALA-5774 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 2.10.0 Reporter: Henry Robinson Assignee: Henry Robinson The following may read {{str_set.ptr[str_set.len]}} if no ',' is found. {code} while(str_set.ptr[end] != ',' && end < str_set.len) ++end; {code} (This was discovered by poisoning mempool data from IMPALA-5666). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5742) Memory leak in parquet-reader
[ https://issues.apache.org/jira/browse/IMPALA-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5742. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/b55ec3f64f2a16259d4c5cd2e881701fee4c603f > Memory leak in parquet-reader > - > > Key: IMPALA-5742 > URL: https://issues.apache.org/jira/browse/IMPALA-5742 > Project: IMPALA > Issue Type: Bug >Reporter: Jim Apple >Assignee: Henry Robinson >Priority: Minor > Labels: newbie > Fix For: Impala 2.10.0 > > > Line 209 of parquet-reader {{malloc}}s memory it never frees, breaking ASAN > tests on https://jenkins.impala.io: > {noformat} > TestHdfsParquetTableWriter.test_def_level_encoding[exec_option: > {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, > 'disable_codegen': False, 'abort_on_error': 1, > 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] > [gw0] linux2 -- Python 2.7.6 > /home/ubuntu/Impala/bin/../infra/python/env/bin/python > query_test/test_insert_parquet.py:228: in test_def_level_encoding > os.path.join(tmp_dir, str(f))]) > /usr/lib/python2.7/subprocess.py:540: in check_call > raise CalledProcessError(retcode, cmd) > E CalledProcessError: Command > '['/home/ubuntu/Impala/be/build/debug/util/parquet-reader', '--file', > '/tmp/tmpbnxrl3/8948dc471cad29c8-45c9c8180003_942829264_data.0.parq']' > returned non-zero exit status 1 > {noformat} > {noformat} > ERROR: LeakSanitizer: detected memory leaks > Direct leak of 43833 byte(s) in 1 object(s) allocated from: > #0 0x1065588 in __interceptor_malloc > /data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-14-04/toolchain/source/llvm/llvm-3.8.0.src-p1/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:52 > #1 0x109b42c in main > /home/ubuntu/Impala/be/src/util/parquet-reader.cc:209:48 > #2 0x7f08e0557f44 in __libc_start_main > (/lib/x86_64-linux-gnu/libc.so.6+0x21f44) > SUMMARY: AddressSanitizer: 
43833 byte(s) leaked in 1 allocation(s). > -- executing against localhost:21000 > drop table test_def_level_encoding_54e4df6c.test_hdfs_parquet_table_writer; > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5758) Enable LSAN for tests
Henry Robinson created IMPALA-5758: -- Summary: Enable LSAN for tests Key: IMPALA-5758 URL: https://issues.apache.org/jira/browse/IMPALA-5758 Project: IMPALA Issue Type: Improvement Components: Infrastructure Affects Versions: Impala 2.10.0 Reporter: Henry Robinson Assignee: Henry Robinson [LSAN|https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer] support would be good to catch any leaks. It works well and quickly, but has a number of false or inactionable positives, mostly in the JVM. We can suppress those via a configuration file, and enable LSAN during our ASAN runs. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5743) Allow for configuration of TLS / SSL versions
Henry Robinson created IMPALA-5743: -- Summary: Allow for configuration of TLS / SSL versions Key: IMPALA-5743 URL: https://issues.apache.org/jira/browse/IMPALA-5743 Project: IMPALA Issue Type: Improvement Components: Security Affects Versions: Impala 2.8.0, Impala 2.7.0, Impala 2.9.0, Impala 2.10.0 Reporter: Henry Robinson Assignee: Henry Robinson It would be good for users to be able, via the command line, to specify acceptable TLS protocols. Users will typically want to specify a minimum protocol version (i.e. TLS1.0, 1.1 or 1.2), rather than a specific protocol version. Kudu has {{--rpc_tls_minimum_version}}, and we can follow their lead. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
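To illustrate why a minimum version is easier to work with than a specific one, here is a small sketch. Everything in it is an assumption made for illustration: the flag values ("tlsv1", "tlsv1.1", "tlsv1.2"), the `TlsVersion` enum, and both helper functions are hypothetical and do not describe Impala's or Kudu's actual implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical ordering of protocol versions, oldest first.
enum class TlsVersion { kTls1_0 = 0, kTls1_1 = 1, kTls1_2 = 2 };

// Parses a hypothetical flag value into a version.
// Returns false for strings it does not recognise.
bool ParseTlsVersion(const std::string& value, TlsVersion* out) {
  assert(out != nullptr);
  if (value == "tlsv1") { *out = TlsVersion::kTls1_0; return true; }
  if (value == "tlsv1.1") { *out = TlsVersion::kTls1_1; return true; }
  if (value == "tlsv1.2") { *out = TlsVersion::kTls1_2; return true; }
  return false;
}

// A peer's protocol is acceptable if it is at least the configured minimum.
bool IsAcceptable(TlsVersion offered, TlsVersion minimum) {
  return static_cast<int>(offered) >= static_cast<int>(minimum);
}
```

With a single ordered threshold, newer protocol versions are accepted automatically, which a flag naming one exact version would not allow.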
[jira] [Resolved] (IMPALA-5716) Switching to / from distcc can delete cmake_modules/*
[ https://issues.apache.org/jira/browse/IMPALA-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5716. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/41e3055f925093f971a3a800ae9601728ff9e37c > Switching to / from distcc can delete cmake_modules/* > - > > Key: IMPALA-5716 > URL: https://issues.apache.org/jira/browse/IMPALA-5716 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Affects Versions: Impala 2.10.0 >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Minor > Fix For: Impala 2.10.0 > > > If {{$IMPALA_HOME}} ends with a /, the {{clean_cmake_files}} function in > {{distcc_env.sh}} will emit a {{find}} command with a double // at the end > for the {{cmake_modules}} directory, and since it contains the substring > {{cmake}}, {{find}} will match and delete its contents. > Fix is to strip trailing /s from IMPALA_HOME in that method. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
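The actual fix lives in a shell script ({{distcc_env.sh}}), which is not reproduced here. Purely as an illustration of the normalisation step the ticket describes, the following C++ sketch strips trailing slashes from a path so that joining it with a subdirectory cannot produce a double //. The helper name `StripTrailingSlashes` is hypothetical.

```cpp
#include <string>

// Hypothetical helper: removes trailing '/' characters from a path.
// A lone "/" (the filesystem root) is left untouched.
std::string StripTrailingSlashes(std::string path) {
  while (path.size() > 1 && path.back() == '/') path.pop_back();
  return path;
}
```

After normalisation, `StripTrailingSlashes(home) + "/cmake_modules"` has exactly one separator whether or not the original value ended with a slash.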
[jira] [Created] (IMPALA-5729) GVO sometimes failing due to apparent kudu tablet server crash
Henry Robinson created IMPALA-5729: -- Summary: GVO sometimes failing due to apparent kudu tablet server crash Key: IMPALA-5729 URL: https://issues.apache.org/jira/browse/IMPALA-5729 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 2.10.0 Reporter: Henry Robinson Priority: Critical See e.g. https://jenkins.impala.io/job/gerrit-verify-dryrun/937/consoleFull. {code} 00:44:28 ] E HiveServer2Error: AnalysisException: Error opening Kudu table 'impala::tpch_kudu.lineitem', Kudu error: can not complete before timeout: KuduRpc(method=GetTableSchema, tablet=null, attempt=94, DeadlineTracker(timeout=18, elapsed=179403), Traces: [0ms] querying master, [0ms] Sub rpc: ConnectToMaster sending RPC to server master-127.0.0.1:7051, [0ms] Sub rpc: ConnectToMaster received from server master-127.0.0.1:7051 response Network error: [peer master-127.0.0.1:7051] connection closed, [1ms] delaying RPC due to Service unavailable: Master config (127.0.0.1:7051) has no leader. Exceptions received: org.apache.kudu.client.RecoverableException: [peer master-127.0.0.1:7051] connection closed, [22ms] querying master, [22ms] Sub rpc: ConnectToMaster sending RPC to server master-127.0.0.1:7051, [22ms] Sub rpc: ConnectToMaster received from server master-127.0.0.1:7051 response Network error: [peer master-127.0.0.1:7051] connection closed, [23ms] delaying RPC due to Service unavailable: Master config (127.0.0.1:7051) has no leader. Exceptions received: org.apache.kudu.client.RecoverableException: [peer master-127.0.0.1:7051] connection closed, [42ms] querying master, [42ms] Sub rpc: ConnectToMaster sending RPC to server master-127.0.0.1:7051, [42ms] Sub rpc: ConnectToMaster received from server master-127.0.0.1:7051 response Network error: [peer master-127.0.0.1:7051] connection closed, [43ms] delaying RPC due to Service unavailable: Master config (127.0.0.1:7051) has no leader. 
Exceptions received: org.apache.kudu.client.RecoverableException: [peer master-127.0.0.1:7051] connection closed, [62ms] querying master, [63ms] Sub rpc: ConnectToMaster sending RPC to server master-127.0.0.1:7051, [63ms] Sub rpc: ConnectToMaster received from server master-127.0.0.1:7051 response Network error: [peer master-127.0.0.1:7051] connection closed, [63ms] delaying RPC due to Service unavailable: Master config (127.0.0.1:7051) has no leader. Exceptions received: org.apache.kudu.client.RecoverableException: [peer master-127.0.0.1:7051] connection closed, [82ms] querying master, [82ms] Sub rpc: ConnectToMaster sending RPC to server master-127.0.0.1:7051, [82ms] Sub rpc: ConnectToMaster received from server master-127.0.0.1:7051 response Network error: [peer master-127.0.0.1:7051] connection closed, [83ms] delaying RPC due to Service unavailable: Master config (127.0.0.1:7051) has no leader. Exceptions received: org.apache.kudu.client.RecoverableException: [peer master-127.0.0.1:7051] connection closed, [102ms] querying master, [102ms] Sub rpc: ConnectToMaster sending RPC to server master-127.0.0.1:7051, [103ms] Sub rpc: ConnectToMaster received from server master-127.0.0.1:7051 response Network error: [peer master-127.0.0.1:7051] connection closed, [103ms] delaying RPC due to Service unavailable: Master config (127.0.0.1:7051) has no leader. Exceptions received: org.apache.kudu.client.RecoverableException: [peer master-127.0.0.1:7051] connection closed, [162ms] querying master, [162ms] Sub rpc: ConnectToMaster sending RPC to server master-127.0.0.1:7051, [162ms] Sub rpc: ConnectToMaster received from server master-127.0.0.1:7051 response Network error: [peer master-127.0.0.1:7051] connection closed, [163ms] delaying RPC due to Service unavailable: Master config (127.0.0.1:7051) has no leader. 
Exceptions received: org.apache.kudu.client.RecoverableException: [peer master-127.0.0.1:7051] connection closed, [242ms] querying master, [242ms] Sub rpc: ConnectToMaster sending RPC to server master-127.0.0.1:7051, [242ms] Sub rpc: ConnectToMaster received from server master-127.0.0.1:7051 response Network error: [peer master-127.0.0.1:7051] connection closed, [243ms] delaying RPC due to Service unavailable: Master config (127.0.0.1:7051) has no leader. Exceptions received: org.apache.kudu.client.RecoverableException: [peer master-127.0.0.1:7051] connection closed, [362ms] querying master, [362ms] Sub rpc: ConnectToMaster sending RPC to server master-127.0.0.1:7051, [362ms] Sub rpc: ConnectToMaster received from server master-127.0.0.1:7051 response Network error: [peer master-127.0.0.1:7051] connection closed, [363ms] delaying RPC due to Service unavailable: Master config (127.0.0.1:7051) has no leader. Exceptions received: org.apache.kudu.client.RecoverableException: [peer master-127.0.0.1:7051] connection closed, [763ms] q
[jira] [Created] (IMPALA-5719) ODR violation in UDF tests
Henry Robinson created IMPALA-5719: -- Summary: ODR violation in UDF tests Key: IMPALA-5719 URL: https://issues.apache.org/jira/browse/IMPALA-5719 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 2.10.0 Reporter: Henry Robinson Priority: Minor At first glance this looks to me like a test linking error (because it includes both {{ImpalaUdf}} and {{Udf}} libraries). {code} 03:59:36 ==42374==ERROR: AddressSanitizer: odr-violation (0x2acb91844a00): 03:59:36 [1] size=4 'impala::FunctionContextImpl::VARARGS_BUFFER_ALIGNMENT' /home/ubuntu/Impala/be/src/udf/udf.cc:121:32 03:59:36 [2] size=4 'impala::FunctionContextImpl::VARARGS_BUFFER_ALIGNMENT' /home/ubuntu/Impala/be/src/udf/udf.cc:121:32 03:59:36 These globals were registered at these points: 03:59:36 [1]: 03:59:36 #0 0x7d1f26 in __asan_register_globals /data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-14-04/toolchain/source/llvm/llvm-3.8.0.src-p1/projects/compiler-rt/lib/asan/asan_globals.cc:218 03:59:36 #1 0x2acb9183a36b in asan.module_ctor (/home/ubuntu/Impala/be/build/debug/udf/libImpalaUdf.so+0x1936b) 03:59:36 03:59:36 [2]: 03:59:36 #0 0x7d1f26 in __asan_register_globals /data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-14-04/toolchain/source/llvm/llvm-3.8.0.src-p1/projects/compiler-rt/lib/asan/asan_globals.cc:218 03:59:36 #1 0x2acb95260d2b in asan.module_ctor (/home/ubuntu/Impala/be/build/debug/udf/libUdf.so+0x1fd2b) 03:59:36 03:59:36 ==42374==HINT: if you don't care about these errors you may set ASAN_OPTIONS=detect_odr_violation=0 03:59:36 SUMMARY: AddressSanitizer: odr-violation: global 'impala::FunctionContextImpl::VARARGS_BUFFER_ALIGNMENT' at /home/ubuntu/Impala/be/src/udf/udf.cc:121:32 03:59:36 ==42374==ABORTING {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5709) Remove mini-impala-cluster
[ https://issues.apache.org/jira/browse/IMPALA-5709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5709. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/d2d7328dd3aa1051bcb5329c5bea8bdc1850d281 > Remove mini-impala-cluster > -- > > Key: IMPALA-5709 > URL: https://issues.apache.org/jira/browse/IMPALA-5709 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Affects Versions: Impala 2.10.0 >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Minor > Fix For: Impala 2.10.0 > > > As far as I know, {{mini-impala-cluster}} isn't used by any tests, nor does > any developer I know use it. Let's remove it - better to run real Impala > processes locally anyhow. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5716) Switching to / from distcc can delete cmake_modules/*
Henry Robinson created IMPALA-5716: -- Summary: Switching to / from distcc can delete cmake_modules/* Key: IMPALA-5716 URL: https://issues.apache.org/jira/browse/IMPALA-5716 Project: IMPALA Issue Type: Improvement Components: Infrastructure Affects Versions: Impala 2.10.0 Reporter: Henry Robinson Assignee: Henry Robinson Priority: Minor If {{$IMPALA_HOME}} ends with a /, the {{clean_cmake_files}} function in {{distcc_env.sh}} will emit a {{find}} command with a double // at the end for the {{cmake_modules}} directory, and since it contains the substring {{cmake}}, {{find}} will match and delete its contents. Fix is to strip trailing /s from IMPALA_HOME in that method. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-4905) Fragments always report insert status, even if not insert query
[ https://issues.apache.org/jira/browse/IMPALA-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-4905. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/d25db64f0e17092af9ef60eb37ec9214900c2d1c > Fragments always report insert status, even if not insert query > --- > > Key: IMPALA-4905 > URL: https://issues.apache.org/jira/browse/IMPALA-4905 > Project: IMPALA > Issue Type: Sub-task > Components: Distributed Exec >Affects Versions: Impala 2.7.0 >Reporter: Henry Robinson >Assignee: Henry Robinson > Fix For: Impala 2.10.0 > > > {code} > if (done) { > TInsertExecStatus insert_status; > if (runtime_state->hdfs_files_to_move()->size() > 0) { > > insert_status.__set_files_to_move(*runtime_state->hdfs_files_to_move()); > } > if (runtime_state->per_partition_status()->size() > 0) { > > insert_status.__set_per_partition_status(*runtime_state->per_partition_status()); > } > params.__set_insert_exec_status(insert_status); > } > {code} > This means that any fragment will always set {{insert_exec_status}} in its > response, even if it's not an INSERT query. > However, in the RPC handler, {{Coordinator::UpdateFragmentExecStatus()}}, we > have: > {code} > if (params.done && params.__isset.insert_exec_status) { > lock_guard l(lock_); > // Merge in table update data (partitions written to, files to be moved > as part of > // finalization) > for (const PartitionStatusMap::value_type& partition: > params.insert_exec_status.per_partition_status) { > // etc > {code} > which means that the RPC will always try and take the query exec state lock, > for every 'done' report. With lots of fragment instances, this can lead to > some severe serialisation of reports when the query finishes. > The simplest workaround is not to set {{insert_exec_status}} for {{SELECT}} > queries. 
But a better solution (that will help INSERTs as well) is not to try > and do the merge here, but instead in > {{Coordinator::FinalizeSuccessfulInsert()}}, saving the {{TInsertExecStatus}} > in the fragment instance state until that point. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
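The "simplest workaround" from IMPALA-4905 can be sketched as follows. This is not the code from the actual commit: the structs are simplified, hypothetical stand-ins for the Thrift-generated types, and the choice to skip the status whenever both maps are empty is an assumption about one way to avoid setting it for non-INSERT queries.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for TInsertExecStatus.
struct InsertExecStatus {
  std::map<std::string, std::string> files_to_move;
  std::map<std::string, long> per_partition_status;
};

// Hypothetical stand-in for the status report parameters.
struct ReportParams {
  bool done = false;
  bool has_insert_exec_status = false;  // models __isset.insert_exec_status
  InsertExecStatus insert_exec_status;
};

// Sender side: attach the insert status only if there is something to merge.
void MaybeSetInsertStatus(const InsertExecStatus& status, ReportParams* params) {
  assert(params != nullptr);
  if (status.files_to_move.empty() && status.per_partition_status.empty()) return;
  params->insert_exec_status = status;
  params->has_insert_exec_status = true;
}

// Coordinator side: models the condition under which the lock is taken.
bool NeedsCoordinatorLock(const ReportParams& params) {
  return params.done && params.has_insert_exec_status;
}
```

With this shape, a 'done' report from a SELECT fragment leaves the field unset, so the handler's existing `__isset` check skips the lock without any change on the coordinator side.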
[jira] [Created] (IMPALA-5709) Remove mini-impala-cluster
Henry Robinson created IMPALA-5709: -- Summary: Remove mini-impala-cluster Key: IMPALA-5709 URL: https://issues.apache.org/jira/browse/IMPALA-5709 Project: IMPALA Issue Type: Improvement Components: Infrastructure Affects Versions: Impala 2.10.0 Reporter: Henry Robinson Priority: Minor As far as I know, {{mini-impala-cluster}} isn't used by any tests, nor does any developer I know use it. Let's remove it - better to run real Impala processes locally anyhow. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5703) TestAdmissionControllerStress::test_admission_controller_with_flags fails intermittently in GVO
[ https://issues.apache.org/jira/browse/IMPALA-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5703. Resolution: Duplicate Dupe of IMPALA-5702 > TestAdmissionControllerStress::test_admission_controller_with_flags fails > intermittently in GVO > --- > > Key: IMPALA-5703 > URL: https://issues.apache.org/jira/browse/IMPALA-5703 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.10.0 >Reporter: Henry Robinson > > For example: > https://jenkins.impala.io/view/Gerrit/job/gerrit-verify-dryrun/922/console > {{custom_cluster/test_admission_controller.py::TestAdmissionControllerStress::test_admission_controller_with_flags[num_queries: > 30 | submission_delay_ms: 0 | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > text/none | round_robin_submission: True] FAILED}} > {code} > 09:19:30 ] Thread-3: ImpalaBeeswaxException: > 09:19:30 ] INNER EXCEPTION: > 09:19:30 ] MESSAGE: std::bad_cast > 09:19:30 ] Traceback (most recent call last): > 09:19:30 ] File > "/home/ubuntu/Impala/tests/custom_cluster/test_admission_controller.py", line > 592, in run > 09:19:30 ] raise e > 09:19:30 ] ImpalaBeeswaxException: ImpalaBeeswaxException: > 09:19:30 ] INNER EXCEPTION: > 09:19:30 ] MESSAGE: std::bad_cast > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5703) TestAdmissionControllerStress::test_admission_controller_with_flags fails intermittently in GVO
Henry Robinson created IMPALA-5703: -- Summary: TestAdmissionControllerStress::test_admission_controller_with_flags fails intermittently in GVO Key: IMPALA-5703 URL: https://issues.apache.org/jira/browse/IMPALA-5703 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 2.10.0 Reporter: Henry Robinson For example: https://jenkins.impala.io/view/Gerrit/job/gerrit-verify-dryrun/922/console {{custom_cluster/test_admission_controller.py::TestAdmissionControllerStress::test_admission_controller_with_flags[num_queries: 30 | submission_delay_ms: 0 | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: text/none | round_robin_submission: True] FAILED}} {code} 09:19:30 ] Thread-3: ImpalaBeeswaxException: 09:19:30 ] INNER EXCEPTION: 09:19:30 ] MESSAGE: std::bad_cast 09:19:30 ] Traceback (most recent call last): 09:19:30 ] File "/home/ubuntu/Impala/tests/custom_cluster/test_admission_controller.py", line 592, in run 09:19:30 ] raise e 09:19:30 ] ImpalaBeeswaxException: ImpalaBeeswaxException: 09:19:30 ] INNER EXCEPTION: 09:19:30 ] MESSAGE: std::bad_cast {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5702) TestAdmissionControllerStress::test_admission_controller_with_flags fails intermittently in GVO
Henry Robinson created IMPALA-5702: -- Summary: TestAdmissionControllerStress::test_admission_controller_with_flags fails intermittently in GVO Key: IMPALA-5702 URL: https://issues.apache.org/jira/browse/IMPALA-5702 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 2.10.0 Reporter: Henry Robinson For example: https://jenkins.impala.io/view/Gerrit/job/gerrit-verify-dryrun/922/console {{custom_cluster/test_admission_controller.py::TestAdmissionControllerStress::test_admission_controller_with_flags[num_queries: 30 | submission_delay_ms: 0 | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: text/none | round_robin_submission: True] FAILED}} {code} 09:19:30 ] Thread-3: ImpalaBeeswaxException: 09:19:30 ] INNER EXCEPTION: 09:19:30 ] MESSAGE: std::bad_cast 09:19:30 ] Traceback (most recent call last): 09:19:30 ] File "/home/ubuntu/Impala/tests/custom_cluster/test_admission_controller.py", line 592, in run 09:19:30 ] raise e 09:19:30 ] ImpalaBeeswaxException: ImpalaBeeswaxException: 09:19:30 ] INNER EXCEPTION: 09:19:30 ] MESSAGE: std::bad_cast {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5532) Don't heap-allocate compressor objects in RowBatch
[ https://issues.apache.org/jira/browse/IMPALA-5532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5532. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/f3d8ccdf0f19b0b4077df517cf604a863c55bb37 > Don't heap-allocate compressor objects in RowBatch > -- > > Key: IMPALA-5532 > URL: https://issues.apache.org/jira/browse/IMPALA-5532 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.10.0 >Reporter: Henry Robinson >Assignee: Henry Robinson > Fix For: Impala 2.10.0 > > > Every call to {{RowBatch::RowBatch(..., const TRowBatch&, ...)}} or > {{RowBatch::Serialize()}} creates a (de)compressor object. That uses the > {{Codec::CreateCompressor()}} interface which returns a pointer to a > {{Codec}} object, so that the virtual compression interface can be used. > However, we always use LZ4 compression, and so needlessly heap-allocate the > compressor objects to get the advantage of implementation hiding which we > don't actually need. We should just declare a stack-allocated LZ4 > (de)compressor when needed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
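The difference between the two allocation styles discussed in IMPALA-5532 can be sketched as below. The classes here are hypothetical and deliberately tiny (they only report a name rather than compress anything); they are not Impala's real {{Codec}} hierarchy.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical virtual interface, standing in for the real Codec class.
class Codec {
 public:
  virtual ~Codec() = default;
  virtual std::string Name() const = 0;
};

// Hypothetical concrete implementation.
class Lz4Codec final : public Codec {
 public:
  std::string Name() const override { return "lz4"; }
};

// Factory style: one heap allocation per call, implementation hidden.
std::unique_ptr<Codec> CreateCodec() {
  return std::unique_ptr<Codec>(new Lz4Codec());
}

std::string NameViaFactory() {
  std::unique_ptr<Codec> codec = CreateCodec();
  assert(codec != nullptr);
  return codec->Name();
}

// Direct style: the concrete type is known, so it can live on the stack.
std::string NameViaStackObject() {
  Lz4Codec codec;
  return codec.Name();
}
```

When only one implementation is ever used, the second form gives the same result without the allocation or the indirection through the factory.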
[jira] [Resolved] (IMPALA-5688) Speed up a couple of heavy-hitting expr-tests
[ https://issues.apache.org/jira/browse/IMPALA-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5688. Resolution: Fixed https://github.com/apache/incubator-impala/commit/1653419bd8b3748bbc0e3d5e7ffa1d412bc4b50f > Speed up a couple of heavy-hitting expr-tests > - > > Key: IMPALA-5688 > URL: https://issues.apache.org/jira/browse/IMPALA-5688 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Affects Versions: Impala 2.9.0 >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Minor > Fix For: Impala 2.10.0 > > > Two tests ({{LongReverse}} and the base64 tests in {{StringFunctions}}) run > their tests over all lengths from 0..{{some length}}. Both take several > minutes to complete. This adds a lot of runtime for not much more confidence. > If instead we pick a set of 'interesting' (including powers-of-two, prime > numbers, edge-cases) lengths, we can get a similar amount of confidence while > significantly reducing the runtime of expr-test. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
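One way to pick a set of 'interesting' lengths is sketched below. The helper `InterestingLengths` and the particular primes it includes are assumptions made for illustration, not the selection used in the actual commit.

```cpp
#include <set>
#include <vector>

// Hypothetical helper: returns a sorted list of test lengths in
// [0, max_len] (max_len must be >= 0). It includes the two end points,
// every power of two with its immediate neighbours, and a few primes.
std::vector<int> InterestingLengths(int max_len) {
  std::set<int> lengths = {0, max_len};
  for (long long p = 1; p <= max_len; p *= 2) {
    lengths.insert(static_cast<int>(p - 1));
    lengths.insert(static_cast<int>(p));
    if (p + 1 <= max_len) lengths.insert(static_cast<int>(p + 1));
  }
  // An arbitrary handful of primes, chosen only as an example.
  for (int prime : {3, 7, 13, 31, 61, 127, 251, 509, 1021}) {
    if (prime <= max_len) lengths.insert(prime);
  }
  return std::vector<int>(lengths.begin(), lengths.end());
}
```

The number of lengths grows roughly with the logarithm of `max_len`, so a test that loops over this list does far fewer iterations than one that loops over every length from 0 to `max_len`.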
[jira] [Resolved] (IMPALA-3937) Deprecate --be_service_threads
[ https://issues.apache.org/jira/browse/IMPALA-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-3937. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/ed7324431d16a37a279d730a036197fc9019c3ce > Deprecate --be_service_threads > -- > > Key: IMPALA-3937 > URL: https://issues.apache.org/jira/browse/IMPALA-3937 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.6.0 >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Minor > Labels: newbie > Fix For: Impala 2.10.0 > > > {{be_service_threads}} hasn't done anything in probably 4+ years. We should > deprecate it (in the flags text) and stop referring to it in the code (it's > passed as a constructor parameter to {{ThriftServer}}, but it doesn't have > any effect there). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-3655) Upgrade Thrift dependency to 0.9.2 or 0.9.3
[ https://issues.apache.org/jira/browse/IMPALA-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-3655. Resolution: Duplicate Just discovered this, which is a dupe of the ticket I filed yesterday (IMPALA-5690) - I put all my notes there, so closing this one. > Upgrade Thrift dependency to 0.9.2 or 0.9.3 > --- > > Key: IMPALA-3655 > URL: https://issues.apache.org/jira/browse/IMPALA-3655 > Project: IMPALA > Issue Type: Improvement > Components: Distributed Exec >Affects Versions: Impala 2.6.0 >Reporter: Henry Robinson >Assignee: Henry Robinson > > We should upgrade Thrift to pull in some needed bugfixes and improvements. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5696) Enable cipher configuration when using TLS w/Thrift
Henry Robinson created IMPALA-5696: -- Summary: Enable cipher configuration when using TLS w/Thrift Key: IMPALA-5696 URL: https://issues.apache.org/jira/browse/IMPALA-5696 Project: IMPALA Issue Type: Improvement Components: Distributed Exec Affects Versions: Impala 2.8.0, Impala 2.6.0, Impala 2.7.0, Impala 2.9.0 Reporter: Henry Robinson Thrift's {{TSSLSocketFactory}} has a {{cipher()}} method that we can use to configure the ciphers used by OpenSSL. We just need to connect it up to a flag that the user provides. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5690) Upgrade Thrift version to 0.9.3
Henry Robinson created IMPALA-5690: -- Summary: Upgrade Thrift version to 0.9.3 Key: IMPALA-5690 URL: https://issues.apache.org/jira/browse/IMPALA-5690 Project: IMPALA Issue Type: Improvement Components: Backend Affects Versions: Impala 2.9.0 Reporter: Henry Robinson There are several good reasons to move from Thrift 0.9.0 to 0.9.3, including harmonization with other projects that we link against in one form or another. I have started to investigate upgrading, and it's not trivial. Here are the things I've run into:
1. 0.9.3 defines operator<< for all Thrift structures, conflicting with some of our bespoke implementations.
2. {{TAcceptQueueServer}} is written against an old server interface, and needs to be updated.
3. To build on all the platforms that I care about, a modern Bison install is necessary (this is an issue for native-toolchain, not Apache Impala).
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5688) Speed up a couple of heavy-hitting expr-tests
Henry Robinson created IMPALA-5688: -- Summary: Speed up a couple of heavy-hitting expr-tests Key: IMPALA-5688 URL: https://issues.apache.org/jira/browse/IMPALA-5688 Project: IMPALA Issue Type: Improvement Components: Infrastructure Affects Versions: Impala 2.9.0 Reporter: Henry Robinson Assignee: Henry Robinson Priority: Minor Fix For: Impala 2.10.0 Two tests ({{LongReverse}} and the base64 tests in {{StringFunctions}}) run their tests over all lengths from 0..{{some length}}. Both take several minutes to complete. This adds a lot of runtime for not much more confidence. If instead we pick a set of 'interesting' (including powers-of-two, prime numbers, edge-cases) lengths, we can get a similar amount of confidence while significantly reducing the runtime of expr-test. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
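The idea of testing only 'interesting' lengths can be sketched as follows. This is a minimal illustration, not the change that was made to expr-test: the function names and the `kPrimeCap` bound are hypothetical, and the choice of "powers of two, their neighbours, small primes, and the two endpoints" is one assumed reading of the ticket.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical cap: only primes up to this value are added to the set.
constexpr std::size_t kPrimeCap = 64;

// Trial division is sufficient for the small values used here.
bool IsPrime(std::size_t n) {
  if (n < 2) return false;
  for (std::size_t d = 2; d * d <= n; ++d) {
    if (n % d == 0) return false;
  }
  return true;
}

// Returns a sorted, de-duplicated set of lengths in [0, max_len]:
// the endpoints, powers of two with their neighbours, and small primes.
std::vector<std::size_t> InterestingLengths(std::size_t max_len) {
  std::vector<std::size_t> lengths = {0, max_len};
  for (std::size_t p = 1; p <= max_len; p *= 2) {
    lengths.push_back(p);
    if (p > 1) lengths.push_back(p - 1);
    if (p + 1 <= max_len) lengths.push_back(p + 1);
    if (p > max_len / 2) break;  // Stop before p *= 2 could overflow.
  }
  for (std::size_t n = 2; n <= max_len && n <= kPrimeCap; ++n) {
    if (IsPrime(n)) lengths.push_back(n);
  }
  std::sort(lengths.begin(), lengths.end());
  lengths.erase(std::unique(lengths.begin(), lengths.end()), lengths.end());
  return lengths;
}
```

A test would then loop over `InterestingLengths(max_len)` instead of every value in 0..max_len, so the number of iterations grows roughly logarithmically with the maximum length rather than linearly.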
[jira] [Resolved] (IMPALA-4925) Coordinator does not cancel fragments if query completes w/limit
[ https://issues.apache.org/jira/browse/IMPALA-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-4925. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/5bb48ed71dc8272fdabac45a33b515cdd0d5f12d > Coordinator does not cancel fragments if query completes w/limit > > > Key: IMPALA-4925 > URL: https://issues.apache.org/jira/browse/IMPALA-4925 > Project: IMPALA > Issue Type: Sub-task > Components: Distributed Exec >Affects Versions: Impala 2.8.0 >Reporter: Henry Robinson >Assignee: Henry Robinson > Fix For: Impala 2.10.0 > > > If a plan has a limit, the coordinator will eventually set > {{Coordinator::returned_all_results_}} once the limit has been hit. At this > point, it should start to cancel fragment instances that are still running. > This happens usually either through an explicit cancel RPC, or returning a > non-OK status to the heartbeat {{ReportExecStatus()}} RPC. In the limit case, > neither happen - the query status is not set to {{\!ok()}} (because the query > succeeded!), so there's no 'bad' status to propagate to the fragment instance. > In many cases this doesn't matter because the cancellation propagates from > the top down: the root instance will get closed and go away, and then any > senders to that instance will notice and cancel themselves, and so on. But > there are plan shapes that mean a lot of CPU time is wasted after the query > should have finished, e.g.: > {code} > with l as (select 1 from functional.alltypes group by month), r as > (select count(*) from lineitem a CROSS JOIN lineitem b) > SELECT * from l UNION ALL (select * from r) LIMIT 2{code} > This convoluted query illustrates the idea: table {{l}} is the left union > child, and gets evaluated first. It produces more than two rows, so the limit > gets hit. The right child, in the meantime, is evaluating the cross join > before the aggregation, which is very cpu heavy. 
When the limit is hit, the > query hangs (from the client's perspective), waiting for the right child to > produce no results. > The fix for this is easy: fragment instances should learn about query > termination from {{ReportExecStatus()}} RPCs. If {{results_returned_}} is > true, the coordinator should return a non-OK status, causing instance > tear-down next time the instance checks its cancellation state. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
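The proposed fix can be sketched roughly as below. All names here are hypothetical stand-ins rather than Impala's real types; the sketch only shows the decision the coordinator would make when answering a status report, assuming a simple two-value status code.

```cpp
// Hypothetical stand-in for a status code; the real Status class is richer.
enum class StatusCodeSketch { OK, CANCELLED };

// Hypothetical stand-in for the coordinator state relevant to the fix.
struct CoordinatorSketch {
  bool returned_all_results = false;
  StatusCodeSketch query_status = StatusCodeSketch::OK;

  // Status to send back in response to a fragment instance's report.
  StatusCodeSketch HandleReportExecStatus() const {
    // An existing error always wins.
    if (query_status != StatusCodeSketch::OK) return query_status;
    // The query succeeded but has already returned everything it needs
    // (for example, a limit was hit): tell the instance to stop working.
    if (returned_all_results) return StatusCodeSketch::CANCELLED;
    return StatusCodeSketch::OK;
  }
};
```

The overall query status stays OK in this sketch; only the per-report reply changes, so the client still sees a successful query.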
[jira] [Resolved] (IMPALA-5670) Remove redundant c'tor code from ExecEnv
[ https://issues.apache.org/jira/browse/IMPALA-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5670. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/ab287955d00939531b5bc6b9871fcb24def9d38e > Remove redundant c'tor code from ExecEnv > > > Key: IMPALA-5670 > URL: https://issues.apache.org/jira/browse/IMPALA-5670 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.9.0 >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Minor > Fix For: Impala 2.10.0 > > > {{ExecEnv}} has two constructors that do pretty much the same thing. We > should use a delegating constructor from one to the other to reduce code > duplication. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5684) Use gtest's sharding support to parallelise long-running be-tests
Henry Robinson created IMPALA-5684: -- Summary: Use gtest's sharding support to parallelise long-running be-tests Key: IMPALA-5684 URL: https://issues.apache.org/jira/browse/IMPALA-5684 Project: IMPALA Issue Type: Improvement Components: Infrastructure Affects Versions: Impala 2.9.0 Reporter: Henry Robinson Assignee: Henry Robinson Googletest has support for sharding test cases from a single test across different processes. We could use this to speed up the execution of some backend tests - particularly {{expr-test}}. The runtime of each expr-test is heavily skewed, but once we have sharding we can make the test cases a bit more fine-grained and then automatically get better performance as the sharding handles the work balancing. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
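Googletest reads the `GTEST_TOTAL_SHARDS` and `GTEST_SHARD_INDEX` environment variables to decide which tests a given process runs. The sketch below models that with a simple round-robin rule, which is an assumption about the assignment scheme made for illustration; the helper names are hypothetical. It shows why finer-grained test cases help when runtimes are skewed.

```cpp
#include <cstddef>
#include <vector>

// Assumed round-robin rule: a test runs on the shard matching its index.
bool ShouldRunOnShard(std::size_t test_index, std::size_t total_shards,
                      std::size_t shard_index) {
  return test_index % total_shards == shard_index;
}

// Sums the runtime (in seconds) that each shard would execute.
std::vector<int> ShardRuntimes(const std::vector<int>& test_seconds,
                               std::size_t total_shards) {
  std::vector<int> totals(total_shards, 0);
  for (std::size_t i = 0; i < test_seconds.size(); ++i) {
    for (std::size_t shard = 0; shard < total_shards; ++shard) {
      if (ShouldRunOnShard(i, total_shards, shard)) {
        totals[shard] += test_seconds[i];
      }
    }
  }
  return totals;
}
```

With two shards, one 10-second test plus three 1-second tests gives shards of 11 and 2 seconds. Splitting the 10-second test into two 5-second cases gives 7 and 6 seconds, which is the balancing effect the ticket describes.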
[jira] [Resolved] (IMPALA-5659) glog / gflags should be dynamically linked if Impala is
[ https://issues.apache.org/jira/browse/IMPALA-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5659. Resolution: Fixed Fix Version/s: Impala 2.10.0 Toolchain commit: https://github.com/cloudera/native-toolchain/commit/f32e122eaa9932f52b7c3f4c205045f3522e88dd Impala commit: https://github.com/apache/incubator-impala/commit/d79e01ef9fec559d4ebe57d41539f4e4164ae78f > glog / gflags should be dynamically linked if Impala is > --- > > Key: IMPALA-5659 > URL: https://issues.apache.org/jira/browse/IMPALA-5659 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Affects Versions: Impala 2.9.0 >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Minor > Fix For: Impala 2.10.0 > > > The glog and gflags libraries are currently always statically linked against > Impala, whether or not BUILD_SHARED_LIBS is true. > However, that can cause a problem if one of our libraries itself tries to > link against glog or gflags: the google library will be linked twice in the > final binary, and that causes problems for these particular libraries that > require that they are linked at most once. > The proposed fix is to dynamically link glog and gflags if BUILD_SHARED_LIBS > is true. > This is not an issue in our current code, but making this fix future-proofs > us against running into the problem later (and the kudu util library has this > exact issue, as it tries to link against glog directly). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5673) Track exchange node buffers memory as part of memory reservation
[ https://issues.apache.org/jira/browse/IMPALA-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5673. Resolution: Duplicate Duplicate of IMPALA-5485 (feel free to move that to a sub-task if you need for tracking). > Track exchange node buffers memory as part of memory reservation > - > > Key: IMPALA-5673 > URL: https://issues.apache.org/jira/browse/IMPALA-5673 > Project: IMPALA > Issue Type: Sub-task > Components: Backend >Reporter: Mostafa Mokhtar >Assignee: Tim Armstrong > > Queries with a large number of exchange operators end up with untracked > memory in the form of buffer space allocated per DataStreamRecvr. > exchg_node_buffer_size_bytes can be used to calculate how much memory will be > used by DataStreamRecvr. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5671) Union node may evaluate all children even if limit is reached
Henry Robinson created IMPALA-5671: -- Summary: Union node may evaluate all children even if limit is reached Key: IMPALA-5671 URL: https://issues.apache.org/jira/browse/IMPALA-5671 Project: IMPALA Issue Type: Improvement Components: Backend Affects Versions: Impala 2.9.0 Reporter: Henry Robinson The loop inside {{UnionNode::GetNextMaterialized()}} does not break if the limit has been reached. See [here|https://github.com/apache/incubator-impala/blob/master/be/src/exec/union-node.cc#L193]. The only way the loop can be broken is if either the children are exhausted, or the current row batch becomes full. If you have a union node with a limit of 1, and two children - the first of which is very cheap to evaluate and returns one row, but the second is very expensive - the union node will try to fill an entire row batch with rows, and end up waiting on the second child, even though the node could be finished after reading one row from the first child. The result is a query that takes much longer to complete than it should. Here's an example: {code} with l as (select 1 from functional.alltypes group by month), r as (select count(*) from lineitem a CROSS JOIN lineitem b) SELECT * from l UNION ALL (select * from r) LIMIT 2 {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
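The behaviour the ticket asks for can be sketched with a simplified loop. This is not the code in union-node.cc: `ChildStub` and `UnionWithLimit` are hypothetical, rows are plain integers, and a child counts as "evaluated" as soon as the loop touches it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical child operator: a list of rows plus a flag recording
// whether the union ever asked it for data.
struct ChildStub {
  std::vector<int> rows;
  bool evaluated = false;
};

// Fills one output batch, stopping as soon as the limit is reached
// rather than only when the batch is full or the children run out.
std::vector<int> UnionWithLimit(std::vector<ChildStub>& children,
                                std::size_t limit,
                                std::size_t batch_capacity) {
  std::vector<int> batch;
  for (ChildStub& child : children) {
    // The limit check here is the piece the ticket says is missing.
    if (batch.size() >= limit || batch.size() >= batch_capacity) break;
    child.evaluated = true;
    for (int row : child.rows) {
      if (batch.size() >= limit || batch.size() >= batch_capacity) break;
      batch.push_back(row);
    }
  }
  return batch;
}
```

With a limit of 1 and a cheap first child that yields one row, the second (expensive) child is never evaluated in this sketch.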
[jira] [Created] (IMPALA-5670) Remove redundant c'tor code from ExecEnv
Henry Robinson created IMPALA-5670: -- Summary: Remove redundant c'tor code from ExecEnv Key: IMPALA-5670 URL: https://issues.apache.org/jira/browse/IMPALA-5670 Project: IMPALA Issue Type: Improvement Components: Backend Affects Versions: Impala 2.9.0 Reporter: Henry Robinson Assignee: Henry Robinson Priority: Minor {{ExecEnv}} has two constructors that do pretty much the same thing. We should use a delegating constructor from one to the other to reduce code duplication. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
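A delegating constructor (C++11) lets one constructor forward to another so the initialisation logic lives in one place. The class below is a hypothetical stand-in, not the real {{ExecEnv}}; the member names and the default values are placeholders chosen for illustration.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a class with two near-identical constructors.
class EnvSketch {
 public:
  // Delegates to the two-argument constructor with placeholder defaults.
  EnvSketch() : EnvSketch("localhost", 22000) {}

  // The only constructor that actually initialises members.
  EnvSketch(std::string hostname, int backend_port)
      : hostname_(std::move(hostname)), backend_port_(backend_port) {}

  const std::string& hostname() const { return hostname_; }
  int backend_port() const { return backend_port_; }

 private:
  std::string hostname_;
  int backend_port_;
};
```

Any member added later only needs to be initialised in the target constructor, which is the duplication the ticket wants to remove.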
[jira] [Created] (IMPALA-5659) glog / gflags should be dynamically linked if Impala is
Henry Robinson created IMPALA-5659: -- Summary: glog / gflags should be dynamically linked if Impala is Key: IMPALA-5659 URL: https://issues.apache.org/jira/browse/IMPALA-5659 Project: IMPALA Issue Type: Improvement Components: Infrastructure Reporter: Henry Robinson Assignee: Henry Robinson Priority: Minor The glog and gflags libraries are currently always statically linked against Impala, whether or not BUILD_SHARED_LIBS is true. However, that can cause a problem if one of our libraries itself tries to link against glog or gflags: the google library will be linked twice in the final binary, and that causes problems for these particular libraries that require that they are linked at most once. The proposed fix is to dynamically link glog and gflags if BUILD_SHARED_LIBS is true. This is not an issue in our current code, but making this fix future-proofs us against running into the problem later (and the kudu util library has this exact issue, as it tries to link against glog directly). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5481) RowDescriptors should be shared, rather than copied
[ https://issues.apache.org/jira/browse/IMPALA-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5481. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/317c413a00bd9b3b29eeaf2efe556c2e924e2d74 > RowDescriptors should be shared, rather than copied > --- > > Key: IMPALA-5481 > URL: https://issues.apache.org/jira/browse/IMPALA-5481 > Project: IMPALA > Issue Type: Improvement >Affects Versions: Impala 2.10.0 >Reporter: Henry Robinson >Assignee: Henry Robinson > Fix For: Impala 2.10.0 > > > One of the {{RowBatch}} c'tors copies the row descriptor into the row batch. > This leads to a lot of allocation churn since {{RowDescriptor}} contains some > vector members, and since the descriptor is usually the same the copies are > unnecessary. > Instead, we should consider allocating the {{RowDescriptor}} once from an > object pool, and sharing it amongst all row batches that need that > descriptor. > In some tests, {{RowDescriptor()}} shows up as 20% of the tcmalloc allocation > time. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-1514) DataStreamSender (and possible Coordinator) has too many sender threads
[ https://issues.apache.org/jira/browse/IMPALA-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-1514. Resolution: Duplicate Fix Version/s: Product Backlog Covered by IMPALA-2567 > DataStreamSender (and possible Coordinator) has too many sender threads > --- > > Key: IMPALA-1514 > URL: https://issues.apache.org/jira/browse/IMPALA-1514 > Project: IMPALA > Issue Type: Bug > Components: Distributed Exec >Affects Versions: Impala 2.0 >Reporter: Alan Choi >Assignee: Henry Robinson >Priority: Minor > Labels: performance > Fix For: Product Backlog > > > DataStreamSender creates one thread per EXCHANGE destination per query. On a > large cluster with a highly concurrent workload, this will create too many > threads. The immediate impact is that the thread creation time is dominating > the query execution time (i.e. the prepare time is getting very high). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5532) Don't heap-allocate compressor objects in RowBatch
Henry Robinson created IMPALA-5532: -- Summary: Don't heap-allocate compressor objects in RowBatch Key: IMPALA-5532 URL: https://issues.apache.org/jira/browse/IMPALA-5532 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 2.10.0 Reporter: Henry Robinson Assignee: Henry Robinson Every call to {{RowBatch::RowBatch(..., const TRowBatch&, ...)}} or {{RowBatch::Serialize()}} creates a (de)compressor object. That uses the {{Codec::CreateCompressor()}} interface which returns a pointer to a {{Codec}} object, so that the virtual compression interface can be used. However, we always use LZ4 compression, and so needlessly heap-allocate the compressor objects to get the advantage of implementation hiding which we don't actually need. We should just declare a stack-allocated LZ4 (de)compressor when needed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
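The difference between the two approaches can be sketched as follows. Everything here is a hypothetical stand-in: `CodecSketch` plays the role of the virtual interface, and `FixedCodecSketch` simply reverses its input instead of running LZ4, so that the example stays self-contained.

```cpp
#include <memory>
#include <string>

// Hypothetical virtual interface, standing in for a generic codec.
class CodecSketch {
 public:
  virtual ~CodecSketch() = default;
  virtual std::string Process(const std::string& in) const = 0;
};

// Stand-in for the one concrete codec that is always used.
// It reverses the bytes; it does NOT implement LZ4.
class FixedCodecSketch final : public CodecSketch {
 public:
  std::string Process(const std::string& in) const override {
    return std::string(in.rbegin(), in.rend());
  }
};

// Before: every call heap-allocates a codec behind the interface.
std::string SerializeViaFactory(const std::string& payload) {
  std::unique_ptr<CodecSketch> codec(new FixedCodecSketch());
  return codec->Process(payload);
}

// After: the concrete type is known, so it can live on the stack.
std::string SerializeOnStack(const std::string& payload) {
  FixedCodecSketch codec;  // Automatic storage, no allocator call.
  return codec.Process(payload);
}
```

Both functions produce the same output; the second avoids one allocation and one deallocation per call, which matters when the call sits on a hot path.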
[jira] [Created] (IMPALA-5528) tcmalloc contention much higher with concurrency after KRPC patch
Henry Robinson created IMPALA-5528: -- Summary: tcmalloc contention much higher with concurrency after KRPC patch Key: IMPALA-5528 URL: https://issues.apache.org/jira/browse/IMPALA-5528 Project: IMPALA Issue Type: Sub-task Components: Distributed Exec Affects Versions: Impala 2.10.0 Reporter: Henry Robinson Assignee: Henry Robinson Priority: Critical Our testing has revealed that under high concurrency (e.g. the {{many_independent_fragment_instances}} primitive), KRPC slows down execution significantly. This JIRA is to track the overall issue, and to link to JIRAs for specific spot fixes. This is the result of running {{perf}} on a node in a 16-node cluster, running the {{many_independent_fragment_instances}} primitive. {code} - 13.12% impalad impalad [.] tcmalloc::CentralFreeList::FetchFromOneSpans(int, void**, void**) - tcmalloc::CentralFreeList::FetchFromOneSpans(int, void**, void**) - 93.95% tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) - tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long) - 98.16% operator new[](unsigned long) 29.20% impala::RowDescriptor::RowDescriptor(impala::RowDescriptor const&) 16.85% kudu::rpc::Connection::QueueResponseForCall(gscoped_ptr >) 12.58% impala::DataStreamRecvr::SenderQueue::AddBatch(std::unique_ptr >&&) 7.42% kudu::rpc::OutboundTransfer::CreateForCallResponse(std::vector > const&, kudu::rpc::TransferCallbacks*) + 4.34% impala::Codec::CreateDecompressor(impala::MemPool*, bool, impala::THdfsCompression::type, boost::scoped_ptr*) 4.09% kudu::Trace::Trace() 3.79% std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator const&) + 3.59% kudu::rpc::InboundCall::InboundCall(kudu::rpc::Connection*) 2.66% void std::vector >::_M_emplace_back_aux(impala::MemPool::ChunkInfo&&) + 2.57% kudu::rpc::Connection::HandleIncomingCall(gscoped_ptr >) 2.04% std::vector >::reserve(unsigned long) 1.92% kudu::rpc::RequestHeader::MergePartialFromCodedStream(google::protobuf::io::CodedInputStream*) 
1.91% kudu::rpc::RemoteMethodPB::MergePartialFromCodedStream(google::protobuf::io::CodedInputStream*) 1.48% kudu::rpc::Connection::ReadHandler(ev::io&, int) 0.87% kudu::HeapBufferAllocator::AllocateInternal(unsigned long, unsigned long, kudu::BufferAllocator*) 0.79% kudu::faststring::GrowArray(unsigned long) 0.72% kudu::rpc::OutboundTransfer::CreateForCallRequest(int, std::vector > const&, kudu::rpc::TransferCallbacks*) 0.69% kudu::rpc::Connection::QueueOutboundCall(std::shared_ptr const&) 0.69% kudu::ArenaBase::ArenaBase(unsigned long, unsigned long) 0.68% void std::vector::Component, std::default_delete::Component> >, std::allocator::Component, std::default_delete::Component> > > >::_M_emplace_back_aux >&&) 21.66% kudu::rpc::Connection::QueueResponseForCall(gscoped_ptr >) 19.52% impala::TransmitDataResponsePb::~TransmitDataResponsePb() 15.30% kudu::rpc::InboundCall::~InboundCall() 5.69% kudu::rpc::QueueTransferTask::Run(kudu::rpc::ReactorThread*) 3.97% std::unordered_map, std::equal_to, std::allocator > >::mapped_type EraseKeyReturnValuePtr, std::equal_to, std::allocator > >::mapped_type EraseKeyReturnValuePtr >) - 22.12% impala::RowBatch::RowBatch(impala::RowDescriptor const&, impala::InboundProtoRowBatch const&, impala::MemTracker*) impala::DataStreamRecvr::SenderQueue::AddBatch(std::unique_ptr >&&) 20.73% impala::TransmitDataResponsePb::~TransmitDataResponsePb() 9.98% kudu::rpc::InboundCall::~InboundCall() 6.32% kudu::rpc::QueueTransferTask::Run(kudu::rpc::ReactorThread*) 4.20% std::unordered_map, std::equal_to, std::allocator > >::mapped_type EraseKeyReturnValuePtr, std::equal_to, std::allocator > >::mapped_type EraseKeyReturnValuePtr
[jira] [Created] (IMPALA-5526) Add krb5 to toolchain
Henry Robinson created IMPALA-5526: -- Summary: Add krb5 to toolchain Key: IMPALA-5526 URL: https://issues.apache.org/jira/browse/IMPALA-5526 Project: IMPALA Issue Type: Sub-task Components: Backend Affects Versions: Impala 2.10.0 Reporter: Henry Robinson Assignee: Henry Robinson KRPC adds a compile-time dependency on libkrb5's headers. To guarantee that they're available in all build environments, we should add krb5 (from http://web.mit.edu/kerberos/dist/index.html) to the toolchain. libkrb5.so should be dynamically linked by default, to avoid creating a binary that has statically linked security dependencies (this is an issue for us at Cloudera as a vendor, but also a general antipattern). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5518) Allow row batches to be recycled (rather than reallocated) across datastream recvr threads.
Henry Robinson created IMPALA-5518: -- Summary: Allow row batches to be recycled (rather than reallocated) across datastream recvr threads. Key: IMPALA-5518 URL: https://issues.apache.org/jira/browse/IMPALA-5518 Project: IMPALA Issue Type: Improvement Affects Versions: Impala 2.2 Reporter: Henry Robinson The {{DataStreamSender}} allocates row batches in whatever thread handles the {{TransmitData()}} RPC, but then deallocates them in the fragment instance thread. That is an anti-pattern for tcmalloc. Instead we should see if we can recycle the row batches where possible. We could try to 'pin' row batches to service threads, and give them each a thread-local ability to reallocate row batch data - the key is ensuring that the deallocations happen on the same thread, so we can't just give each sender a list of row batches because that sender may be handled by different service pool threads. Alternatively we can try to cut down on the number of allocations, but that's hard to do with cross-thread coordination. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5511) Add process start time to debug web page
Henry Robinson created IMPALA-5511: -- Summary: Add process start time to debug web page Key: IMPALA-5511 URL: https://issues.apache.org/jira/browse/IMPALA-5511 Project: IMPALA Issue Type: Improvement Affects Versions: Impala 2.9.0 Reporter: Henry Robinson Priority: Minor It's useful to know when a process last started - particularly if a monitoring tool restarts the process automatically. There's a metric in the impalad process, but neither the statestore nor the catalog server has it, and it's not displayed prominently. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5506) Help information of query_file option in impala-shell misses stdin description
[ https://issues.apache.org/jira/browse/IMPALA-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5506. Resolution: Fixed Assignee: David Xu Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/e2532a96c81ecfa2dc763306e96eb340fb49afe3 > Help information of query_file option in impala-shell misses stdin description > -- > > Key: IMPALA-5506 > URL: https://issues.apache.org/jira/browse/IMPALA-5506 > Project: IMPALA > Issue Type: Improvement > Components: Clients >Affects Versions: Impala 2.5.0 >Reporter: David Xu >Assignee: David Xu >Priority: Minor > Fix For: Impala 2.10.0 > > > The help information of the query_file option in impala-shell reads as > follows: > Execute the queries in the query file , delimited by ; > But the impala-shell code also supports stdin, indicated by -. I tested this > case and the results were correct. > We should add the stdin description to the help information of the query_file option > to guide users to this feature. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5495) Improve error message if neither --is_coordinator nor --is_executor is set
[ https://issues.apache.org/jira/browse/IMPALA-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5495. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/11ec9f1958482bbd5dc224f55e409a8ec907f066 > Improve error message if neither --is_coordinator nor --is_executor is set > -- > > Key: IMPALA-5495 > URL: https://issues.apache.org/jira/browse/IMPALA-5495 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.9.0 >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Trivial > Labels: bugbash-2017-05-31 > Fix For: Impala 2.10.0 > > > If neither {{is_coordinator}} nor {{is_executor}} are set, you get this > message {{Impala server needs to have a role (EXECUTOR, COORDINATOR)}} - > which isn't actionable. We should mention the flags at least. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5495) Improve error message if neither --is_coordinator nor --is_executor is set
Henry Robinson created IMPALA-5495: -- Summary: Improve error message if neither --is_coordinator nor --is_executor is set Key: IMPALA-5495 URL: https://issues.apache.org/jira/browse/IMPALA-5495 Project: IMPALA Issue Type: Improvement Components: Backend Affects Versions: Impala 2.9.0 Reporter: Henry Robinson Priority: Trivial If neither {{is_coordinator}} nor {{is_executor}} are set, you get this message {{Impala server needs to have a role (EXECUTOR, COORDINATOR)}} - which isn't actionable. We should mention the flags at least. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5493) Add Protobuf headers to Impala-lzo
Henry Robinson created IMPALA-5493: -- Summary: Add Protobuf headers to Impala-lzo Key: IMPALA-5493 URL: https://issues.apache.org/jira/browse/IMPALA-5493 Project: IMPALA Issue Type: Sub-task Components: Distributed Exec Reporter: Henry Robinson Assignee: Henry Robinson LZO now depends on Protobuf headers (transitively) - see: https://github.com/henryr/impala-lzo/commit/861e6b68011181257816c990465fca15250fcfa5 for a commit to include them. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5133) Concurrent TPC-DS queries get stuck and stop making progress, new queries
[ https://issues.apache.org/jira/browse/IMPALA-5133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5133. Resolution: Fixed Believe this to be caused by KUDU-2041. Will re-open if we see it again. > Concurrent TPC-DS queries get stuck and stop making progress, new queries > -- > > Key: IMPALA-5133 > URL: https://issues.apache.org/jira/browse/IMPALA-5133 > Project: IMPALA > Issue Type: Sub-task >Reporter: Mostafa Mokhtar >Assignee: Henry Robinson >Priority: Critical > Attachments: impalad.INFO.zip, stuck_queries_thread_dump_2.txt, > stuck_queries_thread_dump.txt, TPC-DS 2.zip, TPCDS-Concurrency-20Node.jmx > > > Concurrent queries against 10GB TPC-DS using 20 concurrent eventually get > stuck and don't make progress. > Attached impalad log and thread dump. > Jmeter JMX file used to run the workload is also attached. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5486) Port control-plane parts of ImpalaInternalService to KRPC
Henry Robinson created IMPALA-5486: -- Summary: Port control-plane parts of ImpalaInternalService to KRPC Key: IMPALA-5486 URL: https://issues.apache.org/jira/browse/IMPALA-5486 Project: IMPALA Issue Type: Sub-task Reporter: Henry Robinson -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-5485) Exchg / data-stream recv-side buffers should be tracked by query mem-tracker
Henry Robinson created IMPALA-5485: -- Summary: Exchg / data-stream recv-side buffers should be tracked by query mem-tracker Key: IMPALA-5485 URL: https://issues.apache.org/jira/browse/IMPALA-5485 Project: IMPALA Issue Type: Improvement Components: Distributed Exec Reporter: Henry Robinson Exchange nodes assign a fixed-size buffer to their datastream receivers that's used to smooth out differences in send / consume rates between the sender and the receiver. These buffers should be tracked by the query memtracker, and with the new min-reservation support we should allow them to be larger than the configured minimum. Increasing the buffer size decreases the amount of time that a sender can be blocked on a receiver, and so increases query-parallelism. Queries that shuffle a lot of data can see significant speedups from larger buffers. The buffers need to be sized based on the #receivers and the #rows * #avg row size. They can dynamically expand trivially - contraction is possible, but a bit harder. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (IMPALA-5480) Missing filters message isn't great
[ https://issues.apache.org/jira/browse/IMPALA-5480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5480. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/fa174fc962c174598ee41e558aff33698753d9f5 > Missing filters message isn't great > --- > > Key: IMPALA-5480 > URL: https://issues.apache.org/jira/browse/IMPALA-5480 > Project: IMPALA > Issue Type: Improvement >Affects Versions: Impala 2.7.0 >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Trivial > Fix For: Impala 2.10.0 > > > If a runtime filter doesn't arrive at a scan node, the message text is a bit > hard to read: > {{Only following filters arrived: , waited 10ms}} > Let's change it to make clear that 0 filters have arrived, and which ones > *have* shown up. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (IMPALA-5345) Under stress, some TransmitData() RPCs are not responded to
[ https://issues.apache.org/jira/browse/IMPALA-5345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5345. Resolution: Fixed Fix Version/s: Impala 2.10.0 Can't reproduce this on either a 20-node or 140-node cluster since I fixed KUDU-2041. My guess is that connection deadlock looked like a half-complete RPC failure. Have to run stress as well to confirm, but current indication is that this is fixed. > Under stress, some TransmitData() RPCs are not responded to > --- > > Key: IMPALA-5345 > URL: https://issues.apache.org/jira/browse/IMPALA-5345 > Project: IMPALA > Issue Type: Sub-task >Affects Versions: Impala 2.10.0 >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Critical > Fix For: Impala 2.10.0 > > > Under stress conditions on two separate clusters (one secure, one not), I've > seen some {{TransmitData()}} RPCs stay unresponded to forever, blocking the > query's completion. The RPCs are seen by the recipient, but are not in the > pending sender list. > Need to test further to see if this is related to the fix for IMPALA-5093 or > if a response is dropped on some path if a row batch is 'retried' from the > pending sender list. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (IMPALA-5481) RowBatches should share RowDescriptor where possible
Henry Robinson created IMPALA-5481: -- Summary: RowBatches should share RowDescriptor where possible Key: IMPALA-5481 URL: https://issues.apache.org/jira/browse/IMPALA-5481 Project: IMPALA Issue Type: Improvement Affects Versions: Impala 2.10.0 Reporter: Henry Robinson One of the {{RowBatch}} c'tors copies the row descriptor into the row batch. This leads to a lot of allocation churn since {{RowDescriptor}} contains some vector members, and since the descriptor is usually the same the copies are unnecessary. Instead, we should consider allocating the {{RowDescriptor}} once from an object pool, and sharing it amongst all row batches that need that descriptor. In some tests, {{RowDescriptor()}} shows up as 20% of the tcmalloc allocation time. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
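Sharing one descriptor among many row batches can be sketched like this. The types are hypothetical stand-ins for {{RowDescriptor}}, {{RowBatch}} and an object pool; the sketch assumes the pool outlives every batch that points into it.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical descriptor with a vector member, so copying it allocates.
struct RowDescSketch {
  std::vector<int> tuple_ids;
};

// Hypothetical pool that owns descriptors for the lifetime of a query.
class DescPoolSketch {
 public:
  const RowDescSketch* Add(RowDescSketch desc) {
    owned_.push_back(
        std::unique_ptr<RowDescSketch>(new RowDescSketch(std::move(desc))));
    return owned_.back().get();
  }

 private:
  std::vector<std::unique_ptr<RowDescSketch>> owned_;
};

// Hypothetical row batch that borrows the descriptor instead of copying it.
class RowBatchSketch {
 public:
  explicit RowBatchSketch(const RowDescSketch* desc) : desc_(desc) {}
  const RowDescSketch* desc() const { return desc_; }

 private:
  const RowDescSketch* desc_;  // Not owned; the pool must outlive the batch.
};
```

Constructing a batch now copies a single pointer, so the vector members of the descriptor are allocated once per descriptor rather than once per batch.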
[jira] [Created] (IMPALA-5480) Missing filters message isn't great
Henry Robinson created IMPALA-5480: -- Summary: Missing filters message isn't great Key: IMPALA-5480 URL: https://issues.apache.org/jira/browse/IMPALA-5480 Project: IMPALA Issue Type: Improvement Affects Versions: Impala 2.7.0 Reporter: Henry Robinson Assignee: Henry Robinson Priority: Trivial If a runtime filter doesn't arrive at a scan node, the message text is a bit hard to read: {{Only following filters arrived: , waited 10ms}} Let's change it to make clear that 0 filters have arrived, and which ones *have* shown up. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
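One possible shape for a clearer message is sketched below. The wording is hypothetical and is not the text introduced by the eventual fix; the `RF000`-style labels are an assumption based on how filter ids appear in query plans.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Builds a message that always states how many filters arrived,
// and lists their ids only when there is at least one.
std::string FilterArrivalMessage(const std::vector<int>& arrived_ids,
                                 std::size_t expected, int waited_ms) {
  std::ostringstream out;
  out << arrived_ids.size() << " of " << expected
      << " runtime filters arrived";
  if (!arrived_ids.empty()) {
    out << " (";
    for (std::size_t i = 0; i < arrived_ids.size(); ++i) {
      if (i > 0) out << ", ";
      // Zero-pad to three digits, e.g. 0 becomes RF000.
      out << "RF" << std::setfill('0') << std::setw(3) << arrived_ids[i];
    }
    out << ")";
  }
  out << ", waited " << waited_ms << "ms";
  return out.str();
}
```

When nothing arrives, the message starts with an explicit "0 of N" instead of an empty list followed by a comma.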
[jira] [Resolved] (IMPALA-4892) Include the session ID in the "Invalid session ID" error message
[ https://issues.apache.org/jira/browse/IMPALA-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-4892. Resolution: Fixed Fixed in https://github.com/apache/incubator-impala/commit/eea4ad7caa8cf6ab7cea125e9564392d63ea2c27. Thanks for the contribution [~sjc362000]! > Include the session ID in the "Invalid session ID" error message > > > Key: IMPALA-4892 > URL: https://issues.apache.org/jira/browse/IMPALA-4892 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.9.0 >Reporter: Henry Robinson >Assignee: Stephen Carlin > Labels: newbie > > When {{GetSessionState()}} can't find the session, the error message should > include the ID that was wrong. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (IMPALA-5435) test_basic_filters failed on ASAN
[ https://issues.apache.org/jira/browse/IMPALA-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5435. Resolution: Fixed Fix Version/s: Impala 2.10.0 Fixed (by increasing timeouts) in https://github.com/apache/incubator-impala/commit/1886da45e87209c0d625554462d68e2b44bb > test_basic_filters failed on ASAN > - > > Key: IMPALA-5435 > URL: https://issues.apache.org/jira/browse/IMPALA-5435 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.10.0 >Reporter: Thomas Tauber-Marshall >Assignee: Henry Robinson >Priority: Critical > Labels: broken-build > Fix For: Impala 2.10.0 > > > Seen in an ASAN Jenkins build: > {noformat} > 02:45:29 TestRuntimeFilters.test_basic_filters[exec_option: > {'disable_codegen': False, 'abort_on_error': 1, > 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | > table_format: rc/snap/block] > 02:45:29 [gw3] linux2 -- Python 2.6.6 > /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/../infra/python/env/bin/python > 02:45:29 query_test/test_runtime_filters.py:39: in test_basic_filters > 02:45:29 self.run_test_case('QueryTest/runtime_filters', vector) > 02:45:29 common/impala_test_suite.py:430: in run_test_case > 02:45:29 verify_runtime_profile(test_section['RUNTIME_PROFILE'], > result.runtime_profile) > 02:45:29 common/test_result_verifier.py:560: in verify_runtime_profile > 02:45:29 actual)) > 02:45:29 E AssertionError: Did not find matches for lines in runtime > profile: > 02:45:29 E EXPECTED LINES: > 02:45:29 E row_regex: .*Files rejected: 7 .* > 02:45:29 E > 02:45:29 E ACTUAL PROFILE: > 02:45:29 E Query (id=364393521d6edaa6:82f92a03): > 02:45:29 E DEBUG MODE WARNING: Query profile created while running a > DEBUG build of Impala. Use RELEASE builds to measure query performance. 
> 02:45:29 E Summary: > 02:45:29 E Session ID: be475affeee5db0d:e52699dc51ac26ae > 02:45:29 E Session Type: BEESWAX > 02:45:29 E Start Time: 2017-06-05 00:31:12.430322000 > 02:45:29 E End Time: > 02:45:29 E Query Type: QUERY > 02:45:29 E Query State: FINISHED > 02:45:29 E Query Status: OK > 02:45:29 E Impala Version: impalad version 2.9.0-SNAPSHOT DEBUG (build > cde19ab8c7801436070ce0438e28d5042265dfd1) > 02:45:29 E User: jenkins > 02:45:29 E Connected User: jenkins > 02:45:29 E Delegated User: > 02:45:29 E Network Address: 127.0.0.1:40832 > 02:45:29 E Default Db: functional_rc_snap > 02:45:29 E Sql Statement: with t1 as (select month x, bigint_col y from > alltypes limit 7300), > 02:45:29 Et2 as (select int_col x, bigint_col y from alltypestiny > limit 2) > 02:45:29 Eselect count(*) from t1, t2 where t1.x = t2.x > 02:45:29 E Coordinator: > impala-boost-static-burst-slave-1fc7.vpc.cloudera.com:22000 > 02:45:29 E Query Options (non default): > ABORT_ON_ERROR=1,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,RUNTIME_FILTER_WAIT_TIME_MS=15000 > 02:45:29 E Plan: > 02:45:29 E > 02:45:29 E Per-Host Resource Reservation: Memory=136.00MB > 02:45:29 E Per-Host Resource Estimates: Memory=138.00MB > 02:45:29 E WARNING: The following tables are missing relevant table and/or > column statistics. 
> 02:45:29 E functional_rc_snap.alltypes, functional_rc_snap.alltypestiny > 02:45:29 E > 02:45:29 E F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 > 02:45:29 E PLAN-ROOT SINK > 02:45:29 E | mem-estimate=0B mem-reservation=0B > 02:45:29 E | > 02:45:29 E 03:AGGREGATE [FINALIZE] > 02:45:29 E | output: count(*) > 02:45:29 E | mem-estimate=10.00MB mem-reservation=0B > 02:45:29 E | tuple-ids=4 row-size=8B cardinality=1 > 02:45:29 E | > 02:45:29 E 02:HASH JOIN [INNER JOIN, BROADCAST] > 02:45:29 E | hash predicates: month = int_col > 02:45:29 E | runtime filters: RF000 <- int_col > 02:45:29 E | mem-estimate=9B mem-reservation=136.00MB > 02:45:29 E | tuple-ids=0,2 row-size=8B cardinality=7300 > 02:45:29 E | > 02:45:29 E |--06:EXCHANGE [UNPARTITIONED] > 02:45:29 E | | mem-estimate=0B mem-reservation=0B > 02:45:29 E | | tuple-ids=2 row-size=4B cardinality=2 > 02:45:29 E | | > 02:45:29 E | F03:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 > 02:45:29 E | 05:EXCHANGE [UNPARTITIONED] > 02:45:29 E | | limit: 2 > 02:45:29 E | | mem-estimate=0B mem-reservation=0B > 02:45:29 E | | tuple-ids=2 row-size=4B cardinality=2 > 02:45:29 E | | > 02:45:29 E | F02:PLAN FRAGMENT [RANDOM] hosts=3 instances=3 > 02:45:29 E | 01:SCAN HDFS [functional_rc_snap.alltypestiny, RANDOM] > 02:45:29 E | partitions=4/4 files=4 size=1.38KB > 02:45:29
[jira] [Resolved] (IMPALA-5454) JVM metrics don't show up on /memz sometimes
[ https://issues.apache.org/jira/browse/IMPALA-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5454. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/88f1f86b3d2f9ed23a94e285bad6bf09f8c80c93 > JVM metrics don't show up on /memz sometimes > > > Key: IMPALA-5454 > URL: https://issues.apache.org/jira/browse/IMPALA-5454 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 2.9.0 >Reporter: Henry Robinson >Assignee: Henry Robinson > Fix For: Impala 2.10.0 > > > Due to a bug in {{mustache.cc}}, {{memz.tmpl}} template rendering fails if > the {{buffer_pool}} JSON entry is not generated. > The bug is that nested template commands with the same key aren't correctly > parsed: > {code} > {{?b}} {{#b}} {{/b}} {{/b}} > {code} > If '{{b}}' is not present, the parser tries to skip to the closing > {code}{{/b}}{code}, but does not take into account nesting and matches the > first closing {code}{{/b}}{code}, not the second. Parsing then fails. > This JIRA is to track the workaround - rewriting the templates not to use > nesting. The [upstream project|https://github.com/HenryR/cpp-mustache] can > fix the underlying issue independently. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
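The nesting problem described above can be sketched in a few lines. This is only an illustration of depth counting when skipping to a closing tag, not the cpp-mustache implementation; the function name {{find_matching_close}} and the regex-based scan are assumptions made for the example, and only the {{#}} and {{?}} openers mentioned in the report are handled.

```python
import re


def find_matching_close(template: str, key: str, start: int) -> int:
    """Return the index just past the {{/key}} that closes a section.

    `start` is the offset immediately after the section's opening tag.
    Hypothetical helper: it illustrates nesting-aware skipping only.
    """
    # Match openers ({{#key}}, {{?key}}) and the closer ({{/key}}) for this key.
    tag = re.compile(r"\{\{([#?/])\s*" + re.escape(key) + r"\s*\}\}")
    depth = 1  # we are already inside one open section
    for match in tag.finditer(template, start):
        if match.group(1) == "/":
            depth -= 1
            if depth == 0:
                return match.end()
        else:
            # A nested section with the same key: its closer must be skipped.
            depth += 1
    raise ValueError("unclosed section for key %r" % key)


template = "{{?b}} {{#b}} {{/b}} {{/b}} tail"
end = find_matching_close(template, "b", len("{{?b}}"))
remainder = template[end:]
```

A parser that stops at the first {{/b}} it sees would resume inside the outer section; counting depth makes it resume after the second one.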
[jira] [Created] (IMPALA-5473) Make diagnosing network issues easier
Henry Robinson created IMPALA-5473: -- Summary: Make diagnosing network issues easier Key: IMPALA-5473 URL: https://issues.apache.org/jira/browse/IMPALA-5473 Project: IMPALA Issue Type: Improvement Affects Versions: Impala 2.10.0 Reporter: Henry Robinson With our current metrics in the profile, it's hard to debug queries that get slow throughput from their exchanges. The following cases have different causes, but similar symptoms (e.g. a high {{InactiveTimer}} in the xchg profile): 1. Downstream sender does not produce rows quickly (perhaps because *its* child instances do not produce rows quickly). 2. Downstream sender cannot _send_ rows quickly, perhaps because of network congestion. 3. Downstream sender does not start producing rows until some time after the upstream has started (captured by {{FirstBatchArrivalWaitTime}}). 4. Downstream sender does not close the stream until some time after all rows are sent. We should try to improve these metrics so that all the information about who is slow, and why, is clearly available in the runtime profile. Distinguishing cases 1 and 2 is particularly important. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (IMPALA-5056) Impala fails to recover from statestore connection loss while waiting for metadata
[ https://issues.apache.org/jira/browse/IMPALA-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5056. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/54118010590d8605303715bf570cac18e1d5e64e > Impala fails to recover from statestore connection loss while waiting for > metadata > -- > > Key: IMPALA-5056 > URL: https://issues.apache.org/jira/browse/IMPALA-5056 > Project: IMPALA > Issue Type: Bug > Components: Catalog, Frontend >Affects Versions: Impala 2.8.0 >Reporter: Balazs Jeszenszky >Assignee: Henry Robinson >Priority: Critical > Labels: supportability, usability > Fix For: Impala 2.10.0 > > > The following sequence: > {code:java} > describe t1; > shut down statestore > invalidate metadata t1; > ***describe t1; > start ST > {code} > makes the marked query hang indefinitely. New queries will work, but this > query will be stuck in the planning phase. Trying to cancel the query or open > the details will bring down the web UI since it will wait for the lock the > query is holding. The only way to kill off these queries is a restart. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (IMPALA-5135) KRPC : ReportExecStatus RPC can timeout when deserializing large query profiles due to tcmalloc contention
[ https://issues.apache.org/jira/browse/IMPALA-5135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5135. Resolution: Not A Bug {{ReportExecStatus}} doesn't use KRPC any more. When we port those RPCs, we'll revisit performance but the locking will have changed a lot. > KRPC : ReportExecStatus RPC can timeout when deserializing large query > profiles due to tcmalloc contention > -- > > Key: IMPALA-5135 > URL: https://issues.apache.org/jira/browse/IMPALA-5135 > Project: IMPALA > Issue Type: Sub-task >Reporter: Mostafa Mokhtar >Assignee: Henry Robinson >Priority: Critical > Attachments: bottomup.txt, KRPC Performance results - Concurrent > TPC-DS Q17 CDH5.12 Vs. KRPC.pdf, top-down.txt.zip > > > Queries with a larger number of fragments can fail with {code}Timed out: > ReportExecStatus RPC to 10.17.187.36:22000 timed out after 10.000s > (SENT){code} > Vtune shows that while deserializing the query profile the thread can get > stuck in tcmalloc > {code} > impalad ! tcmalloc::CentralFreeList::FetchFromOneSpans - [unknown source file] > impalad ! tcmalloc::CentralFreeList::RemoveRange + 0xc0 - [unknown source > file] > impalad ! tcmalloc::ThreadCache::FetchFromCentralCache + 0x62 - [unknown > source file] > impalad ! operator new + 0x297 - [unknown source file] > impalad ! __gnu_cxx::new_allocator::allocate + 0x4 - > new_allocator.h:104 > impalad ! std::vector std::allocator>::resize + 0x7f3 - stl_vector.h:676 > impalad ! impala::TRuntimeProfileNode::read + 0xe5a - > RuntimeProfile_types.cpp:601 > impalad ! impala::TRuntimeProfileTree::read + 0x8ac - > RuntimeProfile_types.cpp:982 > impalad ! impala::TReportExecStatusParams::read + 0x156 - > ImpalaInternalService_types.cpp:2956 > impalad ! impala::DeserializeThriftMsg + > 0xe5 - thrift-util.h:145 > impalad ! impala::DeserializeFromSidecar + > 0xb7 - rpc.h:407 > impalad ! impala::ExecControlService::ReportExecStatus + 0x21e - > impala-internal-service.cc:148 > impalad ! 
std::function google::protobuf::Message*, kudu::rpc::RpcContext*)>::operator() + 0x1c - > functional:2439 > impalad ! kudu::rpc::GeneratedServiceIf::Handle + 0x188 - service_if.cc:134 > impalad ! impala::ImpalaServicePool::RunThread + 0x241 - > impala-service-pool.cc:130 > impalad ! boost::function0::operator() + 0x1a - > function_template.hpp:767 > impalad ! impala::Thread::SuperviseThread + 0x20e - thread.cc:325 > impalad ! operator()&, const > std::basic_string&, boost::function, impala::Promise int>*), boost::_bi::list0> + 0x5a - bind.hpp:457 > impalad ! boost::_bi::bind_t const&, boost::function, impala::Promise*), > boost::_bi::list4, > boost::_bi::value, boost::_bi::value (void)>>, boost::_bi::value*>>>::operator() - > bind_template.hpp:20 > impalad ! boost::detail::thread_data (*)(std::string const&, std::string const&, boost::function, > impala::Promise*), boost::_bi::list4, > boost::_bi::value, boost::_bi::value (void)>>, boost::_bi::value*::run + 0x19 - > thread.hpp:116 > impalad ! thread_proxy + 0xd9 - [unknown source file] > libpthread.so.0 ! start_thread + 0xd0 - [unknown source file] > libc.so.6 ! clone + 0x6c - [unknown source file] > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (IMPALA-5134) KRPC : Query with 2K fragments on un-secure 16 node cluster failed with ReportExecStatus RPC to 10.20.122.112:22000 timed out after 10.000s (ON_OUTBOUND_QUEUE)
[ https://issues.apache.org/jira/browse/IMPALA-5134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5134. Resolution: Not A Bug {{ReportExecStatus}} doesn't use KRPC any more. When we port those RPCs, we'll revisit performance but the locking will have changed a lot. > KRPC : Query with 2K fragments on un-secure 16 node cluster failed with > ReportExecStatus RPC to 10.20.122.112:22000 timed out after 10.000s > (ON_OUTBOUND_QUEUE) > --- > > Key: IMPALA-5134 > URL: https://issues.apache.org/jira/browse/IMPALA-5134 > Project: IMPALA > Issue Type: Sub-task >Reporter: Mostafa Mokhtar >Assignee: Henry Robinson > > The error message varies from run to run: > EndDataStream RPC to 10.17.193.18:22000 timed out after 10.000s (SENT) > EndDataStream RPC to 10.17.193.18:22000 timed out after 10.000s > (ON_OUTBOUND_QUEUE) > ExecPlanFragment RPC to 10.20.122.112:22000 timed out after 120.000s (SENT) > Captured Vtune data while the query was running and noticed that the RPC can > spend a significant amount of time in tcmalloc, which eventually spins in the > kernel; this behavior can lead to unexpected RPC timeouts. > {code} > CPU Time > 2 of 101: 9.5% (2.435s of 25.740s) > libc.so.6 ! madvise - [unknown source file] > impalad ! TCMalloc_SystemRelease + 0x79 - [unknown source file] > impalad ! tcmalloc::PageHeap::DecommitSpan + 0x20 - [unknown source file] > impalad ! tcmalloc::PageHeap::MergeIntoFreeList + 0x212 - [unknown source > file] > impalad ! tcmalloc::PageHeap::Delete + 0x23 - [unknown source file] > impalad ! operator delete + 0x123 - [unknown source file] > impalad ! ~faststring + 0x15 - faststring.h:54 > impalad ! ~InboundTransfer - transfer.h:65 > impalad ! kudu::DefaultDeleter::operator() - > gscoped_ptr.h:145 > impalad ! ~gscoped_ptr_impl + 0x9 - gscoped_ptr.h:228 > impalad ! ~gscoped_ptr - gscoped_ptr.h:318 > impalad ! kudu::rpc::InboundCall::~InboundCall + 0xe7 - inbound_call.cc:51 > impalad ! 
kudu::DefaultDeleter::operator() + 0x7 - > gscoped_ptr.h:145 > impalad ! ~gscoped_ptr_impl + 0x9 - gscoped_ptr.h:228 > impalad ! ~gscoped_ptr - gscoped_ptr.h:318 > impalad ! ~ResponseTransferCallbacks + 0x30 - connection.cc:368 > impalad ! ~ResponseTransferCallbacks - connection.cc:373 > impalad ! kudu::rpc::ResponseTransferCallbacks::NotifyTransferFinished + 0x1e > - connection.cc:376 > impalad ! kudu::rpc::OutboundTransfer::SendBuffer + 0x1b9 - transfer.cc:221 > impalad ! kudu::rpc::Connection::WriteHandler + 0x156 - connection.cc:596 > impalad ! ev_invoke_pending + 0x52 - [unknown source file] > impalad ! ev_run + 0x9c3 - [unknown source file] > impalad ! ev::loop_ref::run + 0x12 - ev++.h:211 > impalad ! kudu::rpc::ReactorThread::RunThread + 0x3 - reactor.cc:316 > impalad ! boost::function0::operator() + 0x1a - > function_template.hpp:767 > impalad ! kudu::Thread::SuperviseThread + 0x1ee - thread.cc:590 > libpthread.so.0 ! start_thread + 0xd0 - [unknown source file] > libc.so.6 ! clone + 0x6c - [unknown source file] > {code} > Query > {code} > select /* +straight_join */ count(*),a.c_nationkey, max(b.c_comment) from > customer A join /* +shuffle */ customer B on A.c_custkey = B.c_custkey join > /* +shuffle */ customer C on c.c_custkey = B.c_custkey join /* +shuffle */ > customer D on d.c_custkey = B.c_custkey join /* +shuffle */ customer E on > e.c_custkey = B.c_custkey join /* +shuffle */ customer F on f.c_custkey = > B.c_custkey join /* +shuffle */ customer G on g.c_custkey = B.c_custkey > join /* +shuffle */ customer H on h.c_custkey = B.c_custkey join /* > +shuffle */ customer I on i.c_custkey = B.c_custkey join /* +shuffle */ > customer J on j.c_custkey = B.c_custkey join /* +shuffle */ customer K on > k.c_custkey = B.c_custkey join /* +shuffle */ customer L on l.c_custkey = > B.c_custkey join /* +shuffle */ customer M on m.c_custkey = B.c_custkey > join /* +shuffle */ customer N on n.c_custkey = B.c_custkey join /* > +shuffle */ customer O on o.c_custkey = 
B.c_custkey join /* +shuffle */ > customer P on p.c_custkey = B.c_custkey join /* +shuffle */ customer R on > R.c_custkey = B.c_custkey join /* +shuffle */ customer S on S.c_custkey = > B.c_custkey join /* +shuffle */ customer T on T.c_custkey = B.c_custkey > join /* +shuffle */ customer U on U.c_custkey = B.c_custkey join /* > +shuffle */ customer V on V.c_custkey = B.c_custkey join /* +shuffle */ > customer W on W.c_custkey = B.c_custkey join /* +shuffle */ customer X on > X.c_custkey = B.c_custkey join /* +shuffle */ customer Y on Y.c_custkey = > B.c_custkey join /* +shuffle */ customer Z on Z.c_custkey = B.c_custkey > join /* +shuffle */ customer
[jira] [Created] (IMPALA-5454) JVM metrics don't show up on /memz sometimes
Henry Robinson created IMPALA-5454: -- Summary: JVM metrics don't show up on /memz sometimes Key: IMPALA-5454 URL: https://issues.apache.org/jira/browse/IMPALA-5454 Project: IMPALA Issue Type: Bug Affects Versions: Impala 2.9.0 Reporter: Henry Robinson Assignee: Henry Robinson Due to a bug in {{mustache.cc}}, {{memz.tmpl}} template rendering fails if the {{buffer_pool}} JSON entry is not generated. The bug is that nested template commands with the same key aren't correctly parsed: {code} {{?b}} {{#b}} {{/b}} {{/b}} {code} If '{{b}}' is not present, the parser tries to skip to the closing {code}{{/b}}{code}, but does not take into account nesting and matches the first closing {code}{{/b}}{code}, not the second. Parsing then fails. This JIRA is to track the workaround - rewriting the templates not to use nesting. The [upstream project|https://github.com/HenryR/cpp-mustache] can fix the underlying issue independently. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (IMPALA-5450) Impala web UI /varz?raw returns HTML content
[ https://issues.apache.org/jira/browse/IMPALA-5450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5450. Resolution: Not A Bug This is working as intended - the {{raw}} argument really just serves to change the content-type. I think this did change around Impala 2.7 to make the behaviour more consistent, and several web pages were changed to use templates ({{/varz}} amongst them). > Impala web UI /varz?raw returns HTML content > > > Key: IMPALA-5450 > URL: https://issues.apache.org/jira/browse/IMPALA-5450 > Project: IMPALA > Issue Type: Bug >Reporter: Jim Halfpenny >Priority: Minor > > In previous versions of Impala, HTTP requests to http://impalad:25000/varz?raw > returned the command line options used for impalad. On 2.7 it instead returns > the same HTML as /varz, but with Content-Type set to text/plain. It's > possible the command line options were removed for security reasons. If so, > returning a blank document would be preferable to returning the normal page > HTML. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (IMPALA-5377) IMPALAD Crashed With the impala starting large number of JDBC accessing
[ https://issues.apache.org/jira/browse/IMPALA-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5377. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/f3fdd4d4a8025bab1d2babe9772252f1703a60ee > IMPALAD Crashed With the impala starting large number of JDBC accessing > -- > > Key: IMPALA-5377 > URL: https://issues.apache.org/jira/browse/IMPALA-5377 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.9.0 > Environment: Apache impala branch > eb54287fb4c635c8fc6c96872e87ad5a98b16339 >Reporter: yyzzjj >Assignee: Henry Robinson >Priority: Critical > Fix For: Impala 2.10.0 > > > Judging from the symptoms, a query is being executed before > ExecEnv::StartServices(), which initializes mem_tracker_, has run: > (gdb) bt > #0 impala::MemTracker::CheckLimitExceeded (this=0x0) at > /export/ldb/online/impala_master/be/src/runtime/mem-tracker.h:331 > #1 impala::MemTracker::LimitExceeded (this=0x0) at > /export/ldb/online/impala_master/be/src/runtime/mem-tracker.h:234 > #2 impala::QueryState::Init (this=this@entry=0x973e400, rpc_params=...) at > /export/ldb/online/impala_master/be/src/runtime/query-state.cc:98 > #3 0x00cf550a in impala::QueryExecMgr::StartQuery (this=0xb137aa0, > params=...) at > /export/ldb/online/impala_master/be/src/runtime/query-exec-mgr.cc:51 > #4 0x00d7e020 in impala::ImpalaInternalService::ExecQueryFInstances > (this=0xa07bf00, return_val=..., params=...) 
at > /export/ldb/online/impala_master/be/src/service/impala-internal-service.cc:50 > #5 0x00fcbfb4 in > impala::ImpalaInternalServiceProcessor::process_ExecQueryFInstances > (this=0xbced020, seqid=1, iprot=, oprot=0x96c8e40, > connectionContext=) > at > /export/ldb/online/impala_master/be/generated-sources/gen-cpp/ImpalaInternalService.cpp:1433 > #6 0x00fcb326 in > impala::ImpalaInternalServiceProcessor::dispatchCall (this=0xbced020, > iprot=0x96c8e70, oprot=0x96c8e40, fname=..., seqid=1, > connectionContext=0xbf138d0) > at > /export/ldb/online/impala_master/be/generated-sources/gen-cpp/ImpalaInternalService.cpp:1403 > #7 0x008c52cc in apache::thrift::TDispatchProcessor::process > (this=0xbced020, in=..., out=..., connectionContext=0xbf138d0) > at > /export/ldb/online/impala_master/thirdparty/fbthrift-2016.12.19.00/build/include/thrift/lib/cpp/TDispatchProcessor.h:124 > #8 0x7f05f6c1901f in apache::thrift::server::TThreadedServer::Task::run > (this=0xbf13880) at server/TThreadedServer.cpp:65 > #9 0x7f05f6c25594 in > apache::thrift::concurrency::PthreadThread::threadMain (arg=) > at concurrency/PosixThreadFactory.cpp:194 > #10 0x003cd3c079d1 in start_thread () from /lib64/libpthread.so.0 > #11 0x003cd34e886d in clone () from /lib64/libc.so.6 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (IMPALA-5433) Mark Status c'tors as explicit
[ https://issues.apache.org/jira/browse/IMPALA-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5433. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/f0065d376f9a0cfbefc20164c87137df56363166 > Mark Status c'tors as explicit > -- > > Key: IMPALA-5433 > URL: https://issues.apache.org/jira/browse/IMPALA-5433 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Minor > Fix For: Impala 2.10.0 > > > {{Status}} has lots of constructors. Marking them as explicit will help avoid > unexpected programming errors. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (IMPALA-5441) Send larger row batches over the wire
Henry Robinson created IMPALA-5441: -- Summary: Send larger row batches over the wire Key: IMPALA-5441 URL: https://issues.apache.org/jira/browse/IMPALA-5441 Project: IMPALA Issue Type: Improvement Affects Versions: Impala 2.10.0 Reporter: Henry Robinson Our on-the-wire row batch size is the same as the in-memory size (1024 rows by default). It might make sense to increase the wire-size to reduce the RPC-per-row overhead, and decrease context-switching in the receiver. KRPC makes it quite natural to do that: each row batch can be serialized as a sidecar in memory-size batches. The receiver can then read each batch in turn as though it were sent individually, without any need to stitch together (or split up) serialized batches. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
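As a rough illustration of the batching idea above: several in-memory-sized row batches travel as separate sidecars of one RPC, and the receiver reads each sidecar as though it had arrived on its own. This is a sketch only, not Impala's or KRPC's API; {{pack_for_wire}}, {{unpack_on_receive}} and {{batches_per_rpc}} are hypothetical names, and each batch is modelled as an opaque byte string.

```python
from typing import Iterable, Iterator, List


def pack_for_wire(batches: Iterable[bytes], batches_per_rpc: int) -> Iterator[List[bytes]]:
    """Group serialized row batches into RPC payloads.

    Each payload carries up to `batches_per_rpc` batches, one per sidecar,
    so no batch is ever stitched together or split up.
    """
    if batches_per_rpc < 1:
        raise ValueError("batches_per_rpc must be at least 1")
    payload: List[bytes] = []
    for batch in batches:
        payload.append(batch)
        if len(payload) == batches_per_rpc:
            yield payload
            payload = []
    if payload:
        # Final, possibly smaller, payload.
        yield payload


def unpack_on_receive(payloads: Iterable[List[bytes]]) -> Iterator[bytes]:
    """Yield each sidecar in turn, as if it had been sent individually."""
    for payload in payloads:
        for sidecar in payload:
            yield sidecar


batches = [b"b0", b"b1", b"b2", b"b3", b"b4"]
rpcs = list(pack_for_wire(batches, 2))
received = list(unpack_on_receive(rpcs))
```

With five batches and two sidecars per RPC, the sender issues three RPCs instead of five while the receiver still sees five unchanged batches.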
[jira] [Created] (IMPALA-5433) Mark Status c'tors as explicit
Henry Robinson created IMPALA-5433: -- Summary: Mark Status c'tors as explicit Key: IMPALA-5433 URL: https://issues.apache.org/jira/browse/IMPALA-5433 Project: IMPALA Issue Type: Improvement Components: Backend Reporter: Henry Robinson Assignee: Henry Robinson Priority: Minor {{Status}} has lots of constructors. Marking them as explicit will help avoid unexpected programming errors. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (IMPALA-5350) Build threads should include fragment ID in their names
[ https://issues.apache.org/jira/browse/IMPALA-5350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5350. Resolution: Fixed Fix Version/s: Impala 2.10.0 https://github.com/apache/incubator-impala/commit/9caea9bfad025274762642a03cb5483625d86a09 > Build threads should include fragment ID in their names > --- > > Key: IMPALA-5350 > URL: https://issues.apache.org/jira/browse/IMPALA-5350 > Project: IMPALA > Issue Type: Improvement >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Minor > Fix For: Impala 2.10.0 > > > Just like fragment executor threads do, the build threads should include the > fragment instance ID in their name so that it's easy to map entries in > {{/threadz}} back onto the query and fragment they belong to without a > debugger. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (IMPALA-5364) Number of fragments reported in the web-ui is incorrect
[ https://issues.apache.org/jira/browse/IMPALA-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5364. Resolution: Fixed Fix Version/s: Impala 2.9.0 https://github.com/apache/incubator-impala/commit/64e8538ab2a45794821fcf7b84160fdc334e6505 > Number of fragments reported in the web-ui is incorrect > > > Key: IMPALA-5364 > URL: https://issues.apache.org/jira/browse/IMPALA-5364 > Project: IMPALA > Issue Type: Bug > Components: Distributed Exec >Affects Versions: Impala 2.9.0 >Reporter: Mostafa Mokhtar >Assignee: Henry Robinson >Priority: Minor > Labels: supportability, web-ui > Fix For: Impala 2.9.0 > > Attachments: Screen Shot 2017-05-24 at 5.07.33 PM.png > > > Number of fragments per backend reported in the web-ui is incorrect. > It appears to be displaying the number of queries instead; during that > snapshot there were 1K+ fragments running on the cluster. > {code} > Query Locations > Location Number of Fragments > s-11.foo.com:22000 15 > s-12.foo.com:22000 15 > s-10.foo.com:22000 15 > s-14.foo.com:22000 15 > s-16.foo.com:22000 15 > s-13.foo.com:22000 15 > s-09.foo.com:22000 15 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (IMPALA-5405) Catalog will not send full update of catalog topic when statestore restarts
Henry Robinson created IMPALA-5405: -- Summary: Catalog will not send full update of catalog topic when statestore restarts Key: IMPALA-5405 URL: https://issues.apache.org/jira/browse/IMPALA-5405 Project: IMPALA Issue Type: Bug Components: Catalog Reporter: Henry Robinson If: * No DDL operations have happened since the last cluster restart * The statestore is restarted then the catalog will not re-publish its metadata topic. Any new Impala daemons won't get updates, and won't be able to accept queries. For a minimal repro, start a cluster. Wait for metadata to be loaded (i.e. you can run a query), and then restart the statestore. After 30s or so, check {{/topics}} on the statestore's UI - the {{catalog-update}} topic will exist, but will have 0 entries. The bug appears to be in [this code|https://github.com/apache/incubator-impala/blob/master/be/src/catalog/catalog-server.cc#L230] in the catalog: {code} if (delta.from_version == 0 && delta.to_version == 0 && catalog_objects_min_version_ != 0) { catalog_topic_entry_keys_.clear(); last_sent_catalog_version_ = 0L; } else { // .. publish intermediate update } {code} When the statestore restarts and sends the first topic update for the catalog topic, {{catalog_objects_min_version_}} may be {{0}}, so the first branch, which is for publishing the complete metadata topic, is not taken. If any DDL operations have happened on the cluster, {{catalog_objects_min_version_}} becomes non-zero, and the bug is no longer hit. *Workaround* Either a) trigger metadata publication by running {{INVALIDATE METADATA}} or {{REFRESH }}, or b) restart {{catalogd}}. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
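To make the failing case concrete, the condition quoted above can be modelled as a small predicate. This is only a model of the quoted {{if}} statement for illustration, not catalog-server code; the function name {{takes_full_update_branch}} is hypothetical and its arguments stand in for {{delta.from_version}}, {{delta.to_version}} and {{catalog_objects_min_version_}}.

```python
def takes_full_update_branch(from_version: int, to_version: int,
                             catalog_objects_min_version: int) -> bool:
    """Mirror the quoted condition: True when the 'publish everything' branch runs."""
    return (from_version == 0 and to_version == 0
            and catalog_objects_min_version != 0)


# First topic update after a statestore restart, with no DDL since cluster start:
# the minimum version is still 0, so the full-update branch is skipped.
no_ddl_since_start = takes_full_update_branch(0, 0, 0)

# Same first update after some DDL has run: the minimum version is non-zero,
# so the full-update branch is taken.
ddl_has_run = takes_full_update_branch(0, 0, 7)
```

The value 7 is arbitrary; any non-zero minimum version gives the same result.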
[jira] [Resolved] (IMPALA-5367) PlannerTest failure when upgrading from Ubuntu 14.04 to 16.04
[ https://issues.apache.org/jira/browse/IMPALA-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5367. Resolution: Duplicate This is probably a duplicate of IMPALA-5358. Feel free to reopen if not addressed by the patch already in flight for that jira. > PlannerTest failure when upgrading from Ubuntu 14.04 to 16.04 > - > > Key: IMPALA-5367 > URL: https://issues.apache.org/jira/browse/IMPALA-5367 > Project: IMPALA > Issue Type: Sub-task > Components: Frontend >Reporter: Jim Apple >Assignee: Alexander Behm > > The following test fails when upgrading from Ubuntu 14.04 to 16.04, even with > a fresh data load: > {noformat} > testTableSample(org.apache.impala.planner.PlannerTest) Time elapsed: 0.055 > sec <<< FAILURE! > java.lang.AssertionError: > Section PLAN of query: > select * from functional.alltypes tablesample system(50) repeatable(1234) > where year = 2009 > Actual does not match expected result: > F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 > PLAN-ROOT SINK > | mem-estimate=0B mem-reservation=0B > | > 00:SCAN HDFS [functional.alltypes] >partitions=7/24 files=7 size=138.28KB > >table stats: 7300 rows total >column stats: all >mem-estimate=48.00MB mem-reservation=0B >tuple-ids=0 row-size=97B cardinality=1825 > Expected: > F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 > PLAN-ROOT SINK > | mem-estimate=0B mem-reservation=0B > | > 00:SCAN HDFS [functional.alltypes] >partitions=6/24 files=6 size=119.04KB >table stats: 7300 rows total >column stats: all >mem-estimate=48.00MB mem-reservation=0B >tuple-ids=0 row-size=97B cardinality=1825 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (IMPALA-5358) Off-by-one error in testTableSample
Henry Robinson created IMPALA-5358: -- Summary: Off-by-one error in testTableSample Key: IMPALA-5358 URL: https://issues.apache.org/jira/browse/IMPALA-5358 Project: IMPALA Issue Type: Bug Components: Frontend Affects Versions: Impala 2.9.0 Reporter: Henry Robinson Assignee: Alexander Behm Priority: Critical Number of partitions scanned is 13, but expected to be 12. This was in a build with legacy aggs and joins enabled, but it's not obvious whether those are related. {code} FAILED: org.apache.impala.planner.PlannerTest.testTableSample Error Message: Section PLAN of query: select * from functional.alltypes tablesample system(50) repeatable(1234) Actual does not match expected result: F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B | 00:SCAN HDFS [functional.alltypes] partitions=13/24 files=13 size=258.44KB ^^ table stats: 7300 rows total column stats: all mem-estimate=96.00MB mem-reservation=0B tuple-ids=0 row-size=97B cardinality=3650 Expected: F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B | 00:SCAN HDFS [functional.alltypes] partitions=12/24 files=12 size=240.27KB table stats: 7300 rows total column stats: all mem-estimate=80.00MB mem-reservation=0B tuple-ids=0 row-size=97B cardinality=3650 Verbose plan: F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B | 00:SCAN HDFS [functional.alltypes] partitions=13/24 files=13 size=258.44KB table stats: 7300 rows total column stats: all mem-estimate=96.00MB mem-reservation=0B tuple-ids=0 row-size=97B cardinality=3650 Section PLAN of query: select * from functional.alltypes tablesample system(50) repeatable(1234) where id < 10 Actual does not match expected result: F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B | 00:SCAN HDFS [functional.alltypes] partitions=13/24 files=13 size=258.44KB ^^ predicates: 
id < 10 table stats: 7300 rows total column stats: all parquet dictionary predicates: id < 10 mem-estimate=96.00MB mem-reservation=0B tuple-ids=0 row-size=97B cardinality=365 Expected: F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B | 00:SCAN HDFS [functional.alltypes] partitions=12/24 files=12 size=239.26KB predicates: id < 10 table stats: 7300 rows total column stats: all parquet dictionary predicates: id < 10 mem-estimate=80.00MB mem-reservation=0B tuple-ids=0 row-size=97B cardinality=365 Verbose plan: F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B | 00:SCAN HDFS [functional.alltypes] partitions=13/24 files=13 size=258.44KB predicates: id < 10 table stats: 7300 rows total column stats: all parquet dictionary predicates: id < 10 mem-estimate=96.00MB mem-reservation=0B tuple-ids=0 row-size=97B cardinality=365 Stack Trace: java.lang.AssertionError: Section PLAN of query: select * from functional.alltypes tablesample system(50) repeatable(1234) Actual does not match expected result: F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B | 00:SCAN HDFS [functional.alltypes] partitions=13/24 files=13 size=258.44KB ^^ table stats: 7300 rows total column stats: all mem-estimate=96.00MB mem-reservation=0B tuple-ids=0 row-size=97B cardinality=3650 Expected: F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B | 00:SCAN HDFS [functional.alltypes] partitions=12/24 files=12 size=240.27KB table stats: 7300 rows total column stats: all mem-estimate=80.00MB mem-reservation=0B tuple-ids=0 row-size=97B cardinality=3650 Verbose plan: F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B | 00:SCAN HDFS [functional.alltypes] partitions=13/24 files=13 size=258.44KB table stats: 7300 rows total column stats: all 
mem-estimate=96.00MB mem-reservation=0B tuple-ids=0 row-size=97B cardinality=3650 Section PLAN of query: select * from functional.alltypes tablesample system(50) repeatable(1234) where id < 10 Actual does not match expected result: F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B | 00:SCAN HDFS [functional.alltypes] partitions=13/24 files=13 size=258.44KB ^^ predicates: id < 10
[jira] [Created] (IMPALA-5350) Build threads should include fragment ID in their names
Henry Robinson created IMPALA-5350:

Summary: Build threads should include fragment ID in their names
Key: IMPALA-5350
URL: https://issues.apache.org/jira/browse/IMPALA-5350
Project: IMPALA
Issue Type: Improvement
Reporter: Henry Robinson
Assignee: Henry Robinson
Priority: Minor

Just like fragment executor threads do, the build threads should include the fragment instance ID in their name so that it's easy to map entries in {{/threadz}} back onto the query and fragment they belong to without a debugger.

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
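The naming idea in IMPALA-5350 can be sketched in a few lines. This is an illustration in Python rather than Impala's C++ thread code, and the name format, `build_thread_name`, `start_build_thread`, and `threads_for_instance` are all hypothetical; the only point is that a name carrying the fragment instance ID lets a thread listing be filtered by that ID.

```python
import threading


def build_thread_name(fragment_instance_id):
    """Return a thread name that embeds the fragment instance ID.

    The format is made up for this sketch; it is not Impala's.
    """
    return "build thread (finst:%s)" % fragment_instance_id


def start_build_thread(fragment_instance_id, target):
    """Start a thread whose name identifies the fragment instance it serves."""
    thread = threading.Thread(
        target=target, name=build_thread_name(fragment_instance_id))
    thread.start()
    return thread


def threads_for_instance(fragment_instance_id):
    """List the names of live threads that mention the given instance ID."""
    return [t.name for t in threading.enumerate()
            if fragment_instance_id in t.name]
```

With names like this, a listing of live threads (the role {{/threadz}} plays in the issue) can be mapped back to a fragment instance by a plain substring search.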
[jira] [Created] (IMPALA-5349) BufferedBlockMgrTest.NoDirsAllocationError failed to write earlier than expected
Henry Robinson created IMPALA-5349:

Summary: BufferedBlockMgrTest.NoDirsAllocationError failed to write earlier than expected
Key: IMPALA-5349
URL: https://issues.apache.org/jira/browse/IMPALA-5349
Project: IMPALA
Issue Type: Bug
Affects Versions: Impala 2.9.0
Reporter: Henry Robinson
Assignee: Tim Armstrong

This is an ASAN build, which may affect timing:

{code}
02:48:15 [ RUN ] BufferedBlockMgrTest.NoDirsAllocationError
02:48:15 /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/runtime/buffered-block-mgr-test.cc:1239: Failure
02:48:15 Value of: status_.ok()
02:48:15 Actual: false
02:48:15 Expected: true
02:48:15 Error: Could not create files in any configured scratch directories (--scratch_dirs). See logs for previous errors that may have prevented creating or writing scratch files.
02:48:15 Opening '/tmp/buffered-block-mgr-test.0/impala-scratch/0:0_c0071fba-577c-4c59-95be-a42db2c34721' for write failed with errno=13 description=Error(13): Permission denied
02:48:15 Opening '/tmp/buffered-block-mgr-test.1/impala-scratch/0:0_0d6b4e50-fb7f-44e8-8b59-76a6fd8d56dd' for write failed with errno=13 description=Error(13): Permission denied
02:48:15
02:48:15 *** Check failure stack trace: ***
02:48:15 @ 0x2c3cc26 google::DumpStackTraceAndExit()
02:48:15 @ 0x2c3361d google::LogMessage::Fail()
02:48:15 @ 0x2c34ec2 google::LogMessage::SendToLog()
02:48:15 @ 0x2c32ff7 google::LogMessage::Flush()
02:48:15 @ 0x2c365be google::LogMessageFatal::~LogMessageFatal()
02:48:15 @ 0x11a4999 impala::BufferedBlockMgr::~BufferedBlockMgr()
02:48:15 @ 0x11ae5ca std::_Sp_counted_ptr<>::_M_dispose()
02:48:15 @ 0x10bc8b5 std::_Sp_counted_base<>::_M_release()
02:48:15 @ 0x10b0414 std::__shared_ptr<>::reset()
02:48:15 @ 0x127fcad impala::RuntimeState::ReleaseResources()
02:48:15 @ 0x1238348 impala::TestEnv::TearDownQueries()
02:48:15 @ 0x10b4f11 impala::BufferedBlockMgrTest::TearDown()
02:48:15 @ 0x2caf923 testing::internal::HandleExceptionsInMethodIfSupported<>()
02:48:15 @ 0x2ca7249 testing::Test::Run()
02:48:15 @ 0x2ca73c8 testing::TestInfo::Run()
02:48:15 @ 0x2ca74a5 testing::TestCase::Run()
02:48:15 @ 0x2ca8728 testing::internal::UnitTestImpl::RunAllTests()
02:48:15 @ 0x2ca8a03 testing::UnitTest::Run()
02:48:15 @ 0x10a8827 main
02:48:15 @ 0x35d0e1ecdd (unknown)
02:48:15 @ 0xfb8d45 (unknown)
{code}
[jira] [Created] (IMPALA-5345) Under stress, some TransmitData() RPCs are not responded to
Henry Robinson created IMPALA-5345:

Summary: Under stress, some TransmitData() RPCs are not responded to
Key: IMPALA-5345
URL: https://issues.apache.org/jira/browse/IMPALA-5345
Project: IMPALA
Issue Type: Bug
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson
Assignee: Henry Robinson
Priority: Critical

Under stress conditions on two separate clusters (one secure, one not), I've seen some {{TransmitData()}} RPCs never receive a response, which blocks the query's completion. The RPCs are seen by the recipient, but are not in the pending sender list.

Need to test further to see whether this is related to the fix for IMPALA-5093, or whether a response is dropped on some path when a row batch is 'retried' from the pending sender list.
[jira] [Resolved] (IMPALA-5138) Running 32 concurrent queries from TPC-DS Q31 caused a crash in "impala::BufferedTupleStream::CopyStrings (this=0x7f182c9b4440, tuple=0x7f15aa008000, string_slots=...)
[ https://issues.apache.org/jira/browse/IMPALA-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5138. Resolution: Duplicate Fix Version/s: Impala 2.10.0 Seems to be the same root cause as IMPALA-5093. > Running 32 concurrent queries from TPC-DS Q31 caused a crash in > "impala::BufferedTupleStream::CopyStrings (this=0x7f182c9b4440, > tuple=0x7f15aa008000, string_slots=...) buffered-tuple-stream.cc:840" > --- > > Key: IMPALA-5138 > URL: https://issues.apache.org/jira/browse/IMPALA-5138 > Project: IMPALA > Issue Type: Sub-task >Reporter: Mostafa Mokhtar >Assignee: Henry Robinson > Fix For: Impala 2.10.0 > > > 32 concurrent queries from TPC-DS Q31 against a 138 node cluster caused a > crash on the coordinator node > {code} > (gdb) bt > #0 0x0037dce32625 in raise () from /lib64/libc.so.6 > #1 0x0037dce33e05 in abort () from /lib64/libc.so.6 > #2 0x7f1b61179a55 in os::abort(bool) () from > /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so > #3 0x7f1b612f9f87 in VMError::report_and_die() () from > /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so > #4 0x7f1b6117e96f in JVM_handle_linux_signal () from > /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so > #5 > #6 0x0037dce89a97 in memcpy () from /lib64/libc.so.6 > #7 0x00f4f68c in impala::BufferedTupleStream::CopyStrings > (this=0x7f182c9b4440, tuple=0x7f15aa008000, string_slots=...) 
> at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/buffered-tuple-stream.cc:840 > #8 0x00f4fd75 in DeepCopyInternal (this=0x7f182c9b4440, > row=0x7f17d7ec4b08) > at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/buffered-tuple-stream.cc:815 > #9 impala::BufferedTupleStream::DeepCopy (this=0x7f182c9b4440, > row=0x7f17d7ec4b08) > at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/buffered-tuple-stream.cc:752 > #10 0x00e751af in AddRow (this=0x7f171fcdc1c0, stream=0x7f182c9b4440, > row=0x7f17d7ec4b08, status=0x7f1521a9e780) > at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/buffered-tuple-stream.inline.h:30 > #11 impala::PhjBuilder::AppendRowStreamFull (this=0x7f171fcdc1c0, > stream=0x7f182c9b4440, row=0x7f17d7ec4b08, status=0x7f1521a9e780) > at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/exec/partitioned-hash-join-builder.cc:281 > #12 0x7f18e5422316 in impala::PhjBuilder::ProcessBuildBatch () > #13 0x00e76351 in impala::PhjBuilder::Send (this=0x7f171fcdc1c0, > state=Unhandled dwarf expression opcode 0xf3 > ) > at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/exec/partitioned-hash-join-builder.cc:174 > #14 0x00e5f673 in > impala::BlockingJoinNode::SendBuildInputToSink (this=0x7f1580c44700, > state=0x7f159c82f100, build_sink= > 0x7f171fcdc1c0) at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/exec/blocking-join-node.cc:287 > #15 0x00e5def7 in impala::BlockingJoinNode::ProcessBuildInputAsync > (this=0x7f1580c44700, state=0x7f159c82f100, build_sink=0x7f171fcdc1c0, > status=0x7f1524ba0b50) at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/exec/blocking-join-node.cc:154 > ---Type to continue, or q to quit--- > #16 0x00d59509 in operator() (name=Unhandled dwarf expression opcode > 0xf3 > ) > at > 
/data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/boost-1.57.0-p1/include/boost/function/function_template.hpp:767 > #17 impala::Thread::SuperviseThread (name=Unhandled dwarf expression opcode > 0xf3 > ) at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/util/thread.cc:325 > #18 0x00d59f54 in operator()&, > const std::basic_string&, boost::function, impala::Promise int>*), boost::_bi::list0> (this=0x7f15aa6f3a00) > at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/boost-1.57.0-p1/include/boost/bind/bind.hpp:457 > #19 operator() (this=0x7f15aa6f3a00) > at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/boost-1.57.0-p1/include/boost/bind/bind_template.hpp:20 > #20 boost::detail::thread_data std::basic_string, std::allocator >&, > const std::basic_string, std::allocator > >&, boost::function, impala::Promise*), > boost::_bi::list4 std::char_traits, std::allocator > >, > boost::_bi::value, > std::allocator > >, boost
[jira] [Resolved] (IMPALA-5136) Running 48 concurrent Q17 queries against TPC-DS 1TB queries fail with Cannot process row that is bigger than the IO size (row_size=1.55 GB, null_indicators_size=0)
[ https://issues.apache.org/jira/browse/IMPALA-5136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5136. Resolution: Duplicate Fix Version/s: Impala 2.10.0 Seems to be same root cause as IMPALA-5093. > Running 48 concurrent Q17 queries against TPC-DS 1TB queries fail with Cannot > process row that is bigger than the IO size (row_size=1.55 GB, > null_indicators_size=0) > > > Key: IMPALA-5136 > URL: https://issues.apache.org/jira/browse/IMPALA-5136 > Project: IMPALA > Issue Type: Sub-task > Components: Distributed Exec >Affects Versions: Impala 2.9.0 >Reporter: Mostafa Mokhtar >Assignee: Henry Robinson > Fix For: Impala 2.10.0 > > > Running 48 concurrent queries from TPC-DS Q17 against a 16 node cluster > queries failed with > Cannot process row that is bigger than the IO size (row_size=1.02 GB, > null_indicators_size=0). To run this query, increase the IO size (--read_size > option). > Cannot process row that is bigger than the IO size (row_size=1.55 GB, > null_indicators_size=0). To run this query, increase the IO size (--read_size > option). > Other iterations of the query failed with > {code} > Remote error: Service unavailable: ReportExecStatus request on > impala.ExecControlService from 10.17.193.20:55530 dropped due to > backpressure. The service queue is full; it has 1024 items. 
> Timed out: ReportExecStatus RPC to 10.17.193.10:22000 timed out after 10.000s > (SENT) > {code} > {code} > select i_item_id ,i_item_desc ,s_state ,count(ss_quantity) as > store_sales_quantitycount ,avg(ss_quantity) as store_sales_quantityave > ,stddev_samp(ss_quantity) as store_sales_quantitystdev > ,stddev_samp(ss_quantity)/avg(ss_quantity) as store_sales_quantitycov > ,count(sr_return_quantity) as store_returns_quantitycount > ,avg(sr_return_quantity) as store_returns_quantityave > ,stddev_samp(sr_return_quantity) as store_returns_quantitystdev > ,stddev_samp(sr_return_quantity)/avg(sr_return_quantity) as > store_returns_quantitycov ,count(cs_quantity) as catalog_sales_quantitycount > ,avg(cs_quantity) as catalog_sales_quantityave ,stddev_samp(cs_quantity) as > catalog_sales_quantitystdev ,stddev_samp(cs_quantity)/avg(cs_quantity) as > catalog_sales_quantitycov from store_sales ,store_returns ,catalog_sales > ,date_dim d1 ,date_dim d2 ,date_dim d3 ,store ,item where d1.d_quarter_name = > '2000Q1' and d1.d_date_sk = ss_sold_date_sk and i_item_sk = ss_item_sk and > s_store_sk = ss_store_sk and ss_customer_sk = sr_customer_sk and ss_item_sk = > sr_item_sk and ss_ticket_number = sr_ticket_number and sr_returned_date_sk = > d2.d_date_sk and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3') and > sr_customer_sk = cs_bill_customer_sk and sr_item_sk = cs_item_sk and > cs_sold_date_sk = d3.d_date_sk and d3.d_quarter_name in > ('2000Q1','2000Q2','2000Q3') group by i_item_id ,i_item_desc ,s_state order > by i_item_id ,i_item_desc ,s_state limit 100 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (IMPALA-5093) Rare failure to decode LZ4 batch
[ https://issues.apache.org/jira/browse/IMPALA-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5093. Resolution: Fixed Fix Version/s: Impala 2.10.0 We tracked this down to a lifecycle problem: outgoing sidecars were destroyed by {{Close()}} before the RPC layer had a chance to finish sending them. The fix (for now, while we work on the larger issue of buffer lifetimes in KUDU-2011) is to share ownership of the buffer through the {{RpcSidecar}}. With this fix, we were able to run a stress test on 7 nodes for over 24 hours with no crashes, where before the test would fail within a few minutes. > Rare failure to decode LZ4 batch > > > Key: IMPALA-5093 > URL: https://issues.apache.org/jira/browse/IMPALA-5093 > Project: IMPALA > Issue Type: Sub-task > Components: Distributed Exec >Affects Versions: Impala 2.9.0 >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Critical > Fix For: Impala 2.10.0 > > > KRPC sometimes hits this {{DCHECK}} > https://github.com/henryr/Impala/blob/krpc/be/src/runtime/row-batch.cc#L108 > which indicates that {{Lz4Compress::ProcessBlock}} has failed to decompress > the incoming row batch. Not much clarity about how this happens yet. > Stack trace: > {code} > 6 0x02c7598e in google::LogMessageFatal::~LogMessageFatal() () > #7 0x017914ba in impala::RowBatch::RowBatch (this=0x3d8af3c0, > row_desc=..., input_batch=..., mem_tracker=0x13ad1c80) at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/row-batch.cc:108 > #8 0x0174c655 in impala::DataStreamRecvr::SenderQueue::AddBatch > (this=0xc962800, payload=...) at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/data-stream-recvr.cc:210 > #9 0x0174e13a in impala::DataStreamRecvr::AddBatch (this=0xcdda580, > payload=...) 
at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/data-stream-recvr.cc:352 > #10 0x0173f076 in impala::DataStreamMgr::AddData (this=0xe4a0b20, > fragment_instance_id=..., payload=...) at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/data-stream-mgr.cc:190 > #11 0x018e8c63 in impala::DataStreamService::TransmitData > (this=0xdb357c0, request=0x4338c3f0, response=0xd802c00, context=0x11d27b60) > at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/service/impala-internal-service.cc:77 > #12 0x018ed74e in > _ZZN6impala19DataStreamServiceIfC4ERK13scoped_refptrIN4kudu12MetricEntityEERKS1_INS2_3rpc13ResultTrackerEEENKUlPKN6google8protobuf7MessageEPSE_PNS7_10RpcContextEE0_clESG_SH_SJ_ > () > at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/service/impala_internal_service.service.cc:157 > #13 0x018eff3b in std::_Function_handler google::protobuf::Message*, google::protobuf::Message*, > kudu::rpc::RpcContext*), > impala::DataStreamServiceIf::DataStreamServiceIf(const > scoped_refptr&, const > scoped_refptr&):: google::protobuf::Message*, google::protobuf::Message*, > kudu::rpc::RpcContext*)> >::_M_invoke(const std::_Any_data &, const > google::protobuf::Message *, google::protobuf::Message *, > kudu::rpc::RpcContext *) (__functor=..., __args#0=0x4338c3f0, > __args#1=0xd802c00, __args#2=0x11d27b60) at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/gcc-4.9.2/include/c++/4.9.2/functional:2039 > #14 0x01d9fcb4 in std::function google::protobuf::Message*, google::protobuf::Message*, > kudu::rpc::RpcContext*)>::operator()(const google::protobuf::Message *, > google::protobuf::Message *, kudu::rpc::RpcContext *) const (this=0xeb320b8, > __args#0=0x4338c3f0, __args#1=0xd802c00, __args#2=0x11d27b60) at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/gcc-4.9.2/include/c++/4.9.2/functional:2439 
> #15 0x01d9f6b7 in kudu::rpc::GeneratedServiceIf::Handle > (this=0xdb357c0, call=0xcf37480) at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/kudu/rpc/service_if.cc:134 > #16 0x016abfb8 in impala::ImpalaServicePool::RunThread > (this=0xe85ac80) at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/rpc/impala-service-pool.cc:130 > #17 0x016ab5db in > impala::ImpalaServicePooloperator()(void) const > (__closure=0x7f5e11a86be8) at > /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/rpc/impala-service-pool.cc:68 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
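The lifecycle problem named in the resolution of IMPALA-5093 (a buffer destroyed by {{Close()}} while a send is still pending) and the shared-ownership fix can be pictured with a small reference-counting model. This is only an illustrative Python sketch, not Impala's or Kudu's C++ code: `SharedBuffer`, `Sender`, `send_async`, and `flush` are hypothetical names, and setting the data to `None` merely stands in for freeing memory.

```python
class SharedBuffer:
    """Reference-counted buffer; the data is dropped when the last owner releases it."""

    def __init__(self, data):
        self._data = bytearray(data)
        self._owners = 1  # the creator holds the first reference

    def share(self):
        """Register one more owner and return the same buffer."""
        self._owners += 1
        return self

    def release(self):
        """Give up one reference; free the data once nobody owns it."""
        self._owners -= 1
        if self._owners == 0:
            self._data = None  # stands in for freeing the memory

    def read(self):
        if self._data is None:
            raise RuntimeError("buffer used after free")
        return bytes(self._data)


class Sender:
    """Hypothetical sender whose outgoing payloads outlive close()."""

    def __init__(self, payload):
        self.buffer = SharedBuffer(payload)
        self.in_flight = []

    def send_async(self):
        # The pending send takes its own reference instead of borrowing ours.
        self.in_flight.append(self.buffer.share())

    def close(self):
        # Releases only the sender's reference; pending sends keep the data alive.
        self.buffer.release()

    def flush(self):
        """Simulate the transport finishing the pending sends."""
        sent = [buf.read() for buf in self.in_flight]
        for buf in self.in_flight:
            buf.release()
        self.in_flight = []
        return sent
```

If `send_async` stored a borrowed reference without calling `share()`, then `close()` would free the data and `flush()` would raise, which is the shape of the bug the resolution describes.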
[jira] [Resolved] (IMPALA-5174) Suppress kudu flags that aren't relevant to Impala
[ https://issues.apache.org/jira/browse/IMPALA-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henry Robinson resolved IMPALA-5174.
Resolution: Fixed
Fix Version/s: Impala 2.9.0

Fixed in https://github.com/apache/incubator-impala/commit/d1910a39fcc50ce211b95c3552c0c90b4bc37bbd which brings in this gflags commit: https://github.com/henryr/gflags/commit/9ae8eae9a1b6162026854a5266d4ee1427c6d168

> Suppress kudu flags that aren't relevant to Impala
>
> Key: IMPALA-5174
> URL: https://issues.apache.org/jira/browse/IMPALA-5174
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Affects Versions: Impala 2.9.0
> Reporter: Henry Robinson
> Assignee: Henry Robinson
> Fix For: Impala 2.9.0
>
> Kudu's util libraries declare quite a few flags, some of which are irrelevant
> to Impala (as they exist in code that isn't actually used). If possible, we
> should figure out a way to suppress them from {{--help}} and {{/varz}} to
> avoid user confusion.
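As a loose analogy for hiding flags from help output while still accepting them, here is how Python's standard `argparse` does it with `argparse.SUPPRESS`. This is not the gflags mechanism used in the commit above, whose details are not given in this message, and the flag names below are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser with one documented flag and one hidden flag."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument(
        "--visible_flag", default="a",
        help="A flag users are expected to set.")
    # Hypothetical flag declared by an unused library: it is still parsed,
    # but help=argparse.SUPPRESS keeps it out of the --help text.
    parser.add_argument(
        "--unused_library_flag", default="x", help=argparse.SUPPRESS)
    return parser


help_text = build_parser().format_help()
```

The hidden flag remains settable on the command line, so code that reads it keeps working; it is only the user-facing listing that changes.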
[jira] [Resolved] (IMPALA-5228) test_coordinators custom cluster test fails after rebase
[ https://issues.apache.org/jira/browse/IMPALA-5228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson resolved IMPALA-5228. Resolution: Fixed This is fixed in the latest KRPC test runs. > test_coordinators custom cluster test fails after rebase > > > Key: IMPALA-5228 > URL: https://issues.apache.org/jira/browse/IMPALA-5228 > Project: IMPALA > Issue Type: Sub-task > Components: Distributed Exec >Affects Versions: Impala 2.9.0 >Reporter: Henry Robinson >Assignee: Henry Robinson > Fix For: Impala 2.9.0 > > > Need to fix {{test_coordinators}} after rebase of KRPC on top of IMPALA-4041. > {code} > 23:13:29 === FAILURES > === > 23:13:29 _ TestCoordinators.test_multiple_coordinators > __ > 23:13:29 > 23:13:29 self = > 23:13:29 > 23:13:29 @pytest.mark.execute_serially > 23:13:29 def test_multiple_coordinators(self): > 23:13:29 """Test a cluster configuration in which not all impalad nodes > are coordinators. > 23:13:29 Verify that only coordinators can accept client > connections and that select and DDL > 23:13:29 queries run successfully.""" > 23:13:29 > 23:13:29 db_name = "TEST_MUL_COORD_DB" > 23:13:29 > self._start_impala_cluster([], num_coordinators=2, > cluster_size=3) > 23:13:29 > 23:13:29 custom_cluster/test_coordinators.py:32: > 23:13:29 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ _ _ _ _ _ > 23:13:29 common/custom_cluster_test_suite.py:119: in _start_impala_cluster > 23:13:29 check_call(cmd + options, close_fds=True) > 23:13:29 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ _ _ _ _ _ > 23:13:29 > 23:13:29 popenargs = > (['/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py', > > '--cluster_size=3..._dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests', > '--log_level=1'],) > 23:13:29 kwargs = {'close_fds': True}, retcode = 1 > 23:13:29 cmd = > 
['/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py', > > '--cluster_size=3'...og_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests', > '--log_level=1'] > 23:13:29 > 23:13:29 def check_call(*popenargs, **kwargs): > 23:13:29 """Run command with arguments. Wait for command to > complete. If > 23:13:29 the exit code was zero then return, otherwise raise > 23:13:29 CalledProcessError. The CalledProcessError object will have > the > 23:13:29 return code in the returncode attribute. > 23:13:29 > 23:13:29 The arguments are the same as for the Popen constructor. > Example: > 23:13:29 > 23:13:29 check_call(["ls", "-l"]) > 23:13:29 """ > 23:13:29 retcode = call(*popenargs, **kwargs) > 23:13:29 cmd = kwargs.get("args") > 23:13:29 if cmd is None: > 23:13:29 cmd = popenargs[0] > 23:13:29 if retcode: > 23:13:29 > raise CalledProcessError(retcode, cmd) > 23:13:29 E CalledProcessError: Command > '['/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py', > '--cluster_size=3', '--num_coordinators=2', > '--log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests', > '--log_level=1']' returned non-zero exit status 1 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
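The traceback above only records that `check_call` saw a non-zero exit status from the launched script; the underlying cause is not in the exception itself. A minimal sketch of that behaviour follows, using a stand-in child process that exits with status 1 instead of `start-impala-cluster.py`. The helper name `run_and_report` is hypothetical.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd and return (exit_status, cmd); exit_status is 0 on success."""
    try:
        subprocess.check_call(cmd, close_fds=True)
    except subprocess.CalledProcessError as e:
        # The exception carries the exit status and the command that was run,
        # but nothing about why the child process failed.
        return e.returncode, e.cmd
    return 0, cmd


# Stand-in for a failing cluster start: a child that exits with status 1.
failing_cmd = [sys.executable, "-c", "import sys; sys.exit(1)"]
status, reported_cmd = run_and_report(failing_cmd)
```

Because only the exit status survives, diagnosing a failure like this one generally means looking at the logs written by the child process (here, the directory passed as `--log_dir`).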
[jira] [Created] (IMPALA-5228) test_coordinators custom cluster test fails after rebase
Henry Robinson created IMPALA-5228: -- Summary: test_coordinators custom cluster test fails after rebase Key: IMPALA-5228 URL: https://issues.apache.org/jira/browse/IMPALA-5228 Project: IMPALA Issue Type: Sub-task Components: Distributed Exec Affects Versions: Impala 2.9.0 Reporter: Henry Robinson Assignee: Henry Robinson Fix For: Impala 2.9.0 Need to fix {{test_coordinators}} after rebase of KRPC on top of IMPALA-4041. {code} 23:13:29 === FAILURES === 23:13:29 _ TestCoordinators.test_multiple_coordinators __ 23:13:29 23:13:29 self = 23:13:29 23:13:29 @pytest.mark.execute_serially 23:13:29 def test_multiple_coordinators(self): 23:13:29 """Test a cluster configuration in which not all impalad nodes are coordinators. 23:13:29 Verify that only coordinators can accept client connections and that select and DDL 23:13:29 queries run successfully.""" 23:13:29 23:13:29 db_name = "TEST_MUL_COORD_DB" 23:13:29 > self._start_impala_cluster([], num_coordinators=2, cluster_size=3) 23:13:29 23:13:29 custom_cluster/test_coordinators.py:32: 23:13:29 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 23:13:29 common/custom_cluster_test_suite.py:119: in _start_impala_cluster 23:13:29 check_call(cmd + options, close_fds=True) 23:13:29 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 23:13:29 23:13:29 popenargs = (['/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py', '--cluster_size=3..._dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests', '--log_level=1'],) 23:13:29 kwargs = {'close_fds': True}, retcode = 1 23:13:29 cmd = ['/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py', '--cluster_size=3'...og_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests', '--log_level=1'] 23:13:29 23:13:29 def check_call(*popenargs, **kwargs): 23:13:29 """Run 
command with arguments. Wait for command to complete. If 23:13:29 the exit code was zero then return, otherwise raise 23:13:29 CalledProcessError. The CalledProcessError object will have the 23:13:29 return code in the returncode attribute. 23:13:29 23:13:29 The arguments are the same as for the Popen constructor. Example: 23:13:29 23:13:29 check_call(["ls", "-l"]) 23:13:29 """ 23:13:29 retcode = call(*popenargs, **kwargs) 23:13:29 cmd = kwargs.get("args") 23:13:29 if cmd is None: 23:13:29 cmd = popenargs[0] 23:13:29 if retcode: 23:13:29 > raise CalledProcessError(retcode, cmd) 23:13:29 E CalledProcessError: Command '['/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py', '--cluster_size=3', '--num_coordinators=2', '--log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests', '--log_level=1']' returned non-zero exit status 1 {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (IMPALA-3955) Remove Scheduler class and rename SimpleScheduler to Scheduler
[ https://issues.apache.org/jira/browse/IMPALA-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henry Robinson resolved IMPALA-3955.
Resolution: Fixed
Fix Version/s: Impala 2.9.0

Fixed by https://github.com/apache/incubator-impala/commit/4743342da1147b09b6bc6cf0322f99026c300952

> Remove Scheduler class and rename SimpleScheduler to Scheduler
>
> Key: IMPALA-3955
> URL: https://issues.apache.org/jira/browse/IMPALA-3955
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 2.6.0
> Reporter: Henry Robinson
> Assignee: Henry Robinson
> Priority: Minor
> Labels: newbie
> Fix For: Impala 2.9.0
>
> Just for code cleanliness, it would be good to get rid of the {{Scheduler}}
> interface class. There's only one implementation of the interface
> ({{SimpleScheduler}}), and there only ever has been one. If we ever feel it's
> necessary to have more than one scheduler implementation, we can introduce an
> appropriate abstraction at that point.
[jira] [Created] (IMPALA-5174) Suppress kudu flags that aren't relevant to Impala
Henry Robinson created IMPALA-5174:

Summary: Suppress kudu flags that aren't relevant to Impala
Key: IMPALA-5174
URL: https://issues.apache.org/jira/browse/IMPALA-5174
Project: IMPALA
Issue Type: Sub-task
Components: Backend
Affects Versions: Impala 2.9.0
Reporter: Henry Robinson
Assignee: Henry Robinson

Kudu's util libraries declare quite a few flags, some of which are irrelevant to Impala (as they exist in code that isn't actually used). If possible, we should figure out a way to suppress them from {{--help}} and {{/varz}} to avoid user confusion.
[jira] [Resolved] (IMPALA-4758) Upgrade gutil to recent Kudu version
[ https://issues.apache.org/jira/browse/IMPALA-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henry Robinson resolved IMPALA-4758.
Resolution: Fixed
Fix Version/s: Impala 2.9.0

Fixed by two commits:
https://github.com/apache/incubator-impala/commit/02f3e3fcc1c58bcaf5080ddee939c9081412a553
https://github.com/apache/incubator-impala/commit/23100102c0a9a8f3a8a7ff069cbfaa7a56628238

> Upgrade gutil to recent Kudu version
>
> Key: IMPALA-4758
> URL: https://issues.apache.org/jira/browse/IMPALA-4758
> Project: IMPALA
> Issue Type: Sub-task
> Components: Distributed Exec
> Affects Versions: Impala 2.9.0
> Reporter: Henry Robinson
> Assignee: Henry Robinson
> Fix For: Impala 2.9.0
>
> The gutil library that we share with Kudu has changed a bit, and needs to be
> updated before we import the KRPC / util libraries.