[jira] [Resolved] (IMPALA-5816) ssl-related custom cluster tests failing during setup on exhaustive RHEL7

2017-09-01 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5816.

   Resolution: Fixed
Fix Version/s: Impala 2.11.0

https://github.com/apache/incubator-impala/commit/c163ac1468e4d878c3516ec933c69fb66851af01

> ssl-related custom cluster tests failing during setup on exhaustive RHEL7
> -
>
> Key: IMPALA-5816
> URL: https://issues.apache.org/jira/browse/IMPALA-5816
> Project: IMPALA
>  Issue Type: Bug
>  Components: Security
>Affects Versions: Impala 2.10.0
>Reporter: David Knupp
>Assignee: Henry Robinson
>Priority: Critical
> Fix For: Impala 2.11.0
>
>
> Tests that were seen to fail include:
> * TestClientSsl.test_tls_v12
> * TestClientSsl.test_wildcard_ssl
> * TestClientSsl.test_wildcard_san_ssl
> Example stack trace: 
> {noformat}
> self = 
> method =  >
> def setup_method(self, method):
>   cluster_args = list()
>   for arg in [IMPALAD_ARGS, STATESTORED_ARGS, CATALOGD_ARGS]:
>     if arg in method.func_dict:
>       cluster_args.append("--%s=\"%s\" " % (arg, method.func_dict[arg]))
>   # Start a clean new cluster before each test
> > self._start_impala_cluster(cluster_args)
> common/custom_cluster_test_suite.py:103: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> common/custom_cluster_test_suite.py:129: in _start_impala_cluster
> check_call(cmd + options, close_fds=True)
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> popenargs = 
> (['/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/bin/start-impala-cluster.py',
>  
> '--cluster_si...cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcardCA.pem
>  --ssl_cipher_list=AES128-GCM-SHA256 " ', ...],)
> kwargs = {'close_fds': True}, retcode = 1
> cmd = 
> ['/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/bin/start-impala-cluster.py',
>  
> '--cluster_siz...a-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcardCA.pem
>  --ssl_cipher_list=AES128-GCM-SHA256 " ', ...]
> def check_call(*popenargs, **kwargs):
> """Run command with arguments.  Wait for command to complete.  If
> the exit code was zero then return, otherwise raise
> CalledProcessError.  The CalledProcessError object will have the
> return code in the returncode attribute.
> 
> The arguments are the same as for the Popen constructor.  Example:
> 
> check_call(["ls", "-l"])
> """
> retcode = call(*popenargs, **kwargs)
> if retcode:
> cmd = kwargs.get("args")
> if cmd is None:
> cmd = popenargs[0]
> >   raise CalledProcessError(retcode, cmd)
> E   CalledProcessError: Command 
> '['/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/bin/start-impala-cluster.py',
>  '--cluster_size=3', '--num_coordinators=3', 
> '--log_dir=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/logs/custom_cluster_tests',
>  '--log_level=1', 
> '--impalad_args="--ssl_server_certificate=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcard-cert.pem
>  
> --ssl_private_key=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcard-cert.key
>  --ssl_minimum_version=tlsv1.2 
> --ssl_client_ca_certificate=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcardCA.pem
>  --ssl_cipher_list=AES128-GCM-SHA256 " ', 
> '--state_store_args="--ssl_server_certificate=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcard-cert.pem
>  
> --ssl_private_key=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcard-cert.key
>  --ssl_minimum_version=tlsv1.2 
> --ssl_client_ca_certificate=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcardCA.pem
>  --ssl_cipher_list=AES128-GCM-SHA256 " ', 
> '--catalogd_args="--ssl_server_certificate=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcard-cert.pem
>  
> --ssl_private_key=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcard-cert.key
>  --ssl_minimum_version=tlsv1.2 
> --ssl_client_ca_certificate=/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/testutil/wildcardCA.pem
>  --ssl_cipher_list=AES128-GCM-SHA256 " ']' returned non-zero exit status 1
> {noformat}
> Standard error output:
> {noformat}
> MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> Mai

[jira] [Created] (IMPALA-5887) Hung union query

2017-09-01 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5887:
--

 Summary: Hung union query
 Key: IMPALA-5887
 URL: https://issues.apache.org/jira/browse/IMPALA-5887
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson
Priority: Critical


During an exhaustive test run on CentOS 7.0, I noticed the following query hung 
for 2.5 hours:

{code}
select count(c) from ( select bigint_col + 1 as c from functional.alltypes 
limit 15 union all select bigint_col as c from functional.alltypes limit 15 
union all select bigint_col + 1 as c from functional.alltypes limit 15 union 
all (select bigint_col as c from functional.alltypes limit 15)) t
{code}

There was one fragment instance still running which was waiting on some hdfs 
scanner threads to complete. Unfortunately taking the {{pstack}} caused the 
threads to unblock themselves, and the query completed.

{code}
Thread 1 (process 18704):
#0  0x7f47592a9705 in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib64/libpthread.so.0
#1  0x013648e5 in boost::condition_variable::wait (this=0x1b303c858, 
m=...) at 
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/thread/pthread/condition_variable.hpp:73
#2  0x01d89d8c in boost::thread::join_noexcept() ()
#3  0x015148ab in boost::thread::join (this=0x181d458f0) at 
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/thread/detail/thread.hpp:767
#4  0x01514f3e in impala::Thread::Join (this=0x1a983220) at 
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/util/thread.h:106
#5  0x017fa67e in impala::ThreadGroup::JoinAll (this=0x181d856c8) at 
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/util/thread.cc:338
#6  0x0189be87 in impala::HdfsScanNode::Close (this=0x181d85000, 
state=0xbf38800) at 
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-scan-node.cc:236
#7  0x01a515e1 in impala::UnionNode::GetNextMaterialized 
(this=0x171f5680, state=0xbf38800, row_batch=0x7f469d2561f0) at 
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/exec/union-node.cc:242
#8  0x01a5272b in impala::UnionNode::GetNext (this=0x171f5680, 
state=0xbf38800, row_batch=0x7f469d2561f0, eos=0x7f469d25639f) at 
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/exec/union-node.cc:297
#9  0x019e4f09 in impala::PartitionedAggregationNode::Open 
(this=0x25251900, state=0xbf38800) at 
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/exec/partitioned-aggregation-node.cc:302
#10 0x015f74ab in impala::FragmentInstanceState::Open (this=0x150b3480) 
at 
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/runtime/fragment-instance-state.cc:256
#11 0x015f4fd7 in impala::FragmentInstanceState::Exec (this=0x150b3480) 
at 
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/runtime/fragment-instance-state.cc:80
#12 0x015bb5d2 in impala::QueryState::ExecFInstance (this=0x16cad600, 
fis=0x150b3480) at 
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/runtime/query-state.cc:351
#13 0x015ba102 in impala::QueryStateoperator()(void) 
const (__closure=0x7f469d256c28) at 
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/runtime/query-state.cc:319
#14 0x015bc283 in 
boost::detail::function::void_function_obj_invoker0,
 void>::invoke(boost::detail::function::function_buffer &) 
(function_obj_ptr=...) at 
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153
#15 0x0152d542 in boost::function0::operator() 
(this=0x7f469d256c20) at 
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:767
#16 0x017fa4eb in impala::Thread::SuperviseThread(std::string const&, 
std::string const&, boost::function, impala::Promise*) 
(name=..., category=..., functor=..., thread_started=0x7f468003ec40) at 
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/repos/Impala/be/src/util/thread.cc:329
#17 0x01802e26 in boost::_bi::list4, 
boost::_bi::value, boost::_bi::value >, 
boost::_bi::value*> >::operator(), impala::Promise*), 
boost::_bi::list0>(boost::_bi::type, void (*&)(std::string const&, 
std::string const&, boost::function, impala::Promise*), 
boost::_bi::list0&, int) (this=0x13f7cdfc0, f=@0x13f7cdfb8: 0x17fa1cc 
, impala::Promise*)>, a=...) at 
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boo

[jira] [Resolved] (IMPALA-5846) Kudu libraries are written to be/src/.., not be/build/...

2017-08-26 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5846.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/f20b1626b8bdf2a87e089cb18f82cd80a7cc981c

> Kudu libraries are written to be/src/.., not be/build/...
> -
>
> Key: IMPALA-5846
> URL: https://issues.apache.org/jira/browse/IMPALA-5846
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.10.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: Impala 2.10.0
>
>
> Any library built using {{ADD_EXPORTABLE_LIBRARY}} puts the library or 
> archive in the source directory it's built from, not in 
> {{be/build//...}}. This isn't great for isolating building different 
> targets, nor is it consistent with the rest of the build.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (IMPALA-4669) Add Kudu's RPC, util and security libraries

2017-08-25 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-4669.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

Finally, here's the RPC library:

https://github.com/apache/incubator-impala/commit/c7db60aa46565c19634e8a791df3af8d116b9017
https://github.com/apache/incubator-impala/commit/113526198051f6810c84df513d507074856f5e4c

> Add Kudu's RPC, util and security libraries
> ---
>
> Key: IMPALA-4669
> URL: https://issues.apache.org/jira/browse/IMPALA-4669
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Affects Versions: Impala 2.8.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: Impala 2.10.0
>
>
> To enable KRPC in Impala, we need to link against Kudu's {{rpc}}, 
> {{security}} and {{util}} libraries. The easiest way for now is to pull them 
> into trunk. 
> Doing this also requires upgrading our {{gutil}} version.





[jira] [Created] (IMPALA-5849) Don't disable TLS configuration at compile-time even with OpenSSL 1.0.0

2017-08-25 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5849:
--

 Summary: Don't disable TLS configuration at compile-time even with 
OpenSSL 1.0.0
 Key: IMPALA-5849
 URL: https://issues.apache.org/jira/browse/IMPALA-5849
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson
Assignee: Henry Robinson


IMPALA-5800, IMPALA-5775 and IMPALA-5743 added TLS configuration to Impala and 
Squeasel. Since Impala is often built against different versions of OpenSSL 
(with different TLS capabilities), we used compile-time definitions to avoid 
using symbols from OpenSSL 1.0.1 that weren't available. 

This works great if we can ensure that the machine on which Impala is built is 
the same environment as the one on which it executes, but we have discovered 
that the installed version of OpenSSL can vary between minor releases of Linux 
distributions.

It appears possible to write the support for TLS1.1+ in terms of symbols that 
are available in OpenSSL 1.0.0 only. The only downside is that Impala can't 
then tell whether or not the runtime supports TLS 1.2, and so the error 
messages won't be quite as clear. However, the benefit of a single binary and 
Thrift toolchain dependency for all supported versions of OpenSSL is well worth 
it.
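
The trade-off described above (apply the TLS setting at run time and report whatever the loaded library says, instead of deciding at build time) can be sketched in Python. This is only an analogy using Python's standard `ssl` module, not Impala's C++ code; the helper name `configure_min_tls_at_runtime` is hypothetical.

```python
import ssl

def configure_min_tls_at_runtime(ctx, version):
    """Try to apply a minimum TLS version; report failure instead of assuming support."""
    try:
        ctx.minimum_version = version
    except (ValueError, ssl.SSLError) as exc:
        # The error text comes from whichever OpenSSL is loaded at run time,
        # so it may be less specific than a build-time check could be.
        return False, "%s (runtime library: %s)" % (exc, ssl.OPENSSL_VERSION)
    return True, ""

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ok, message = configure_min_tls_at_runtime(ctx, ssl.TLSVersion.TLSv1_2)
```

The same binary then behaves correctly against whichever library version is installed, and the cost is a less precise error message.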





[jira] [Created] (IMPALA-5846) Kudu libraries are written to be/src/.., not be/build/...

2017-08-24 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5846:
--

 Summary: Kudu libraries are written to be/src/.., not be/build/...
 Key: IMPALA-5846
 URL: https://issues.apache.org/jira/browse/IMPALA-5846
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson
Assignee: Henry Robinson


Any library built using {{ADD_EXPORTABLE_LIBRARY}} puts the library or archive 
in the source directory it's built from, not in {{be/build//...}}. This 
isn't great for isolating building different targets, nor is it consistent with 
the rest of the build.





[jira] [Resolved] (IMPALA-5825) TSSLSocket factory may throw uncaught exception

2017-08-22 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5825.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

Fixed in 
https://github.com/apache/incubator-impala/commit/f9b222e9229ef3830f00b0e47073d7a8880e2bfb

> TSSLSocket factory may throw uncaught exception
> ---
>
> Key: IMPALA-5825
> URL: https://issues.apache.org/jira/browse/IMPALA-5825
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: Impala 2.10.0
>
>
> If using TLS, Thrift's {{TSSLSocketFactory}} constructor might throw an 
> exception if there was an error initializing the SSL context.
> We don't currently catch that error, meaning that a misconfiguration leads to 
> an unexpected process death. 





[jira] [Created] (IMPALA-5825) TSSLSocket factory may throw uncaught exception

2017-08-21 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5825:
--

 Summary: TSSLSocket factory may throw uncaught exception
 Key: IMPALA-5825
 URL: https://issues.apache.org/jira/browse/IMPALA-5825
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 2.8.0, Impala 2.7.0, Impala 2.9.0, Impala 2.10.0
Reporter: Henry Robinson
Assignee: Henry Robinson


If using TLS, Thrift's {{TSSLSocketFactory}} constructor might throw an 
exception if there was an error initializing the SSL context.

We don't currently catch that error, meaning that a misconfiguration leads to 
an unexpected process death. 
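
As an illustration of the failure mode, here is a minimal Python sketch that converts an SSL-initialization exception into an error value instead of letting it propagate. It uses Python's standard `ssl` module as a stand-in for Thrift's `TSSLSocketFactory`. The function name and file paths are hypothetical, and the real fix lives in Impala's C++ code.

```python
import ssl

def make_ssl_context(cert_path, key_path):
    """Return (context, error); error is None on success."""
    try:
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
        ctx.load_cert_chain(certfile=cert_path, keyfile=key_path)
    except (ssl.SSLError, OSError) as exc:
        # A misconfiguration becomes a reportable error, not a crash.
        return None, "SSL context initialization failed: %s" % exc
    return ctx, None

# Hypothetical paths that do not exist, to exercise the error branch.
ctx, err = make_ssl_context("/nonexistent/cert.pem", "/nonexistent/key.pem")
```

The caller can then log `err` and shut down cleanly.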





[jira] [Created] (IMPALA-5811) Add per-query backends page

2017-08-17 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5811:
--

 Summary: Add per-query backends page
 Key: IMPALA-5811
 URL: https://issues.apache.org/jira/browse/IMPALA-5811
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson
Assignee: Henry Robinson


It's useful when diagnosing hangs, etc, to see a quick overview of which 
backends have fragment instances that are still running, and whether they're 
reporting to the coordinator in a timely manner. 





[jira] [Resolved] (IMPALA-4666) Remove thirdparty from search dir for toolchain deps

2017-08-17 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-4666.

Resolution: Fixed

Not sure exactly what I meant here, but I think it's been fixed with the recent 
shared linking improvements.

> Remove thirdparty from search dir for toolchain deps
> 
>
> Key: IMPALA-4666
> URL: https://issues.apache.org/jira/browse/IMPALA-4666
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.8.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>Priority: Minor
>
> Lots of the {{Find*.cmake}} modules look in {{thirdparty/}} still, and 
> shouldn't.





[jira] [Resolved] (IMPALA-5800) Configure Squeasel's TLS version / ciphers

2017-08-16 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5800.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/51ec60713980bd6e64e626f4476e843c49f5ea48

> Configure Squeasel's TLS version / ciphers
> --
>
> Key: IMPALA-5800
> URL: https://issues.apache.org/jira/browse/IMPALA-5800
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.10.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: Impala 2.10.0
>
>
> Squeasel will be getting TLS cipher suite and version configuration after 
> this [pull request|https://github.com/cloudera/squeasel/pull/6] is merged.
> We should import that change, then plumb through the relevant configuration 
> options to the Squeasel instance from Impala.





[jira] [Resolved] (IMPALA-5775) Impala shell only supports TLSv1

2017-08-16 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5775.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/e4a0e2f391bce3b8411ce7e5010856a54dc52991

> Impala shell only supports TLSv1
> 
>
> Key: IMPALA-5775
> URL: https://issues.apache.org/jira/browse/IMPALA-5775
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: Impala 2.10.0
>
>
> Per https://docs.python.org/2/library/ssl.html, Impala shell's SSL client is 
> configured to connect using TLSv1 only. As a result, once IMPALA-5743 is in 
> place, the shell can't connect to a server that requires TLSv1.2.
> We should change the client protocol to {{SSLv23}} (I think this is 
> acceptable for a client - the server won't negotiate an SSL connection), 
> which can connect to all flavours of TLS.
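
A minimal sketch of a client context that negotiates the protocol version instead of pinning one. The ticket cites the Python 2 docs, where the negotiating constant is `ssl.PROTOCOL_SSLv23`; this sketch assumes Python 3.6 or later, where the client-side equivalent is `ssl.PROTOCOL_TLS_CLIENT`. It is not the Impala shell's actual code, and `make_client_context` is a hypothetical name.

```python
import ssl

def make_client_context():
    """Build a client context that negotiates the TLS version with the server."""
    # PROTOCOL_TLS_CLIENT does not pin a single protocol version; the
    # handshake settles on the highest version both sides support.
    return ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)

ctx = make_client_context()
print(ctx.protocol)
```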





[jira] [Resolved] (IMPALA-5109) Increase plan fragment startup histogram max latency to > 20000ms

2017-08-15 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5109.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/6a606ed459c173b50af7b1bd922970ac57fd17fc

> Increase plan fragment startup histogram max latency to > 20000ms
> -
>
> Key: IMPALA-5109
> URL: https://issues.apache.org/jira/browse/IMPALA-5109
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 2.8.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: Impala 2.10.0
>
>
> We track plan fragment start latencies, but max out at 20s in the histogram. 
> We should probably set that to 30 minutes or so to capture really long RPC 
> delays.
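
To make the effect of the cap concrete, here is a small sketch that assumes the histogram clamps any sample above its maximum to that maximum. The constant and function names are hypothetical, and Impala's actual histogram is not shown in this thread.

```python
OLD_MAX_MS = 20 * 1000        # the 20s cap described in the ticket
NEW_MAX_MS = 30 * 60 * 1000   # "30 minutes or so"

def recorded_value(latency_ms, max_ms):
    """Value the histogram would store, assuming it clamps at max_ms."""
    return min(latency_ms, max_ms)

rpc_delay_ms = 5 * 60 * 1000  # a hypothetical five-minute startup delay
old = recorded_value(rpc_delay_ms, OLD_MAX_MS)
new = recorded_value(rpc_delay_ms, NEW_MAX_MS)
```

Under the old cap the five-minute delay is indistinguishable from a 20-second one; under the new cap it is recorded in full.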





[jira] [Resolved] (IMPALA-5743) Allow for configuration of TLS / SSL versions

2017-08-15 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5743.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

Fixed in 
https://github.com/apache/incubator-impala/commit/16ce201f5250451cb55e2cb9821b5d628d777160.

Note that this requires [this toolchain 
commit|https://github.com/cloudera/native-toolchain/commit/fc9954b4fab21d31d5c4b99b1f64545d2c70f65b]
 to add TLS configuration to Thrift 0.9.0.

> Allow for configuration of TLS / SSL versions
> -
>
> Key: IMPALA-5743
> URL: https://issues.apache.org/jira/browse/IMPALA-5743
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Security
>Affects Versions: Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: Impala 2.10.0
>
>
> It would be good for users to be able, via the command line, to specify 
> acceptable TLS protocols.
> Users will typically want to specify a minimum protocol version (i.e. TLS1.0, 
> 1.1 or 1.2), rather than a specific protocol version. Kudu has 
> {{--rpc_tls_minimum_version}}, and we can follow their lead. 
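
The difference between a minimum version and a specific version can be sketched as follows. The `tlsv1.2` spelling matches the `--ssl_minimum_version=tlsv1.2` value seen in test logs elsewhere in this archive; the other spellings and both helper names are assumptions made for illustration.

```python
import ssl

# Hypothetical flag spellings; only "tlsv1.2" appears in the logs.
_VERSIONS = {
    "tlsv1": ssl.TLSVersion.TLSv1,
    "tlsv1.1": ssl.TLSVersion.TLSv1_1,
    "tlsv1.2": ssl.TLSVersion.TLSv1_2,
}

def parse_minimum_version(flag_value):
    """Map a flag string to a TLS version, rejecting unknown values."""
    key = flag_value.strip().lower()
    if key not in _VERSIONS:
        raise ValueError("unknown TLS version: %r" % flag_value)
    return _VERSIONS[key]

def allowed(minimum, offered):
    """A minimum admits the named version and anything newer."""
    return offered >= minimum
```

With a minimum of `tlsv1.1`, a TLS 1.2 peer is still accepted, which a single pinned version would not allow.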





[jira] [Resolved] (IMPALA-5526) Add krb5 to toolchain

2017-08-14 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5526.

Resolution: Won't Fix

We eventually decided against adding krb5 to the toolchain, and instead made 
it a system-level prerequisite.

> Add krb5 to toolchain
> -
>
> Key: IMPALA-5526
> URL: https://issues.apache.org/jira/browse/IMPALA-5526
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 2.10.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>
> KRPC adds a compile-time dependency on libkrb5's headers. To guarantee that 
> they're available in all build environments, we should add krb5 (from 
> http://web.mit.edu/kerberos/dist/index.html) to the toolchain.
> libkrb5.so should be dynamically linked by default, to avoid creating a 
> binary that has statically linked security dependencies (this is an issue for 
> us at Cloudera as a vendor, but also a general antipattern). 





[jira] [Created] (IMPALA-5800) Configure Squeasel's TLS version / ciphers

2017-08-14 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5800:
--

 Summary: Configure Squeasel's TLS version / ciphers
 Key: IMPALA-5800
 URL: https://issues.apache.org/jira/browse/IMPALA-5800
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson
Assignee: Henry Robinson


Squeasel will be getting TLS cipher suite and version configuration after this 
[pull request|https://github.com/cloudera/squeasel/pull/6] is merged.

We should import that change, then plumb through the relevant configuration 
options to the Squeasel instance from Impala.







[jira] [Resolved] (IMPALA-5666) Use manual poisoning for ASAN with new buffer pool

2017-08-14 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5666.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/a99114283b371852254fe05eb24ac0e339cf777b

> Use manual poisoning for ASAN with new buffer pool
> --
>
> Key: IMPALA-5666
> URL: https://issues.apache.org/jira/browse/IMPALA-5666
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.10.0
>Reporter: Tim Armstrong
>Assignee: Henry Robinson
>Priority: Minor
> Fix For: Impala 2.10.0
>
>
> We should use 
> https://github.com/google/sanitizers/wiki/AddressSanitizerManualPoisoning to 
> catch bugs where memory buffers are accessed after they are freed. We 
> should do this for MemPools and the BufferPool to start off with and maybe 
> for DiskIoMgr buffers and FreePool allocations.
> We can already catch this with --disable_mem_pools but it would be good to 
> have stricter checks enabled by default.





[jira] [Resolved] (IMPALA-5773) Memory limit exceeded on test_spilling.py

2017-08-14 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5773.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/f2f52a8e1ce9560329566ee71945b3901a1ef958

> Memory limit exceeded on test_spilling.py
> -
>
> Key: IMPALA-5773
> URL: https://issues.apache.org/jira/browse/IMPALA-5773
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
>Reporter: Michael Ho
>Assignee: Henry Robinson
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 2.10.0
>
>
> {noformat}
> 03:55:47 FAIL 
> query_test/test_spilling.py::TestSpilling::()::test_spilling[exec_option: 
> {'default_spillable_buffer_size': '256k'} | table_format: parquet/none]
> 03:55:47 === FAILURES 
> ===
> 03:55:47  TestSpilling.test_spilling[exec_option: 
> {'default_spillable_buffer_size': '256k'} | table_format: parquet/none] 
> 03:55:47 [gw1] linux2 -- Python 2.6.6 
> /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/../infra/python/env/bin/python
> 03:55:47 query_test/test_spilling.py:39: in test_spilling
> 03:55:47 self.run_test_case('QueryTest/spilling', vector)
> 03:55:47 common/impala_test_suite.py:390: in run_test_case
> 03:55:47 result = self.__execute_query(target_impalad_client, query, 
> user=user)
> 03:55:47 common/impala_test_suite.py:598: in __execute_query
> 03:55:47 return impalad_client.execute(query, user=user)
> 03:55:47 common/impala_connection.py:160: in execute
> 03:55:47 return self.__beeswax_client.execute(sql_stmt, user=user)
> 03:55:47 beeswax/impala_beeswax.py:173: in execute
> 03:55:47 handle = self.__execute_query(query_string.strip(), user=user)
> 03:55:47 beeswax/impala_beeswax.py:339: in __execute_query
> 03:55:47 self.wait_for_completion(handle)
> 03:55:47 beeswax/impala_beeswax.py:359: in wait_for_completion
> 03:55:47 raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> 03:55:47 E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> 03:55:47 EQuery aborted:Memory limit exceeded: Error occurred on backend 
> impala-boost-static-burst-slave-1c37.vpc.cloudera.com:22000 by fragment 
> be415e2081bde55d:ce5cb0b40001
> 03:55:47 E   Memory left in process limit: 49.35 GB
> 03:55:47 E   Memory left in query limit: -36546.00 B
> 03:55:47 E   Query(be415e2081bde55d:ce5cb0b4): memory limit exceeded. 
> Limit=800.00 MB Reservation=640.00 MB ReservationLimit=640.00 MB 
> OtherMemory=160.03 MB Total=800.03 MB Peak=800.03 MB
> 03:55:47 E Fragment be415e2081bde55d:ce5cb0b4: Reservation=0 
> OtherMemory=12.24 KB Total=12.24 KB Peak=63.50 KB
> 03:55:47 E   AGGREGATION_NODE (id=6): Total=4.00 KB Peak=4.00 KB
> 03:55:47 E Exprs: Total=4.00 KB Peak=4.00 KB
> 03:55:47 E   EXCHANGE_NODE (id=5): Total=0 Peak=0
> 03:55:47 E   DataStreamRecvr: Total=0 Peak=0
> 03:55:47 E   PLAN_ROOT_SINK: Total=0 Peak=0
> 03:55:47 E   CodeGen: Total=247.00 B Peak=51.50 KB
> 03:55:47 E Fragment be415e2081bde55d:ce5cb0b40002: Reservation=24.50 
> MB OtherMemory=157.98 MB Total=182.48 MB Peak=182.48 MB
> 03:55:47 E   AGGREGATION_NODE (id=2): Total=4.00 KB Peak=4.00 KB
> 03:55:47 E Exprs: Total=4.00 KB Peak=4.00 KB
> 03:55:47 E   AGGREGATION_NODE (id=4): Reservation=24.50 MB 
> OtherMemory=29.12 KB Total=24.53 MB Peak=25.80 MB
> 03:55:47 E Exprs: Total=21.12 KB Peak=21.12 KB
> 03:55:47 E   EXCHANGE_NODE (id=3): Total=0 Peak=0
> 03:55:47 E   DataStreamRecvr: Total=157.20 MB Peak=157.20 MB
> 03:55:47 E   DataStreamSender (dst_id=5): Total=16.00 KB Peak=16.00 KB
> 03:55:47 E   CodeGen: Total=5.09 KB Peak=373.50 KB
> 03:55:47 E Fragment be415e2081bde55d:ce5cb0b40001: Reservation=615.50 
> MB OtherMemory=2.04 MB Total=617.54 MB Peak=651.88 MB
> 03:55:47 E   AGGREGATION_NODE (id=1): Reservation=615.50 MB 
> OtherMemory=2.02 MB Total=617.52 MB Peak=618.70 MB
> 03:55:47 E Exprs: Total=2.02 MB Peak=2.02 MB
> 03:55:47 E   HDFS_SCAN_NODE (id=0): Total=0 Peak=33.32 MB
> 03:55:47 E   DataStreamSender (dst_id=3): Total=7.52 KB Peak=7.52 KB
> 03:55:47 E   CodeGen: Total=6.84 KB Peak=522.50 KB
> 03:55:47 E   
> 03:55:47 E   Memory limit exceeded: Error occurred on backend 
> impala-boost-static-burst-slave-1c37.vpc.cloudera.com:22000 by fragment 
> be415e2081bde55d:ce5cb0b40001
> 03:55:47 E   Memory left in process limit: 49.35 GB
> 03:55:47 E   Memory left in query limit: -36546.00 B
> 03:55:47 E   Query(be415e2081bde55d:ce5cb0b4): memory limit exceeded. 
> Limit=800.00 M

[jira] [Resolved] (IMPALA-5781) thrift-server-test failed

2017-08-10 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5781.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/cfcbfab4ff6df0092e68b169c46958467fc0ec14

> thrift-server-test failed
> -
>
> Key: IMPALA-5781
> URL: https://issues.apache.org/jira/browse/IMPALA-5781
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 2.10.0
>Reporter: Michael Ho
>Assignee: Henry Robinson
>Priority: Blocker
> Fix For: Impala 2.10.0
>
>
> [~henryr], can you please take a first look? This test was touched by a 
> recent 
> [commit|https://github.com/apache/incubator-impala/commit/68df21b426feca8e7a458152d8dca1b7e1335bcb]
>  of yours
> {noformat}
> 15:39:18 49/86 Test #49: thrift-server-test ...***Exception: 
> SegFault  1.15 sec
> 15:39:18 Turning perftools heap leak checking off
> 15:39:18 [==] Running 12 tests from 4 test cases.
> 15:39:18 [--] Global test environment set-up.
> 15:39:18 [--] 1 test from ThriftServer
> 15:39:18 [ RUN  ] ThriftServer.Connectivity
> 15:39:18 [   OK ] ThriftServer.Connectivity (43 ms)
> 15:39:18 [--] 1 test from ThriftServer (43 ms total)
> 15:39:18 
> 15:39:18 [--] 7 tests from SslTest
> 15:39:18 [ RUN  ] SslTest.Connectivity
> 15:39:18 [   OK ] SslTest.Connectivity (13 ms)
> 15:39:18 [ RUN  ] SslTest.BadCertificate
> 15:39:18 [   OK ] SslTest.BadCertificate (3 ms)
> 15:39:18 [ RUN  ] SslTest.ClientBeforeServer
> 15:39:18 [   OK ] SslTest.ClientBeforeServer (7 ms)
> 15:39:18 [ RUN  ] SslTest.BadCiphers
> 15:39:18 [   OK ] SslTest.BadCiphers (5 ms)
> 15:39:18 [ RUN  ] SslTest.MismatchedCiphers
> 15:39:18 
> /data/jenkins/workspace/impala-cdh5-trunk-core-data-load/repos/Impala/be/src/rpc/thrift-server-test.cc:238:
>  Failure
> 15:39:18 Value of: status_.ok()
> 15:39:18   Actual: false
> 15:39:18 Expected: true
> 15:39:18 Error: SSL socket creation failed: SSL_CTX_set_cipher_list: no 
> cipher match
> 15:39:18 
> 15:39:18 
> /data/jenkins/workspace/impala-cdh5-trunk-core-data-load/repos/Impala/be/src/rpc/thrift-server-test.cc:246:
>  Failure
> 15:39:18 Value of: status_.ok()
> 15:39:18   Actual: false
> 15:39:18 Expected: true
> 15:39:18 Error: Couldn't open transport for localhost:57370 (connect() 
> failed: Connection refused)
> 15:39:18 
> 15:39:18 [  FAILED  ] SslTest.MismatchedCiphers (12 ms)
> 15:39:18 [ RUN  ] SslTest.MatchedCiphers
> 15:39:18 
> /data/jenkins/workspace/impala-cdh5-trunk-core-data-load/repos/Impala/be/src/rpc/thrift-server-test.cc:263:
>  Failure
> 15:39:18 Value of: status_.ok()
> 15:39:18   Actual: false
> 15:39:18 Expected: true
> 15:39:18 Error: SSL socket creation failed: SSL_CTX_set_cipher_list: no 
> cipher match
> 15:39:18 
> 15:39:18 
> /data/jenkins/workspace/impala-cdh5-trunk-core-data-load/repos/Impala/be/src/rpc/thrift-server-test.cc:270:
>  Failure
> 15:39:18 Value of: status_.ok()
> 15:39:18   Actual: false
> 15:39:18 Expected: true
> 15:39:18 Error: SSL socket creation failed: SSL_CTX_set_cipher_list: no 
> cipher match
> 15:39:18 
> 15:39:18 Wrote minidump to 
> /data/jenkins/workspace/impala-cdh5-trunk-core-data-load/repos/Impala/logs/be_tests/minidumps/thrift-server-test/7d6ed95d-b688-43c7-b5af0284-7431e2f5.dmp
> 15:39:18 Wrote minidump to 
> /data/jenkins/workspace/impala-cdh5-trunk-core-data-load/repos/Impala/logs/be_tests/minidumps/thrift-server-test/7d6ed95d-b688-43c7-b5af0284-7431e2f5.dmp
> 15:39:18 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (IMPALA-5785) Purge local connection pool if node statestore marks node offline

2017-08-09 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5785.

   Resolution: Fixed
Fix Version/s: (was: Impala 2.10.0)
   Impala 2.0

This already happens for nodes that have ever run a query. See 
https://github.com/apache/incubator-impala/blob/master/be/src/service/impala-server.cc#L1573.

My understanding is that this is sufficient. If you've seen a bug that's 
attributable to this not working, please attach some more information! We don't 
have a lot of testing for this path.

> Purge local connection pool if node statestore marks node offline
> -
>
> Key: IMPALA-5785
> URL: https://issues.apache.org/jira/browse/IMPALA-5785
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.9.0
>Reporter: Lars Volker
>Priority: Critical
> Fix For: Impala 2.0
>
>
> From time to time there seem to be issues with stale connection pool entries 
> when nodes restart. In cases where the backend receives an update from the 
> statestore that a node has gone offline, we should remove connections to that 
> node from the connection pool.





[jira] [Resolved] (IMPALA-5696) Enable cipher configuration when using TLS w/Thrift

2017-08-08 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5696.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/68df21b426feca8e7a458152d8dca1b7e1335bcb

IMPALA-5696: Enable cipher configuration when using TLS / Thrift
The 'cipher suite' is a description of the set of algorithms used by SSL
and TLS to execute key exchange, encryption, message authentication, and
random number generation functions. SSL implementations allow the cipher
suite to be configured so that ciphers may be removed from the whitelist
if they are shown to be weak.

* Add a flag --ssl_cipher_list which controls cipher selection for both
  thrift servers and clients. Default is blank, which means use all
  available cipher suites.
* Add ThriftServerBuilder to simplify construction of
  ThriftServers (whose constructors were otherwise getting very long).

Testing: new tests added to thrift-server-test. Test cases added follow:

* A client cannot connect to a server which does not have any ciphers in
  common with it.
* If ciphers are identical on clients and servers, SSL connections can be
  made.
* Bad cipher strings lead to errors on both client and server.
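The third test case (bad cipher strings are rejected) can be reproduced outside Impala. The sketch below is not the code from the commit. It uses Python's standard {{ssl}} module, whose {{SSLContext.set_ciphers()}} hands the string to OpenSSL, to check whether a cipher string selects anything. {{cipher_list_is_usable}} is a hypothetical helper name.

```python
import ssl


def cipher_list_is_usable(cipher_list):
    """Return True if OpenSSL accepts the cipher string, False otherwise."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    try:
        # set_ciphers() passes the string to OpenSSL; a string that selects
        # no cipher at all is rejected with ssl.SSLError.
        ctx.set_ciphers(cipher_list)
    except ssl.SSLError:
        return False
    return True


print(cipher_list_is_usable("HIGH"))
print(cipher_list_is_usable("not-a-real-cipher"))
```

A value intended for {{--ssl_cipher_list}} could be sanity-checked this way before a cluster is started with it, assuming the checking host has the same OpenSSL build as the cluster.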


> Enable cipher configuration when using TLS w/Thrift
> ---
>
> Key: IMPALA-5696
> URL: https://issues.apache.org/jira/browse/IMPALA-5696
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, Impala 2.9.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: Impala 2.10.0
>
>
> Thrift's {{TSSLSocketFactory}} has a {{cipher()}} method that we can use to 
> configure the ciphers used by OpenSSL. We just need to connect it up to a 
> flag that the user provides. 





[jira] [Resolved] (IMPALA-5774) StringFunctions::FindInSet() may read one byte beyond a string's extent

2017-08-08 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5774.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/5caadbbedd1917019937290e9427fd6f798f0cd8

> StringFunctions::FindInSet() may read one byte beyond a string's extent
> ---
>
> Key: IMPALA-5774
> URL: https://issues.apache.org/jira/browse/IMPALA-5774
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.10.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: Impala 2.10.0
>
>
> The following may read {{str_set.ptr[str_set.len]}} if no ',' is found.
> {code}
> while(str_set.ptr[end] != ',' && end < str_set.len) ++end;
> {code}
> (This was discovered by poisoning mempool data from IMPALA-5666).





[jira] [Created] (IMPALA-5775) Impala shell only supports TLSv1

2017-08-07 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5775:
--

 Summary: Impala shell only supports TLSv1
 Key: IMPALA-5775
 URL: https://issues.apache.org/jira/browse/IMPALA-5775
 Project: IMPALA
  Issue Type: Bug
  Components: Clients
Reporter: Henry Robinson
Assignee: Henry Robinson


Impala shell's SSL client is configured to connect using TLSv1 only (see 
https://docs.python.org/2/library/ssl.html for the protocol constants). That 
means that after IMPALA-5743, it won't be able to connect to a TLSv1_2 server.

We should change the client protocol to {{SSLv23}} (I think this is acceptable 
for a client - the server won't negotiate an SSL connection), which can connect 
to all flavours of TLS.
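As a rough illustration of the proposed change (not the actual impala-shell patch), the sketch below builds a client-side context that negotiates the protocol version with the server instead of pinning TLSv1. It assumes Python 3, where the closest counterpart of Python 2's negotiating {{PROTOCOL_SSLv23}} for clients is {{ssl.PROTOCOL_TLS_CLIENT}}. {{make_client_context}} and its {{ca_file}} parameter are hypothetical names.

```python
import ssl


def make_client_context(ca_file=None):
    """Return a client context that negotiates the TLS version with the server."""
    # PROTOCOL_TLS_CLIENT negotiates the highest version both peers support,
    # rather than pinning one version as PROTOCOL_TLSv1 does. It also turns
    # on certificate and hostname verification by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_file is not None:
        # Trust the CA bundle supplied by the caller (hypothetical path).
        ctx.load_verify_locations(cafile=ca_file)
    return ctx


ctx = make_client_context()
```

A socket wrapped with this context can then reach a server that only offers TLSv1.2, which a client pinned to TLSv1 cannot.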





[jira] [Created] (IMPALA-5774) StringFunctions::FindInSet() may read one byte beyond a string's extent

2017-08-07 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5774:
--

 Summary: StringFunctions::FindInSet() may read one byte beyond a 
string's extent
 Key: IMPALA-5774
 URL: https://issues.apache.org/jira/browse/IMPALA-5774
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson
Assignee: Henry Robinson


The following may read {{str_set.ptr[str_set.len]}} if no ',' is found.

{code}
while(str_set.ptr[end] != ',' && end < str_set.len) ++end;
{code}

(This was discovered by poisoning mempool data from IMPALA-5666).
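The problem is the order of the two conditions: the element is read before the bound is checked. The sketch below is a Python model of the C++ loop, not the committed fix. A {{bytes}} object stands in for {{str_set.ptr}}, so the out-of-bounds read shows up as an {{IndexError}} rather than a silent read. {{scan_unsafe}} and {{scan_safe}} are hypothetical names.

```python
COMMA = ord(",")


def scan_unsafe(buf, length, start=0):
    """Model of the reported loop: reads buf[end] before checking the bound."""
    end = start
    while buf[end] != COMMA and end < length:
        end += 1
    return end


def scan_safe(buf, length, start=0):
    """Same loop with the bound checked first; short-circuiting skips the read."""
    end = start
    while end < length and buf[end] != COMMA:
        end += 1
    return end


print(scan_safe(b"abc", 3))    # no comma: stops at the end of the buffer
try:
    scan_unsafe(b"abc", 3)     # no comma: reads one element past the end
except IndexError as e:
    print("out-of-bounds read:", e)
```

Swapping the operands of {{&&}} in the C++ loop would have the same effect, because {{&&}} short-circuits in the same way.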





[jira] [Resolved] (IMPALA-5742) Memory leak in parquet-reader

2017-08-04 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5742.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/b55ec3f64f2a16259d4c5cd2e881701fee4c603f

> Memory leak in parquet-reader
> -
>
> Key: IMPALA-5742
> URL: https://issues.apache.org/jira/browse/IMPALA-5742
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Jim Apple
>Assignee: Henry Robinson
>Priority: Minor
>  Labels: newbie
> Fix For: Impala 2.10.0
>
>
> Line 209 of parquet-reader {{malloc}}s memory it never frees, breaking ASAN 
> tests on https://jenkins.impala.io:
> {noformat}
>  TestHdfsParquetTableWriter.test_def_level_encoding[exec_option: 
> {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] 
> [gw0] linux2 -- Python 2.7.6 
> /home/ubuntu/Impala/bin/../infra/python/env/bin/python
> query_test/test_insert_parquet.py:228: in test_def_level_encoding
> os.path.join(tmp_dir, str(f))])
> /usr/lib/python2.7/subprocess.py:540: in check_call
> raise CalledProcessError(retcode, cmd)
> E   CalledProcessError: Command 
> '['/home/ubuntu/Impala/be/build/debug/util/parquet-reader', '--file', 
> '/tmp/tmpbnxrl3/8948dc471cad29c8-45c9c8180003_942829264_data.0.parq']' 
> returned non-zero exit status 1
> {noformat}
> {noformat}
> ERROR: LeakSanitizer: detected memory leaks
> Direct leak of 43833 byte(s) in 1 object(s) allocated from:
> #0 0x1065588 in __interceptor_malloc 
> /data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-14-04/toolchain/source/llvm/llvm-3.8.0.src-p1/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:52
> #1 0x109b42c in main 
> /home/ubuntu/Impala/be/src/util/parquet-reader.cc:209:48
> #2 0x7f08e0557f44 in __libc_start_main 
> (/lib/x86_64-linux-gnu/libc.so.6+0x21f44)
> SUMMARY: AddressSanitizer: 43833 byte(s) leaked in 1 allocation(s).
> -- executing against localhost:21000
> drop table test_def_level_encoding_54e4df6c.test_hdfs_parquet_table_writer;
> {noformat}





[jira] [Created] (IMPALA-5758) Enable LSAN for tests

2017-08-02 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5758:
--

 Summary: Enable LSAN for tests
 Key: IMPALA-5758
 URL: https://issues.apache.org/jira/browse/IMPALA-5758
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson
Assignee: Henry Robinson


[LSAN|https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer] 
support would be good to catch any leaks. It works well and quickly, but has a 
number of false or inactionable positives, mostly in the JVM. We can suppress 
those via a configuration file, and enable LSAN during our ASAN runs.
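A minimal sketch of what that setup might look like, assuming the standard sanitizer conventions: a suppressions file with one {{leak:<pattern>}} line per entry, {{LSAN_OPTIONS=suppressions=<file>}} to point at it, and {{ASAN_OPTIONS=detect_leaks=1}} to turn leak detection on. The helper names and the example pattern are illustrative only and are not taken from Impala's build scripts.

```python
import os


def write_suppressions(path, patterns):
    """Write one 'leak:<pattern>' line per suppression entry."""
    # LSAN ignores a leak when a frame in its allocation stack matches the
    # pattern (function name, source file name, or module name).
    with open(path, "w") as f:
        for pattern in patterns:
            f.write("leak:%s\n" % pattern)


def sanitizer_env(suppressions_path, base_env=None):
    """Return a copy of the environment with leak checking enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Note: this overwrites any ASAN_OPTIONS / LSAN_OPTIONS already set.
    env["ASAN_OPTIONS"] = "detect_leaks=1"
    env["LSAN_OPTIONS"] = "suppressions=%s" % suppressions_path
    return env
```

For example, {{write_suppressions("lsan.supp", ["libjvm.so"])}} followed by running a test binary with {{env=sanitizer_env("lsan.supp")}} would hide leaks whose stacks pass through the JVM library while still reporting everything else.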





[jira] [Created] (IMPALA-5743) Allow for configuration of TLS / SSL versions

2017-07-31 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5743:
--

 Summary: Allow for configuration of TLS / SSL versions
 Key: IMPALA-5743
 URL: https://issues.apache.org/jira/browse/IMPALA-5743
 Project: IMPALA
  Issue Type: Improvement
  Components: Security
Affects Versions: Impala 2.8.0, Impala 2.7.0, Impala 2.9.0, Impala 2.10.0
Reporter: Henry Robinson
Assignee: Henry Robinson


It would be good for users to be able, via the command line, to specify 
acceptable TLS protocols.

Users will typically want to specify a minimum protocol version (i.e. TLS1.0, 
1.1 or 1.2), rather than a specific protocol version. Kudu has 
{{--rpc_tls_minimum_version}}, and we can follow their lead. 
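To make the "minimum version" semantics concrete, here is a small sketch using Python's {{ssl.TLSVersion}} enum. The accepted flag spellings ({{tlsv1}}, {{tlsv1.1}}, {{tlsv1.2}}) and the function names are assumptions for illustration; the ticket does not fix a syntax.

```python
import ssl

# Hypothetical flag values mapped to protocol versions.
_VERSIONS = {
    "tlsv1": ssl.TLSVersion.TLSv1,
    "tlsv1.1": ssl.TLSVersion.TLSv1_1,
    "tlsv1.2": ssl.TLSVersion.TLSv1_2,
}


def parse_minimum_version(flag_value):
    """Translate a flag string into an ssl.TLSVersion, or raise ValueError."""
    try:
        return _VERSIONS[flag_value.strip().lower()]
    except KeyError:
        raise ValueError("unknown TLS version: %r" % flag_value)


def is_acceptable(negotiated, minimum):
    """A connection is acceptable if it uses the minimum version or newer."""
    # TLSVersion is an IntEnum, so versions compare in protocol order.
    return negotiated >= minimum
```

With a minimum of {{tlsv1.1}}, both TLS 1.1 and TLS 1.2 connections pass the check and TLS 1.0 does not, which is the behaviour a single "exact version" flag cannot express.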





[jira] [Resolved] (IMPALA-5716) Switching to / from distcc can delete cmake_modules/*

2017-07-28 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5716.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/41e3055f925093f971a3a800ae9601728ff9e37c

> Switching to / from distcc can delete cmake_modules/*
> -
>
> Key: IMPALA-5716
> URL: https://issues.apache.org/jira/browse/IMPALA-5716
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 2.10.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>Priority: Minor
> Fix For: Impala 2.10.0
>
>
> If {{$IMPALA_HOME}} ends with a /, the {{clean_cmake_files}} function in 
> {{distcc_env.sh}} will emit a {{find}} command with a double // at the end 
> for the {{cmake_modules}} directory, and since it contains the substring 
> {{cmake}}, {{find}} will match and delete its contents.
> Fix is to strip trailing /s from IMPALA_HOME in that function.





[jira] [Created] (IMPALA-5729) GVO sometimes failing due to apparent kudu tablet server crash

2017-07-26 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5729:
--

 Summary: GVO sometimes failing due to apparent kudu tablet server 
crash
 Key: IMPALA-5729
 URL: https://issues.apache.org/jira/browse/IMPALA-5729
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson
Priority: Critical


See e.g. https://jenkins.impala.io/job/gerrit-verify-dryrun/937/consoleFull. 

{code}
00:44:28 ] E   HiveServer2Error: AnalysisException: Error opening Kudu table 
'impala::tpch_kudu.lineitem', Kudu error: can not complete before timeout: 
KuduRpc(method=GetTableSchema, tablet=null, attempt=94, 
DeadlineTracker(timeout=18, elapsed=179403), Traces: [0ms] querying master, 
[0ms] Sub rpc: ConnectToMaster sending RPC to server master-127.0.0.1:7051, 
[0ms] Sub rpc: ConnectToMaster received from server master-127.0.0.1:7051 
response Network error: [peer master-127.0.0.1:7051] connection closed, [1ms] 
delaying RPC due to Service unavailable: Master config (127.0.0.1:7051) has no 
leader. Exceptions received: org.apache.kudu.client.RecoverableException: [peer 
master-127.0.0.1:7051] connection closed, [22ms] querying master, [22ms] Sub 
rpc: ConnectToMaster sending RPC to server master-127.0.0.1:7051, [22ms] Sub 
rpc: ConnectToMaster received from server master-127.0.0.1:7051 response 
Network error: [peer master-127.0.0.1:7051] connection closed, [23ms] delaying 
RPC due to Service unavailable: Master config (127.0.0.1:7051) has no leader. 
Exceptions received: org.apache.kudu.client.RecoverableException: [peer 
master-127.0.0.1:7051] connection closed, [42ms] querying master, [42ms] Sub 
rpc: ConnectToMaster sending RPC to server master-127.0.0.1:7051, [42ms] Sub 
rpc: ConnectToMaster received from server master-127.0.0.1:7051 response 
Network error: [peer master-127.0.0.1:7051] connection closed, [43ms] delaying 
RPC due to Service unavailable: Master config (127.0.0.1:7051) has no leader. 
Exceptions received: org.apache.kudu.client.RecoverableException: [peer 
master-127.0.0.1:7051] connection closed, [62ms] querying master, [63ms] Sub 
rpc: ConnectToMaster sending RPC to server master-127.0.0.1:7051, [63ms] Sub 
rpc: ConnectToMaster received from server master-127.0.0.1:7051 response 
Network error: [peer master-127.0.0.1:7051] connection closed, [63ms] delaying 
RPC due to Service unavailable: Master config (127.0.0.1:7051) has no leader. 
Exceptions received: org.apache.kudu.client.RecoverableException: [peer 
master-127.0.0.1:7051] connection closed, [82ms] querying master, [82ms] Sub 
rpc: ConnectToMaster sending RPC to server master-127.0.0.1:7051, [82ms] Sub 
rpc: ConnectToMaster received from server master-127.0.0.1:7051 response 
Network error: [peer master-127.0.0.1:7051] connection closed, [83ms] delaying 
RPC due to Service unavailable: Master config (127.0.0.1:7051) has no leader. 
Exceptions received: org.apache.kudu.client.RecoverableException: [peer 
master-127.0.0.1:7051] connection closed, [102ms] querying master, [102ms] Sub 
rpc: ConnectToMaster sending RPC to server master-127.0.0.1:7051, [103ms] Sub 
rpc: ConnectToMaster received from server master-127.0.0.1:7051 response 
Network error: [peer master-127.0.0.1:7051] connection closed, [103ms] delaying 
RPC due to Service unavailable: Master config (127.0.0.1:7051) has no leader. 
Exceptions received: org.apache.kudu.client.RecoverableException: [peer 
master-127.0.0.1:7051] connection closed, [162ms] querying master, [162ms] Sub 
rpc: ConnectToMaster sending RPC to server master-127.0.0.1:7051, [162ms] Sub 
rpc: ConnectToMaster received from server master-127.0.0.1:7051 response 
Network error: [peer master-127.0.0.1:7051] connection closed, [163ms] delaying 
RPC due to Service unavailable: Master config (127.0.0.1:7051) has no leader. 
Exceptions received: org.apache.kudu.client.RecoverableException: [peer 
master-127.0.0.1:7051] connection closed, [242ms] querying master, [242ms] Sub 
rpc: ConnectToMaster sending RPC to server master-127.0.0.1:7051, [242ms] Sub 
rpc: ConnectToMaster received from server master-127.0.0.1:7051 response 
Network error: [peer master-127.0.0.1:7051] connection closed, [243ms] delaying 
RPC due to Service unavailable: Master config (127.0.0.1:7051) has no leader. 
Exceptions received: org.apache.kudu.client.RecoverableException: [peer 
master-127.0.0.1:7051] connection closed, [362ms] querying master, [362ms] Sub 
rpc: ConnectToMaster sending RPC to server master-127.0.0.1:7051, [362ms] Sub 
rpc: ConnectToMaster received from server master-127.0.0.1:7051 response 
Network error: [peer master-127.0.0.1:7051] connection closed, [363ms] delaying 
RPC due to Service unavailable: Master config (127.0.0.1:7051) has no leader. 
Exceptions received: org.apache.kudu.client.RecoverableException: [peer 
master-127.0.0.1:7051] connection closed, [763ms] q

[jira] [Created] (IMPALA-5719) ODR violation in UDF tests

2017-07-25 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5719:
--

 Summary: ODR violation in UDF tests
 Key: IMPALA-5719
 URL: https://issues.apache.org/jira/browse/IMPALA-5719
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson
Priority: Minor


At first glance this looks to me like a test linking error (because it includes 
both {{ImpalaUdf}} and {{Udf}} libraries).

{code}
03:59:36 ==42374==ERROR: AddressSanitizer: odr-violation (0x2acb91844a00):
03:59:36   [1] size=4 'impala::FunctionContextImpl::VARARGS_BUFFER_ALIGNMENT' 
/home/ubuntu/Impala/be/src/udf/udf.cc:121:32
03:59:36   [2] size=4 'impala::FunctionContextImpl::VARARGS_BUFFER_ALIGNMENT' 
/home/ubuntu/Impala/be/src/udf/udf.cc:121:32
03:59:36 These globals were registered at these points:
03:59:36   [1]:
03:59:36 #0 0x7d1f26 in __asan_register_globals 
/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-14-04/toolchain/source/llvm/llvm-3.8.0.src-p1/projects/compiler-rt/lib/asan/asan_globals.cc:218
03:59:36 #1 0x2acb9183a36b in asan.module_ctor 
(/home/ubuntu/Impala/be/build/debug/udf/libImpalaUdf.so+0x1936b)
03:59:36 
03:59:36   [2]:
03:59:36 #0 0x7d1f26 in __asan_register_globals 
/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-14-04/toolchain/source/llvm/llvm-3.8.0.src-p1/projects/compiler-rt/lib/asan/asan_globals.cc:218
03:59:36 #1 0x2acb95260d2b in asan.module_ctor 
(/home/ubuntu/Impala/be/build/debug/udf/libUdf.so+0x1fd2b)
03:59:36 
03:59:36 ==42374==HINT: if you don't care about these errors you may set 
ASAN_OPTIONS=detect_odr_violation=0
03:59:36 SUMMARY: AddressSanitizer: odr-violation: global 
'impala::FunctionContextImpl::VARARGS_BUFFER_ALIGNMENT' at 
/home/ubuntu/Impala/be/src/udf/udf.cc:121:32
03:59:36 ==42374==ABORTING
{code}





[jira] [Resolved] (IMPALA-5709) Remove mini-impala-cluster

2017-07-25 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5709.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/d2d7328dd3aa1051bcb5329c5bea8bdc1850d281

> Remove mini-impala-cluster
> --
>
> Key: IMPALA-5709
> URL: https://issues.apache.org/jira/browse/IMPALA-5709
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 2.10.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>Priority: Minor
> Fix For: Impala 2.10.0
>
>
> As far as I know, {{mini-impala-cluster}} isn't used by any tests, nor does 
> any developer I know use it. Let's remove it - better to run real Impala 
> processes locally anyhow.





[jira] [Created] (IMPALA-5716) Switching to / from distcc can delete cmake_modules/*

2017-07-24 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5716:
--

 Summary: Switching to / from distcc can delete cmake_modules/*
 Key: IMPALA-5716
 URL: https://issues.apache.org/jira/browse/IMPALA-5716
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson
Assignee: Henry Robinson
Priority: Minor


If {{$IMPALA_HOME}} ends with a /, the {{clean_cmake_files}} function in 
{{distcc_env.sh}} will emit a {{find}} command with a double // at the end for 
the {{cmake_modules}} directory, and since it contains the substring {{cmake}}, 
{{find}} will match and delete its contents.

Fix is to strip trailing /s from IMPALA_HOME in that function.
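The failure mode can be modelled without touching the file system. The sketch below is not the contents of {{distcc_env.sh}}. It assumes the script protects {{cmake_modules}} with a path pattern built from {{$IMPALA_HOME}}, and shows how a trailing slash stops that pattern from matching. The function names and the example file are hypothetical.

```python
import fnmatch


def is_protected(path, impala_home):
    """True if `path` matches the (assumed) cmake_modules exclusion pattern."""
    pattern = impala_home + "/cmake_modules*"
    return fnmatch.fnmatchcase(path, pattern)


def normalize_home(impala_home):
    """Strip trailing slashes, keeping a bare '/' intact."""
    return impala_home.rstrip("/") or "/"


# Illustrative file under cmake_modules.
module = "/home/user/Impala/cmake_modules/FindExample.cmake"

print(is_protected(module, "/home/user/Impala/"))                  # pattern has '//'
print(is_protected(module, normalize_home("/home/user/Impala/")))  # pattern is clean
```

With the trailing slash the pattern contains {{//cmake_modules}}, which never matches the real path, so the file is no longer excluded from deletion.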





[jira] [Resolved] (IMPALA-4905) Fragments always report insert status, even if not insert query

2017-07-24 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-4905.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/d25db64f0e17092af9ef60eb37ec9214900c2d1c

> Fragments always report insert status, even if not insert query
> ---
>
> Key: IMPALA-4905
> URL: https://issues.apache.org/jira/browse/IMPALA-4905
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Affects Versions: Impala 2.7.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: Impala 2.10.0
>
>
> {code}
> if (done) {
>   TInsertExecStatus insert_status;
>   if (runtime_state->hdfs_files_to_move()->size() > 0) {
> 
> insert_status.__set_files_to_move(*runtime_state->hdfs_files_to_move());
>   }
>   if (runtime_state->per_partition_status()->size() > 0) {
> 
> insert_status.__set_per_partition_status(*runtime_state->per_partition_status());
>   }
>   params.__set_insert_exec_status(insert_status);
> }
> {code}
> This means that any fragment will always set {{insert_exec_status}} in its 
> response, even if it's not an INSERT query.
> However, in the RPC handler, {{Coordinator::UpdateFragmentExecStatus()}}, we 
> have:
> {code}
> if (params.done && params.__isset.insert_exec_status) {
> lock_guard l(lock_);
> // Merge in table update data (partitions written to, files to be moved 
> as part of
> // finalization)
> for (const PartitionStatusMap::value_type& partition:
>  params.insert_exec_status.per_partition_status) {
> // etc
> {code}
> which means that the RPC will always try and take the query exec state lock, 
> for every 'done' report. With lots of fragment instances, this can lead to 
> some severe serialisation of reports when the query finishes.
> The simplest workaround is not to set {{insert_exec_status}} for {{SELECT}} 
> queries. But a better solution (that will help INSERTs as well) is not to try 
> and do the merge here, but instead in 
> {{Coordinator::FinalizeSuccessfulInsert()}}, saving the {{TInsertExecStatus}} 
> in the fragment instance state until that point.
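The simpler workaround can be modelled as follows. This is a Python toy model, not Impala's C++: reports are plain dicts, merging is reduced to a dict update, and all names ({{build_report}}, {{Coordinator}}, {{lock_acquisitions}}) are hypothetical. The point is only that a report carrying no insert status never touches the lock.

```python
import threading


def build_report(done, per_partition_status):
    """Only attach insert status when there is something to report."""
    report = {"done": done}
    if done and per_partition_status:
        report["insert_exec_status"] = dict(per_partition_status)
    return report


class Coordinator:
    def __init__(self):
        self.lock = threading.Lock()
        self.lock_acquisitions = 0  # instrumentation for this sketch only
        self.partitions = {}

    def update_fragment_exec_status(self, report):
        insert_status = report.get("insert_exec_status")
        if report["done"] and insert_status is not None:
            with self.lock:
                self.lock_acquisitions += 1
                # Simplified merge: the real handler combines per-partition
                # statuses rather than overwriting them.
                self.partitions.update(insert_status)


coord = Coordinator()
for _ in range(3):
    coord.update_fragment_exec_status(build_report(True, {}))   # SELECT-like
coord.update_fragment_exec_status(build_report(True, {"p=1": 10}))  # INSERT-like
print(coord.lock_acquisitions)
```

The better solution from the ticket goes further and defers the merge until finalization, so that INSERT reports avoid the contended lock as well.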





[jira] [Created] (IMPALA-5709) Remove mini-impala-cluster

2017-07-24 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5709:
--

 Summary: Remove mini-impala-cluster
 Key: IMPALA-5709
 URL: https://issues.apache.org/jira/browse/IMPALA-5709
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson
Priority: Minor


As far as I know, {{mini-impala-cluster}} isn't used by any tests, nor does any 
developer I know use it. Let's remove it - better to run real Impala processes 
locally anyhow.





[jira] [Resolved] (IMPALA-5703) TestAdmissionControllerStress::test_admission_controller_with_flags fails intermittently in GVO

2017-07-24 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5703.

Resolution: Duplicate

Dupe of IMPALA-5702

> TestAdmissionControllerStress::test_admission_controller_with_flags fails 
> intermittently in GVO
> ---
>
> Key: IMPALA-5703
> URL: https://issues.apache.org/jira/browse/IMPALA-5703
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.10.0
>Reporter: Henry Robinson
>
> For example: 
> https://jenkins.impala.io/view/Gerrit/job/gerrit-verify-dryrun/922/console
> {{custom_cluster/test_admission_controller.py::TestAdmissionControllerStress::test_admission_controller_with_flags[num_queries:
>  30 | submission_delay_ms: 0 | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> text/none | round_robin_submission: True] FAILED}}
> {code}
> 09:19:30 ] Thread-3: ImpalaBeeswaxException:
> 09:19:30 ]  INNER EXCEPTION: 
> 09:19:30 ]  MESSAGE: std::bad_cast
> 09:19:30 ] Traceback (most recent call last):
> 09:19:30 ]   File 
> "/home/ubuntu/Impala/tests/custom_cluster/test_admission_controller.py", line 
> 592, in run
> 09:19:30 ] raise e
> 09:19:30 ] ImpalaBeeswaxException: ImpalaBeeswaxException:
> 09:19:30 ]  INNER EXCEPTION: 
> 09:19:30 ]  MESSAGE: std::bad_cast
> {code}





[jira] [Created] (IMPALA-5703) TestAdmissionControllerStress::test_admission_controller_with_flags fails intermittently in GVO

2017-07-24 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5703:
--

 Summary: 
TestAdmissionControllerStress::test_admission_controller_with_flags fails 
intermittently in GVO
 Key: IMPALA-5703
 URL: https://issues.apache.org/jira/browse/IMPALA-5703
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson


For example: 
https://jenkins.impala.io/view/Gerrit/job/gerrit-verify-dryrun/922/console

{{custom_cluster/test_admission_controller.py::TestAdmissionControllerStress::test_admission_controller_with_flags[num_queries:
 30 | submission_delay_ms: 0 | exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
text/none | round_robin_submission: True] FAILED}}

{code}
09:19:30 ] Thread-3: ImpalaBeeswaxException:
09:19:30 ]  INNER EXCEPTION: 
09:19:30 ]  MESSAGE: std::bad_cast
09:19:30 ] Traceback (most recent call last):
09:19:30 ]   File 
"/home/ubuntu/Impala/tests/custom_cluster/test_admission_controller.py", line 
592, in run
09:19:30 ] raise e
09:19:30 ] ImpalaBeeswaxException: ImpalaBeeswaxException:
09:19:30 ]  INNER EXCEPTION: 
09:19:30 ]  MESSAGE: std::bad_cast
{code}







[jira] [Created] (IMPALA-5702) TestAdmissionControllerStress::test_admission_controller_with_flags fails intermittently in GVO

2017-07-24 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5702:
--

 Summary: 
TestAdmissionControllerStress::test_admission_controller_with_flags fails 
intermittently in GVO
 Key: IMPALA-5702
 URL: https://issues.apache.org/jira/browse/IMPALA-5702
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson


For example: 
https://jenkins.impala.io/view/Gerrit/job/gerrit-verify-dryrun/922/console

{{custom_cluster/test_admission_controller.py::TestAdmissionControllerStress::test_admission_controller_with_flags[num_queries:
 30 | submission_delay_ms: 0 | exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
text/none | round_robin_submission: True] FAILED}}

{code}
09:19:30 ] Thread-3: ImpalaBeeswaxException:
09:19:30 ]  INNER EXCEPTION: 
09:19:30 ]  MESSAGE: std::bad_cast
09:19:30 ] Traceback (most recent call last):
09:19:30 ]   File 
"/home/ubuntu/Impala/tests/custom_cluster/test_admission_controller.py", line 
592, in run
09:19:30 ] raise e
09:19:30 ] ImpalaBeeswaxException: ImpalaBeeswaxException:
09:19:30 ]  INNER EXCEPTION: 
09:19:30 ]  MESSAGE: std::bad_cast
{code}







[jira] [Resolved] (IMPALA-5532) Don't heap-allocate compressor objects in RowBatch

2017-07-24 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5532.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/f3d8ccdf0f19b0b4077df517cf604a863c55bb37

> Don't heap-allocate compressor objects in RowBatch
> --
>
> Key: IMPALA-5532
> URL: https://issues.apache.org/jira/browse/IMPALA-5532
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.10.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: Impala 2.10.0
>
>
> Every call to {{RowBatch::RowBatch(..., const TRowBatch&, ...)}} or 
> {{RowBatch::Serialize()}} creates a (de)compressor object. That uses the 
> {{Codec::CreateCompressor()}} interface which returns a pointer to a 
> {{Codec}} object, so that the virtual compression interface can be used.
> However, we always use LZ4 compression, and so needlessly heap-allocate the 
> compressor objects to get the advantage of implementation hiding which we 
> don't actually need. We should just declare a stack-allocated LZ4 
> (de)compressor when needed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (IMPALA-5688) Speed up a couple of heavy-hitting expr-tests

2017-07-22 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5688.

Resolution: Fixed

https://github.com/apache/incubator-impala/commit/1653419bd8b3748bbc0e3d5e7ffa1d412bc4b50f

> Speed up a couple of heavy-hitting expr-tests
> -
>
> Key: IMPALA-5688
> URL: https://issues.apache.org/jira/browse/IMPALA-5688
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 2.9.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>Priority: Minor
> Fix For: Impala 2.10.0
>
>
> Two tests ({{LongReverse}} and the base64 tests in {{StringFunctions}}) run 
> their tests over all lengths from 0..{{some length}}. Both take several 
> minutes to complete. This adds a lot of runtime for not much more confidence. 
> If instead we pick a set of 'interesting' (including powers-of-two, prime 
> numbers, edge-cases) lengths, we can get a similar amount of confidence while 
> significantly reducing the runtime of expr-test.
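One possible way to pick such a set is sketched below. It is not the selection used in the commit. It takes 0, the maximum, every power of two with its two neighbours, and the small primes. The function names and the cut-off of 50 for primes are arbitrary choices for illustration.

```python
def is_prime(n):
    """Trial-division primality test; fine for the small values used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def interesting_lengths(max_len, prime_limit=50):
    """Return a sorted list of edge-case lengths in [0, max_len]."""
    lengths = {0, max_len}
    power = 1
    while power <= max_len:
        # Values just below, at, and just above each power of two.
        for candidate in (power - 1, power, power + 1):
            if 0 <= candidate <= max_len:
                lengths.add(candidate)
        power *= 2
    lengths.update(n for n in range(2, min(max_len, prime_limit) + 1) if is_prime(n))
    return sorted(lengths)


print(interesting_lengths(20))
# [0, 1, 2, 3, 4, 5, 7, 8, 9, 11, 13, 15, 16, 17, 19, 20]
```

The number of lengths grows roughly with the logarithm of the maximum rather than linearly, which is where the runtime saving comes from.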





[jira] [Resolved] (IMPALA-3937) Deprecate --be_service_threads

2017-07-22 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-3937.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/ed7324431d16a37a279d730a036197fc9019c3ce

> Deprecate --be_service_threads
> --
>
> Key: IMPALA-3937
> URL: https://issues.apache.org/jira/browse/IMPALA-3937
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.6.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>Priority: Minor
>  Labels: newbie
> Fix For: Impala 2.10.0
>
>
> {{be_service_threads}} hasn't done anything in probably 4+ years. We should 
> deprecate it (in the flags text) and stop referring to it in the code (it's 
> passed as a constructor parameter to {{ThriftServer}}, but it doesn't have 
> any effect there).





[jira] [Resolved] (IMPALA-3655) Upgrade Thrift dependency to 0.9.2 or 0.9.3

2017-07-21 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-3655.

Resolution: Duplicate

Just discovered this, which is a dupe of the ticket I filed yesterday 
(IMPALA-5690) - I put all my notes there, so closing this one.

> Upgrade Thrift dependency to 0.9.2 or 0.9.3
> ---
>
> Key: IMPALA-3655
> URL: https://issues.apache.org/jira/browse/IMPALA-3655
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 2.6.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>
> We should upgrade Thrift to pull in some needed bugfixes and improvements.





[jira] [Created] (IMPALA-5696) Enable cipher configuration when using TLS w/Thrift

2017-07-21 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5696:
--

 Summary: Enable cipher configuration when using TLS w/Thrift
 Key: IMPALA-5696
 URL: https://issues.apache.org/jira/browse/IMPALA-5696
 Project: IMPALA
  Issue Type: Improvement
  Components: Distributed Exec
Affects Versions: Impala 2.8.0, Impala 2.6.0, Impala 2.7.0, Impala 2.9.0
Reporter: Henry Robinson


Thrift's {{TSSLSocketFactory}} has a {{ciphers()}} method that we can use to 
configure the ciphers used by OpenSSL. We just need to connect it up to a flag 
that the user provides.





[jira] [Created] (IMPALA-5690) Upgrade Thrift version to 0.9.3

2017-07-20 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5690:
--

 Summary: Upgrade Thrift version to 0.9.3
 Key: IMPALA-5690
 URL: https://issues.apache.org/jira/browse/IMPALA-5690
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Affects Versions: Impala 2.9.0
Reporter: Henry Robinson


There are several good reasons to move from Thrift 0.9.0 to 0.9.3, including 
harmonization with other projects that we link against in one form or another.

I have started to investigate upgrading, and it's not trivial. Here are the 
things I've run into:

1. 0.9.3 defines operator<< for all Thrift structures, conflicting with some of 
our bespoke implementations.
2. {{TAcceptQueueServer}} is written against an old server interface, and needs 
to be updated. 
3. To build on all the platforms that I care about, a modern Bison install is 
necessary (this is an issue for native-toolchain, not Apache Impala).







[jira] [Created] (IMPALA-5688) Speed up a couple of heavy-hitting expr-tests

2017-07-20 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5688:
--

 Summary: Speed up a couple of heavy-hitting expr-tests
 Key: IMPALA-5688
 URL: https://issues.apache.org/jira/browse/IMPALA-5688
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Affects Versions: Impala 2.9.0
Reporter: Henry Robinson
Assignee: Henry Robinson
Priority: Minor
 Fix For: Impala 2.10.0


Two tests ({{LongReverse}} and the base64 tests in {{StringFunctions}}) run 
their tests over all lengths from 0..{{some length}}. Both take several minutes 
to complete. This adds a lot of runtime for not much more confidence. If 
instead we pick a set of 'interesting' (including powers-of-two, prime numbers, 
edge-cases) lengths, we can get a similar amount of confidence while 
significantly reducing the runtime of expr-test.
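
One way to pick such lengths is sketched below. This is only an illustration under assumptions: {{InterestingLengths}} is a hypothetical helper, and the prime list and the "power of two plus or minus one" neighbours are choices made for this example, not what expr-test ended up using.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical helper: returns a sorted, de-duplicated list of lengths
// in [0, max_len] that are likely to expose boundary bugs.
std::vector<int> InterestingLengths(int max_len) {
  assert(max_len >= 0);
  std::set<int> lengths = {0, max_len};  // edge cases
  for (long long p = 1; p <= max_len; p *= 2) {
    // Powers of two and their immediate neighbours.
    lengths.insert(static_cast<int>(p));
    lengths.insert(static_cast<int>(p - 1));
    if (p + 1 <= max_len) lengths.insert(static_cast<int>(p + 1));
  }
  // A handful of primes; the selection is arbitrary.
  for (int prime : {2, 3, 5, 7, 11, 13, 31, 61, 127, 251, 509, 1021}) {
    if (prime <= max_len) lengths.insert(prime);
  }
  return std::vector<int>(lengths.begin(), lengths.end());
}
```

For a maximum length of 1024 this yields a few dozen lengths instead of 1025, which is where the runtime saving would come from.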





[jira] [Resolved] (IMPALA-4925) Coordinator does not cancel fragments if query completes w/limit

2017-07-19 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-4925.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/5bb48ed71dc8272fdabac45a33b515cdd0d5f12d

> Coordinator does not cancel fragments if query completes w/limit
> 
>
> Key: IMPALA-4925
> URL: https://issues.apache.org/jira/browse/IMPALA-4925
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Affects Versions: Impala 2.8.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: Impala 2.10.0
>
>
> If a plan has a limit, the coordinator will eventually set 
> {{Coordinator::returned_all_results_}} once the limit has been hit. At this 
> point, it should start to cancel fragment instances that are still running. 
> This usually happens either through an explicit cancel RPC, or by returning a 
> non-OK status to the heartbeat {{ReportExecStatus()}} RPC. In the limit case, 
> neither happens - the query status is not set to {{!ok()}} (because the query 
> succeeded!), so there's no 'bad' status to propagate to the fragment instance.
> In many cases this doesn't matter because the cancellation propagates from 
> the top down: the root instance will get closed and go away, and then any 
> senders to that instance will notice and cancel themselves, and so on. But 
> there are plan shapes that mean a lot of CPU time is wasted after the query 
> should have finished, e.g.:
> {code}
> with l as (select 1 from functional.alltypes group by month), r as
>   (select count(*) from lineitem a CROSS JOIN lineitem b)
>   SELECT * from l UNION ALL (select * from r) LIMIT 2{code}
> This convoluted query illustrates the idea: table {{l}} is the left union 
> child, and gets evaluated first. It produces more than two rows, so the limit 
> gets hit. The right child, in the meantime, is evaluating the cross join 
> before the aggregation, which is very cpu heavy. When the limit is hit, the 
> query hangs (from the client's perspective), waiting for the right child to 
> produce no results.
> The fix for this is easy: fragment instances should learn about query 
> termination from {{ReportExecStatus()}} RPCs. If {{returned_all_results_}} is 
> true, the coordinator should return a non-OK status, causing instance 
> tear-down next time the instance checks its cancellation state.
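
The proposed fix can be sketched roughly as follows. The {{Status}} and {{Coordinator}} types here are heavily simplified, hypothetical stand-ins and not Impala's real classes; only the flag name and the RPC name come from the description above.

```cpp
#include <atomic>
#include <cassert>
#include <string>

// Hypothetical, simplified status type.
struct Status {
  bool ok;
  std::string msg;
};

// Hypothetical, simplified coordinator.
class Coordinator {
 public:
  // Called once the limit is hit and the client has all of its rows.
  void MarkAllResultsReturned() { returned_all_results_ = true; }

  // Reply to an instance's periodic status report. A non-OK reply tells
  // the instance to tear itself down the next time it checks.
  Status ReportExecStatus() const {
    if (returned_all_results_) {
      return Status{false, "query finished; cancel instance"};
    }
    return Status{true, ""};
  }

 private:
  std::atomic<bool> returned_all_results_{false};
};
```

The point of the sketch is that the reply is non-OK even though the query itself succeeded, so the instance no longer needs a 'bad' query status in order to stop.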





[jira] [Resolved] (IMPALA-5670) Remove redundant c'tor code from ExecEnv

2017-07-19 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5670.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/ab287955d00939531b5bc6b9871fcb24def9d38e

> Remove redundant c'tor code from ExecEnv
> 
>
> Key: IMPALA-5670
> URL: https://issues.apache.org/jira/browse/IMPALA-5670
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.9.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>Priority: Minor
> Fix For: Impala 2.10.0
>
>
> {{ExecEnv}} has two constructors that do pretty much the same thing. We 
> should use a delegating constructor from one to the other to reduce code 
> duplication.





[jira] [Created] (IMPALA-5684) Use gtest's sharding support to parallelise long-running be-tests

2017-07-19 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5684:
--

 Summary: Use gtest's sharding support to parallelise long-running 
be-tests
 Key: IMPALA-5684
 URL: https://issues.apache.org/jira/browse/IMPALA-5684
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Affects Versions: Impala 2.9.0
Reporter: Henry Robinson
Assignee: Henry Robinson


Googletest has support for sharding test cases from a single test across 
different processes. We could use this to speed up the execution of some 
backend tests - particularly {{expr-test}}.

The runtime of each expr-test is heavily skewed, but once we have sharding we 
can make the test cases a bit more fine-grained and then automatically get 
better performance as the sharding handles the work balancing.
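
As a rough model of how the sharding splits work: googletest reads the {{GTEST_TOTAL_SHARDS}} and {{GTEST_SHARD_INDEX}} environment variables, and each test ends up on exactly one shard. The round-robin rule below is my understanding of its behaviour, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdlib>

// Reads an integer environment variable, or returns fallback if unset.
int EnvInt(const char* name, int fallback) {
  const char* value = std::getenv(name);
  return value == nullptr ? fallback : std::atoi(value);
}

// Round-robin assignment: test number test_id runs on exactly one shard.
bool ShouldRunOnShard(int total_shards, int shard_index, int test_id) {
  assert(total_shards > 0 && shard_index >= 0 && shard_index < total_shards);
  return test_id % total_shards == shard_index;
}

// Counts how many of num_tests tests the current process would run,
// based on the two environment variables (defaults: one shard, index 0).
int TestsForThisProcess(int num_tests) {
  const int total = EnvInt("GTEST_TOTAL_SHARDS", 1);
  const int index = EnvInt("GTEST_SHARD_INDEX", 0);
  int count = 0;
  for (int id = 0; id < num_tests; ++id) {
    if (ShouldRunOnShard(total, index, id)) ++count;
  }
  return count;
}
```

Because the split is by test count rather than by runtime, a few very slow test cases can still dominate one shard, which is why finer-grained test cases would help.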





[jira] [Resolved] (IMPALA-5659) glog / gflags should be dynamically linked if Impala is

2017-07-18 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5659.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

Toolchain commit:

https://github.com/cloudera/native-toolchain/commit/f32e122eaa9932f52b7c3f4c205045f3522e88dd

Impala commit:

https://github.com/apache/incubator-impala/commit/d79e01ef9fec559d4ebe57d41539f4e4164ae78f

> glog / gflags should be dynamically linked if Impala is
> ---
>
> Key: IMPALA-5659
> URL: https://issues.apache.org/jira/browse/IMPALA-5659
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 2.9.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>Priority: Minor
> Fix For: Impala 2.10.0
>
>
> The glog and gflags libraries are currently always statically linked against 
> Impala, whether or not BUILD_SHARED_LIBS is true.
> However, that can cause a problem if one of our libraries itself tries to 
> link against glog or gflags: the google library will be linked twice in the 
> final binary, and that causes problems for these particular libraries that 
> require that they are linked at most once.
> The proposed fix is to dynamically link glog and gflags if BUILD_SHARED_LIBS 
> is true. 
> This is not an issue in our current code, but making this fix future-proofs 
> us against running into the problem later (and the kudu util library has this 
> exact issue, as it tries to link against glog directly).





[jira] [Resolved] (IMPALA-5673) Track exchange node buffers memory as part of memory reservation

2017-07-17 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5673.

Resolution: Duplicate

Duplicate of IMPALA-5485 (feel free to move that to a sub-task if you need to 
for tracking).

> Track exchange node buffers memory as part of memory reservation 
> -
>
> Key: IMPALA-5673
> URL: https://issues.apache.org/jira/browse/IMPALA-5673
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Mostafa Mokhtar
>Assignee: Tim Armstrong
>
> Queries with a large number of exchange operators end up with untracked 
> memory in the form of buffer space allocated per DataStreamRecvr.
> exchg_node_buffer_size_bytes can be used to calculate how much memory will be 
> used by DataStreamRecvr. 





[jira] [Created] (IMPALA-5671) Union node may evaluate all children even if limit is reached

2017-07-17 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5671:
--

 Summary: Union node may evaluate all children even if limit is 
reached
 Key: IMPALA-5671
 URL: https://issues.apache.org/jira/browse/IMPALA-5671
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Affects Versions: Impala 2.9.0
Reporter: Henry Robinson


The loop inside {{UnionNode::GetNextMaterialized()}} does not break if the 
limit has been reached. See 
[here|https://github.com/apache/incubator-impala/blob/master/be/src/exec/union-node.cc#L193].
 The only way the loop can be broken is if either the children are exhausted, 
or the current row batch becomes full.

If you have a union node with a limit of 1, and two children - the first of 
which is very cheap to evaluate and returns one row, but the second is very 
expensive - the union node will try to fill an entire row batch with rows, and 
end up waiting on the second child, even though the node could be finished 
after reading one row from the first child.

The result is a query that takes much longer to complete than it should. Here's 
an example:

{code}
with l as (select 1 from functional.alltypes group by month), r as
  (select count(*) from lineitem a CROSS JOIN lineitem b)
  SELECT * from l UNION ALL (select * from r) LIMIT 2
{code}
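
The missing check can be modelled with a toy union over in-memory children. This is a simplified sketch and not the code in union-node.cc; {{UnionWithLimit}} and {{UnionResult}} are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: the rows produced, plus how many children
// had to be opened to produce them.
struct UnionResult {
  std::vector<int> rows;
  std::size_t children_touched;
};

UnionResult UnionWithLimit(const std::vector<std::vector<int>>& children,
                           std::size_t limit, std::size_t batch_capacity) {
  UnionResult result{{}, 0};
  for (const std::vector<int>& child : children) {
    // The extra condition: stop before opening another child once the
    // limit is met, not only when the batch is full.
    if (result.rows.size() >= limit) break;
    if (result.rows.size() >= batch_capacity) break;
    ++result.children_touched;
    for (int row : child) {
      if (result.rows.size() >= limit) break;
      if (result.rows.size() >= batch_capacity) break;
      result.rows.push_back(row);
    }
  }
  return result;
}
```

With a limit of 1 and a cheap first child that returns one row, the second (expensive) child is never opened.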





[jira] [Created] (IMPALA-5670) Remove redundant c'tor code from ExecEnv

2017-07-17 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5670:
--

 Summary: Remove redundant c'tor code from ExecEnv
 Key: IMPALA-5670
 URL: https://issues.apache.org/jira/browse/IMPALA-5670
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Affects Versions: Impala 2.9.0
Reporter: Henry Robinson
Assignee: Henry Robinson
Priority: Minor


{{ExecEnv}} has two constructors that do pretty much the same thing. We should 
use a delegating constructor from one to the other to reduce code duplication.
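
For reference, a delegating constructor looks like the sketch below. {{Env}} and its members are hypothetical and much smaller than {{ExecEnv}}; the default host and port are placeholders.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class for illustration; not ExecEnv's real members.
class Env {
 public:
  // Convenience constructor: forwards to the full constructor instead
  // of repeating the member initialisation.
  Env() : Env("localhost", 22000) {}

  Env(std::string hostname, int port)
      : hostname_(std::move(hostname)), port_(port) {}

  const std::string& hostname() const { return hostname_; }
  int port() const { return port_; }

 private:
  std::string hostname_;
  int port_;
};
```

All initialisation logic then lives in one place, so a new member only has to be added to the full constructor.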





[jira] [Created] (IMPALA-5659) glog / gflags should be dynamically linked if Impala is

2017-07-13 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5659:
--

 Summary: glog / gflags should be dynamically linked if Impala is
 Key: IMPALA-5659
 URL: https://issues.apache.org/jira/browse/IMPALA-5659
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Reporter: Henry Robinson
Assignee: Henry Robinson
Priority: Minor


The glog and gflags libraries are currently always statically linked against 
Impala, whether or not BUILD_SHARED_LIBS is true.

However, that can cause a problem if one of our libraries itself tries to link 
against glog or gflags: the google library will be linked twice in the final 
binary, and that causes problems for these particular libraries that require 
that they are linked at most once.

The proposed fix is to dynamically link glog and gflags if BUILD_SHARED_LIBS is 
true. 

This is not an issue in our current code, but making this fix future-proofs us 
against running into the problem later (and the kudu util library has this 
exact issue, as it tries to link against glog directly).





[jira] [Resolved] (IMPALA-5481) RowDescriptors should be shared, rather than copied

2017-06-20 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5481.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/317c413a00bd9b3b29eeaf2efe556c2e924e2d74

> RowDescriptors should be shared, rather than copied
> ---
>
> Key: IMPALA-5481
> URL: https://issues.apache.org/jira/browse/IMPALA-5481
> Project: IMPALA
>  Issue Type: Improvement
>Affects Versions: Impala 2.10.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: Impala 2.10.0
>
>
> One of the {{RowBatch}} c'tors copies the row descriptor into the row batch. 
> This leads to a lot of allocation churn since {{RowDescriptor}} contains some 
> vector members, and since the descriptor is usually the same the copies are 
> unnecessary.
> Instead, we should consider allocating the {{RowDescriptor}} once from an 
> object pool, and sharing it amongst all row batches that need that 
> descriptor. 
> In some tests, {{RowDescriptor()}} shows up as 20% of the tcmalloc allocation 
> time.
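
The difference between the two approaches, in miniature. These types are hypothetical stand-ins, not Impala's {{RowBatch}} and {{RowDescriptor}}; in the proposal the shared descriptor would be owned by an object pool that outlives the batches.

```cpp
#include <cassert>
#include <vector>

// Hypothetical descriptor: the vector member is what makes copies costly.
struct RowDescriptor {
  std::vector<int> tuple_ids;
};

// Copying variant: every batch owns a descriptor, so every batch
// construction allocates a fresh vector.
struct CopyingRowBatch {
  explicit CopyingRowBatch(const RowDescriptor& desc) : desc_(desc) {}
  RowDescriptor desc_;
};

// Sharing variant: every batch points at one long-lived descriptor.
// The owner of the descriptor must outlive all batches that use it.
struct SharingRowBatch {
  explicit SharingRowBatch(const RowDescriptor* desc) : desc_(desc) {}
  const RowDescriptor* desc_;
};
```

Constructing a {{SharingRowBatch}} copies one pointer and never touches the allocator.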





[jira] [Resolved] (IMPALA-1514) DataStreamSender (and possible Coordinator) has too many sender threads

2017-06-19 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-1514.

   Resolution: Duplicate
Fix Version/s: Product Backlog

Covered by IMPALA-2567

> DataStreamSender (and possible Coordinator) has too many sender threads
> ---
>
> Key: IMPALA-1514
> URL: https://issues.apache.org/jira/browse/IMPALA-1514
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 2.0
>Reporter: Alan Choi
>Assignee: Henry Robinson
>Priority: Minor
>  Labels: performance
> Fix For: Product Backlog
>
>
> DataStreamSender creates one thread per EXCHANGE destination per query. On a 
> large cluster with a highly concurrent workload, this will create too many 
> threads. The immediate impact is that the thread creation time is dominating 
> the query execution time (i.e. the prepare time is getting very high).





[jira] [Created] (IMPALA-5532) Don't heap-allocate compressor objects in RowBatch

2017-06-19 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5532:
--

 Summary: Don't heap-allocate compressor objects in RowBatch
 Key: IMPALA-5532
 URL: https://issues.apache.org/jira/browse/IMPALA-5532
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson
Assignee: Henry Robinson


Every call to {{RowBatch::RowBatch(..., const TRowBatch&, ...)}} or 
{{RowBatch::Serialize()}} creates a (de)compressor object. That uses the 
{{Codec::CreateCompressor()}} interface which returns a pointer to a {{Codec}} 
object, so that the virtual compression interface can be used.

However, we always use LZ4 compression, and so needlessly heap-allocate the 
compressor objects to get the advantage of implementation hiding which we don't 
actually need. We should just declare a stack-allocated LZ4 (de)compressor when 
needed.
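
A minimal sketch of the two patterns, using hypothetical stand-in types ({{Codec}} and {{Lz4Compressor}} here are illustrative and far simpler than Impala's real classes). The size bound mirrors the LZ4_COMPRESSBOUND formula and is only there to give the object something to do.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical virtual interface.
struct Codec {
  virtual ~Codec() = default;
  virtual std::size_t MaxOutputLen(std::size_t input_len) const = 0;
};

// Hypothetical concrete codec.
struct Lz4Compressor final : Codec {
  std::size_t MaxOutputLen(std::size_t input_len) const override {
    return input_len + input_len / 255 + 16;
  }
};

// Current pattern: one heap allocation per call, used through the
// virtual interface.
std::size_t BoundViaFactory(const std::string& input) {
  std::unique_ptr<Codec> codec = std::make_unique<Lz4Compressor>();
  return codec->MaxOutputLen(input.size());
}

// Proposed pattern: the concrete type lives on the stack, so there is
// no allocation at all.
std::size_t BoundViaStack(const std::string& input) {
  Lz4Compressor compressor;
  return compressor.MaxOutputLen(input.size());
}
```

Both functions return the same value; the second simply avoids a trip to the allocator on every serialize or deserialize call.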





[jira] [Created] (IMPALA-5528) tcmalloc contention much higher with concurrency after KRPC patch

2017-06-16 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5528:
--

 Summary: tcmalloc contention much higher with concurrency after 
KRPC patch
 Key: IMPALA-5528
 URL: https://issues.apache.org/jira/browse/IMPALA-5528
 Project: IMPALA
  Issue Type: Sub-task
  Components: Distributed Exec
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson
Assignee: Henry Robinson
Priority: Critical


Our testing has revealed that under high concurrency (e.g. the 
{{many_independent_fragment_instances}} primitive), KRPC slows down execution 
significantly.

This JIRA is to track the overall issue, and to link to JIRAs for specific spot 
fixes. This is the result of running {{perf}} on a node in a 16-node cluster, 
running the {{many_independent_fragment_instances}} primitive.

{code}
-  13.12%  impalad  impalad  [.] tcmalloc::CentralFreeList::FetchFromOneSpans(int, void**, void**)
   - tcmalloc::CentralFreeList::FetchFromOneSpans(int, void**, void**)
  - 93.95% tcmalloc::CentralFreeList::RemoveRange(void**, void**, int)
 - tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long)
- 98.16% operator new[](unsigned long)
 29.20% impala::RowDescriptor::RowDescriptor(impala::RowDescriptor const&)
 16.85% kudu::rpc::Connection::QueueResponseForCall(gscoped_ptr >)
 12.58% impala::DataStreamRecvr::SenderQueue::AddBatch(std::unique_ptr >&&)
 7.42% kudu::rpc::OutboundTransfer::CreateForCallResponse(std::vector > const&, kudu::rpc::TransferCallbacks*)
   + 4.34% impala::Codec::CreateDecompressor(impala::MemPool*, bool, impala::THdfsCompression::type, boost::scoped_ptr*)
 4.09% kudu::Trace::Trace()
 3.79% std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator const&)
   + 3.59% kudu::rpc::InboundCall::InboundCall(kudu::rpc::Connection*)
 2.66% void std::vector >::_M_emplace_back_aux(impala::MemPool::ChunkInfo&&)
   + 2.57% kudu::rpc::Connection::HandleIncomingCall(gscoped_ptr >)
 2.04% std::vector >::reserve(unsigned long)
 1.92% kudu::rpc::RequestHeader::MergePartialFromCodedStream(google::protobuf::io::CodedInputStream*)
 1.91% kudu::rpc::RemoteMethodPB::MergePartialFromCodedStream(google::protobuf::io::CodedInputStream*)
 1.48% kudu::rpc::Connection::ReadHandler(ev::io&, int)
 0.87% kudu::HeapBufferAllocator::AllocateInternal(unsigned long, unsigned long, kudu::BufferAllocator*)
 0.79% kudu::faststring::GrowArray(unsigned long)
 0.72% kudu::rpc::OutboundTransfer::CreateForCallRequest(int, std::vector > const&, kudu::rpc::TransferCallbacks*)
 0.69% kudu::rpc::Connection::QueueOutboundCall(std::shared_ptr const&)
 0.69% kudu::ArenaBase::ArenaBase(unsigned long, unsigned long)
 0.68% void std::vector::Component, std::default_delete::Component> >, std::allocator::Component, std::default_delete::Component> > > >::_M_emplace_back_aux >&&)
   21.66% kudu::rpc::Connection::QueueResponseForCall(gscoped_ptr >)
   19.52% impala::TransmitDataResponsePb::~TransmitDataResponsePb()
   15.30% kudu::rpc::InboundCall::~InboundCall()
   5.69% kudu::rpc::QueueTransferTask::Run(kudu::rpc::ReactorThread*)
   3.97% std::unordered_map, std::equal_to, std::allocator > >::mapped_type EraseKeyReturnValuePtr, std::equal_to, std::allocator > >::mapped_type EraseKeyReturnValuePtr >)
- 22.12% impala::RowBatch::RowBatch(impala::RowDescriptor const&, impala::InboundProtoRowBatch const&, impala::MemTracker*)
 impala::DataStreamRecvr::SenderQueue::AddBatch(std::unique_ptr >&&)
  20.73% impala::TransmitDataResponsePb::~TransmitDataResponsePb()
  9.98% kudu::rpc::InboundCall::~InboundCall()
  6.32% kudu::rpc::QueueTransferTask::Run(kudu::rpc::ReactorThread*)
  4.20% std::unordered_map, std::equal_to, std::allocator > >::mapped_type EraseKeyReturnValuePtr, std::equal_to, std::allocator > >::mapped_type EraseKeyReturnValuePtr

[jira] [Created] (IMPALA-5526) Add krb5 to toolchain

2017-06-16 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5526:
--

 Summary: Add krb5 to toolchain
 Key: IMPALA-5526
 URL: https://issues.apache.org/jira/browse/IMPALA-5526
 Project: IMPALA
  Issue Type: Sub-task
  Components: Backend
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson
Assignee: Henry Robinson


KRPC adds a compile-time dependency on libkrb5's headers. To guarantee that 
they're available in all build environments, we should add krb5 (from 
http://web.mit.edu/kerberos/dist/index.html) to the toolchain.

libkrb5.so should be dynamically linked by default, to avoid creating a binary 
that has statically linked security dependencies (this is an issue for us at 
Cloudera as a vendor, but also a general antipattern). 





[jira] [Created] (IMPALA-5518) Allow row batches to be recycled (rather than reallocated) across datastream recvr threads.

2017-06-15 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5518:
--

 Summary: Allow row batches to be recycled (rather than 
reallocated) across datastream recvr threads.
 Key: IMPALA-5518
 URL: https://issues.apache.org/jira/browse/IMPALA-5518
 Project: IMPALA
  Issue Type: Improvement
Affects Versions: Impala 2.2
Reporter: Henry Robinson


The {{DataStreamSender}} allocates row batches in whatever thread handles the 
{{TransmitData()}} RPC, but then deallocates them in the fragment instance 
thread. 

That is an anti-pattern for tcmalloc. Instead we should see if we can recycle 
the row batches where possible.

We could try to 'pin' row batches to service threads, and give them each a 
thread-local ability to reallocate row batch data - the key is ensuring that 
the deallocations happen on the same thread, so we can't just give each sender 
a list of row batches because that sender may be handled by different service 
pool threads. 

Alternatively we can try to cut down on the number of allocations, but that's 
hard to do with cross-thread coordination.
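
A very rough sketch of the thread-local recycling idea follows. {{ThreadLocalBufferCache}} is a hypothetical name, and the sketch deliberately leaves out the hard part identified above: getting a buffer back to the thread that allocated it once a different thread has finished with it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

using Buffer = std::vector<char>;

// Hypothetical per-thread cache of row-batch buffers.
class ThreadLocalBufferCache {
 public:
  // Reuses a cached buffer if this thread has one, otherwise allocates.
  static std::unique_ptr<Buffer> Acquire(std::size_t size) {
    std::vector<std::unique_ptr<Buffer>>& cache = Cache();
    if (cache.empty()) return std::make_unique<Buffer>(size);
    std::unique_ptr<Buffer> buffer = std::move(cache.back());
    cache.pop_back();
    buffer->resize(size);
    return buffer;
  }

  // Stores the buffer in the calling thread's cache. Reuse only helps
  // if this runs on the same thread that will call Acquire() again.
  static void Release(std::unique_ptr<Buffer> buffer) {
    Cache().push_back(std::move(buffer));
  }

 private:
  static std::vector<std::unique_ptr<Buffer>>& Cache() {
    thread_local std::vector<std::unique_ptr<Buffer>> cache;
    return cache;
  }
};
```

If Release() ran on the fragment instance thread instead, the buffer would simply accumulate in that thread's cache, which is the cross-thread pattern the ticket is trying to avoid.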
 





[jira] [Created] (IMPALA-5511) Add process start time to debug web page

2017-06-14 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5511:
--

 Summary: Add process start time to debug web page
 Key: IMPALA-5511
 URL: https://issues.apache.org/jira/browse/IMPALA-5511
 Project: IMPALA
  Issue Type: Improvement
Affects Versions: Impala 2.9.0
Reporter: Henry Robinson
Priority: Minor


It's useful to know when a process last started - particularly if a monitoring 
tool restarts the process automatically. 

There's a metric in the impalad process, but neither the statestore nor the 
catalog server has it, and it's not displayed prominently.





[jira] [Resolved] (IMPALA-5506) Help information of query_file option in impala-shell misses stdin description

2017-06-14 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5506.

   Resolution: Fixed
 Assignee: David Xu
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/e2532a96c81ecfa2dc763306e96eb340fb49afe3

> Help information of query_file option in impala-shell misses stdin description
> --
>
> Key: IMPALA-5506
> URL: https://issues.apache.org/jira/browse/IMPALA-5506
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Clients
>Affects Versions: Impala 2.5.0
>Reporter: David Xu
>Assignee: David Xu
>Priority: Minor
> Fix For: Impala 2.10.0
>
>
> The help text for the query_file option in impala-shell currently reads as 
> follows:
> Execute the queries in the query file, delimited by ;
> However, the impala-shell code also accepts stdin, indicated by -. I tested 
> this case and the results were correct.
> We should add the stdin description to the help text for the query_file option 
> so that users know about this feature.





[jira] [Resolved] (IMPALA-5495) Improve error message if neither --is_coordinator nor --is_executor is set

2017-06-13 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5495.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/11ec9f1958482bbd5dc224f55e409a8ec907f066

> Improve error message if neither --is_coordinator nor --is_executor is set
> --
>
> Key: IMPALA-5495
> URL: https://issues.apache.org/jira/browse/IMPALA-5495
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.9.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>Priority: Trivial
>  Labels: bugbash-2017-05-31
> Fix For: Impala 2.10.0
>
>
> If neither {{is_coordinator}} nor {{is_executor}} is set, you get this 
> message {{Impala server needs to have a role (EXECUTOR, COORDINATOR)}} - 
> which isn't actionable. We should mention the flags at least.





[jira] [Created] (IMPALA-5495) Improve error message if neither --is_coordinator nor --is_executor is set

2017-06-12 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5495:
--

 Summary: Improve error message if neither --is_coordinator nor 
--is_executor is set
 Key: IMPALA-5495
 URL: https://issues.apache.org/jira/browse/IMPALA-5495
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Affects Versions: Impala 2.9.0
Reporter: Henry Robinson
Priority: Trivial


If neither {{is_coordinator}} nor {{is_executor}} is set, you get this message 
{{Impala server needs to have a role (EXECUTOR, COORDINATOR)}} - which isn't 
actionable. We should mention the flags at least.
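
Something along these lines would do. {{ValidateRoles}} is a hypothetical helper and the second sentence of the message is only a suggestion, not the wording that was eventually committed.

```cpp
#include <cassert>
#include <string>

// Hypothetical validation helper. Returns an empty string if the role
// flags are usable, otherwise an error message that names the flags.
std::string ValidateRoles(bool is_coordinator, bool is_executor) {
  if (is_coordinator || is_executor) return "";
  return "Impala server needs to have a role (EXECUTOR, COORDINATOR). "
         "Set at least one of --is_coordinator or --is_executor to true.";
}
```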





[jira] [Created] (IMPALA-5493) Add Protobuf headers to Impala-lzo

2017-06-12 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5493:
--

 Summary: Add Protobuf headers to Impala-lzo
 Key: IMPALA-5493
 URL: https://issues.apache.org/jira/browse/IMPALA-5493
 Project: IMPALA
  Issue Type: Sub-task
  Components: Distributed Exec
Reporter: Henry Robinson
Assignee: Henry Robinson


LZO now depends on Protobuf headers (transitively) - see:

https://github.com/henryr/impala-lzo/commit/861e6b68011181257816c990465fca15250fcfa5

for a commit to include them. 





[jira] [Resolved] (IMPALA-5133) Concurrent TPC-DS queries get stuck and stop making progress, new queries

2017-06-12 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5133.

Resolution: Fixed

Believe this to be caused by KUDU-2041. Will re-open if we see it again.

> Concurrent TPC-DS queries get stuck and stop making progress, new queries 
> --
>
> Key: IMPALA-5133
> URL: https://issues.apache.org/jira/browse/IMPALA-5133
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Mostafa Mokhtar
>Assignee: Henry Robinson
>Priority: Critical
> Attachments: impalad.INFO.zip, stuck_queries_thread_dump_2.txt, 
> stuck_queries_thread_dump.txt, TPC-DS 2.zip, TPCDS-Concurrency-20Node.jmx
>
>
> Concurrent queries against 10GB TPC-DS, run at a concurrency of 20, eventually 
> get stuck and don't make progress.
> Attached impalad log and thread dump. 
> Jmeter JMX file used to run the workload is also attached.





[jira] [Created] (IMPALA-5486) Port control-plane parts of ImpalaInternalService to KRPC

2017-06-12 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5486:
--

 Summary: Port control-plane parts of ImpalaInternalService to KRPC
 Key: IMPALA-5486
 URL: https://issues.apache.org/jira/browse/IMPALA-5486
 Project: IMPALA
  Issue Type: Sub-task
Reporter: Henry Robinson








[jira] [Created] (IMPALA-5485) Exchg / data-stream recv-side buffers should be tracked by query mem-tracker

2017-06-12 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5485:
--

 Summary: Exchg / data-stream recv-side buffers should be tracked 
by query mem-tracker
 Key: IMPALA-5485
 URL: https://issues.apache.org/jira/browse/IMPALA-5485
 Project: IMPALA
  Issue Type: Improvement
  Components: Distributed Exec
Reporter: Henry Robinson


Exchange nodes assign a fixed-size buffer to their datastream receivers that's 
used to smooth out differences in send / consume rates between the sender and 
the receiver.

These buffers should be tracked by the query memtracker, and with the new 
min-reservation support we should allow them to be larger than the configured 
minimum. Increasing the buffer size decreases the amount of time that a sender 
can be blocked on a receiver, and so increases query-parallelism. Queries that 
shuffle a lot of data can see significant speedups from larger buffers.

The buffers need to be sized based on the #receivers and the #rows * #avg row 
size. They can dynamically expand trivially - contraction is possible, but a 
bit harder.





[jira] [Resolved] (IMPALA-5480) Missing filters message isn't great

2017-06-09 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5480.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/fa174fc962c174598ee41e558aff33698753d9f5

> Missing filters message isn't great
> ---
>
> Key: IMPALA-5480
> URL: https://issues.apache.org/jira/browse/IMPALA-5480
> Project: IMPALA
>  Issue Type: Improvement
>Affects Versions: Impala 2.7.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>Priority: Trivial
> Fix For: Impala 2.10.0
>
>
> If a runtime filter doesn't arrive at a scan node, the message text is a bit 
> hard to read:
> {{Only following filters arrived: , waited 10ms}}
> Let's change it to make clear that 0 filters have arrived, and which ones 
> *have* shown up.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (IMPALA-5345) Under stress, some TransmitData() RPCs are not responded to

2017-06-09 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5345.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

Can't reproduce this on either a 20-node or 140-node cluster since I fixed 
KUDU-2041. My guess is that connection deadlock looked like a half-complete RPC 
failure. Have to run stress as well to confirm, but current indication is that 
this is fixed.

> Under stress, some TransmitData() RPCs are not responded to
> ---
>
> Key: IMPALA-5345
> URL: https://issues.apache.org/jira/browse/IMPALA-5345
> Project: IMPALA
>  Issue Type: Sub-task
>Affects Versions: Impala 2.10.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>Priority: Critical
> Fix For: Impala 2.10.0
>
>
> Under stress conditions on two separate clusters (one secure, one not), I've 
> seen some {{TransmitData()}} RPCs stay unresponded to forever, blocking the 
> query's completion. The RPCs are seen by the recipient, but are not in the 
> pending sender list.
> Need to test further to see if this is related to the fix for IMPALA-5093 or 
> if a response is dropped on some path if a row batch is 'retried' from the 
> pending sender list. 





[jira] [Created] (IMPALA-5481) RowBatches should share RowDescriptor where possible

2017-06-09 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5481:
--

 Summary: RowBatches should share RowDescriptor where possible
 Key: IMPALA-5481
 URL: https://issues.apache.org/jira/browse/IMPALA-5481
 Project: IMPALA
  Issue Type: Improvement
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson


One of the {{RowBatch}} c'tors copies the row descriptor into the row batch. 
This leads to a lot of allocation churn since {{RowDescriptor}} contains some 
vector members, and since the descriptor is usually the same the copies are 
unnecessary.

Instead, we should consider allocating the {{RowDescriptor}} once from an 
object pool, and sharing it amongst all row batches that need that descriptor. 

In some tests, {{RowDescriptor()}} shows up as 20% of the tcmalloc allocation 
time.
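
The sharing idea can be illustrated with a small Python sketch. The classes below are stand-ins written for this example (the real {{RowDescriptor}} and {{RowBatch}} are C++ classes with different interfaces), and {{DescriptorPool}} is a hypothetical name.

```python
class RowDescriptor:
    """Stand-in for the real descriptor; holds per-tuple metadata."""
    def __init__(self, tuple_ids):
        self.tuple_ids = tuple(tuple_ids)


class DescriptorPool:
    """Hypothetical pool that hands out one shared descriptor per layout."""
    def __init__(self):
        self._by_layout = {}

    def get(self, tuple_ids):
        key = tuple(tuple_ids)
        desc = self._by_layout.get(key)
        if desc is None:
            # Allocate once; later callers reuse the same object.
            desc = RowDescriptor(key)
            self._by_layout[key] = desc
        return desc


class RowBatch:
    def __init__(self, row_desc):
        # Keep a reference to the shared descriptor instead of copying it.
        self.row_desc = row_desc


pool = DescriptorPool()
batches = [RowBatch(pool.get([0, 2])) for _ in range(3)]
```

All three batches point at the same descriptor object, so constructing a batch no longer allocates the descriptor's vector members.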





[jira] [Created] (IMPALA-5480) Missing filters message isn't great

2017-06-09 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5480:
--

 Summary: Missing filters message isn't great
 Key: IMPALA-5480
 URL: https://issues.apache.org/jira/browse/IMPALA-5480
 Project: IMPALA
  Issue Type: Improvement
Affects Versions: Impala 2.7.0
Reporter: Henry Robinson
Assignee: Henry Robinson
Priority: Trivial


If a runtime filter doesn't arrive at a scan node, the message text is a bit 
hard to read:

{{Only following filters arrived: , waited 10ms}}

Let's change it to make clear that 0 filters have arrived, and which ones 
*have* shown up.
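
A sketch of what a clearer message could look like, in Python. The wording and the function name are hypothetical and are not the text that was eventually committed.

```python
def filters_arrived_message(expected_ids, arrived_ids, waited_ms):
    """Builds a message that states the arrival count explicitly."""
    expected = sorted(set(expected_ids))
    arrived = sorted(set(arrived_ids) & set(expected))
    msg = "%d of %d runtime filters arrived" % (len(arrived), len(expected))
    if arrived:
        # Only list filter ids when there is something to list.
        msg += " (arrived: %s)" % ", ".join(str(f) for f in arrived)
    return "%s, waited %dms" % (msg, waited_ms)
```

With no arrivals this produces "0 of 2 runtime filters arrived, waited 10ms" instead of an empty list followed by a dangling comma.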





[jira] [Resolved] (IMPALA-4892) Include the session ID in the "Invalid session ID" error message

2017-06-08 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-4892.

Resolution: Fixed

Fixed in 
https://github.com/apache/incubator-impala/commit/eea4ad7caa8cf6ab7cea125e9564392d63ea2c27.

Thanks for the contribution [~sjc362000]!

> Include the session ID in the "Invalid session ID" error message
> 
>
> Key: IMPALA-4892
> URL: https://issues.apache.org/jira/browse/IMPALA-4892
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.9.0
>Reporter: Henry Robinson
>Assignee: Stephen Carlin
>  Labels: newbie
>
> When {{GetSessionState()}} can't find the session, the error message should 
> include the ID that was wrong.
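
For illustration only, the requested behaviour looks roughly like this in Python. The session table, exception type and message wording are assumptions; the real {{GetSessionState()}} is a C++ method.

```python
class InvalidSessionError(Exception):
    pass


def get_session_state(sessions, session_id):
    """Looks up a session, naming the offending id when it is unknown."""
    if session_id not in sessions:
        raise InvalidSessionError("Invalid session id: %s" % session_id)
    return sessions[session_id]
```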





[jira] [Resolved] (IMPALA-5435) test_basic_filters failed on ASAN

2017-06-08 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5435.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

Fixed (by increasing timeouts) in 
https://github.com/apache/incubator-impala/commit/1886da45e87209c0d625554462d68e2b44bb

> test_basic_filters failed on ASAN
> -
>
> Key: IMPALA-5435
> URL: https://issues.apache.org/jira/browse/IMPALA-5435
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.10.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Henry Robinson
>Priority: Critical
>  Labels: broken-build
> Fix For: Impala 2.10.0
>
>
> Seen in an ASAN Jenkins build:
> {noformat}
> 02:45:29  TestRuntimeFilters.test_basic_filters[exec_option: 
> {'disable_codegen': False, 'abort_on_error': 1, 
> 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | 
> table_format: rc/snap/block] 
> 02:45:29 [gw3] linux2 -- Python 2.6.6 
> /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/../infra/python/env/bin/python
> 02:45:29 query_test/test_runtime_filters.py:39: in test_basic_filters
> 02:45:29 self.run_test_case('QueryTest/runtime_filters', vector)
> 02:45:29 common/impala_test_suite.py:430: in run_test_case
> 02:45:29 verify_runtime_profile(test_section['RUNTIME_PROFILE'], 
> result.runtime_profile)
> 02:45:29 common/test_result_verifier.py:560: in verify_runtime_profile
> 02:45:29 actual))
> 02:45:29 E   AssertionError: Did not find matches for lines in runtime 
> profile:
> 02:45:29 E   EXPECTED LINES:
> 02:45:29 E   row_regex: .*Files rejected: 7 .*
> 02:45:29 E   
> 02:45:29 E   ACTUAL PROFILE:
> 02:45:29 E   Query (id=364393521d6edaa6:82f92a03):
> 02:45:29 E DEBUG MODE WARNING: Query profile created while running a 
> DEBUG build of Impala. Use RELEASE builds to measure query performance.
> 02:45:29 E Summary:
> 02:45:29 E   Session ID: be475affeee5db0d:e52699dc51ac26ae
> 02:45:29 E   Session Type: BEESWAX
> 02:45:29 E   Start Time: 2017-06-05 00:31:12.430322000
> 02:45:29 E   End Time: 
> 02:45:29 E   Query Type: QUERY
> 02:45:29 E   Query State: FINISHED
> 02:45:29 E   Query Status: OK
> 02:45:29 E   Impala Version: impalad version 2.9.0-SNAPSHOT DEBUG (build 
> cde19ab8c7801436070ce0438e28d5042265dfd1)
> 02:45:29 E   User: jenkins
> 02:45:29 E   Connected User: jenkins
> 02:45:29 E   Delegated User: 
> 02:45:29 E   Network Address: 127.0.0.1:40832
> 02:45:29 E   Default Db: functional_rc_snap
> 02:45:29 E   Sql Statement: with t1 as (select month x, bigint_col y from 
> alltypes limit 7300),
> 02:45:29 Et2 as (select int_col x, bigint_col y from alltypestiny 
> limit 2)
> 02:45:29 Eselect count(*) from t1, t2 where t1.x = t2.x
> 02:45:29 E   Coordinator: 
> impala-boost-static-burst-slave-1fc7.vpc.cloudera.com:22000
> 02:45:29 E   Query Options (non default): 
> ABORT_ON_ERROR=1,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,RUNTIME_FILTER_WAIT_TIME_MS=15000
> 02:45:29 E   Plan: 
> 02:45:29 E   
> 02:45:29 E   Per-Host Resource Reservation: Memory=136.00MB
> 02:45:29 E   Per-Host Resource Estimates: Memory=138.00MB
> 02:45:29 E   WARNING: The following tables are missing relevant table and/or 
> column statistics.
> 02:45:29 E   functional_rc_snap.alltypes, functional_rc_snap.alltypestiny
> 02:45:29 E   
> 02:45:29 E   F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> 02:45:29 E   PLAN-ROOT SINK
> 02:45:29 E   |  mem-estimate=0B mem-reservation=0B
> 02:45:29 E   |
> 02:45:29 E   03:AGGREGATE [FINALIZE]
> 02:45:29 E   |  output: count(*)
> 02:45:29 E   |  mem-estimate=10.00MB mem-reservation=0B
> 02:45:29 E   |  tuple-ids=4 row-size=8B cardinality=1
> 02:45:29 E   |
> 02:45:29 E   02:HASH JOIN [INNER JOIN, BROADCAST]
> 02:45:29 E   |  hash predicates: month = int_col
> 02:45:29 E   |  runtime filters: RF000 <- int_col
> 02:45:29 E   |  mem-estimate=9B mem-reservation=136.00MB
> 02:45:29 E   |  tuple-ids=0,2 row-size=8B cardinality=7300
> 02:45:29 E   |
> 02:45:29 E   |--06:EXCHANGE [UNPARTITIONED]
> 02:45:29 E   |  |  mem-estimate=0B mem-reservation=0B
> 02:45:29 E   |  |  tuple-ids=2 row-size=4B cardinality=2
> 02:45:29 E   |  |
> 02:45:29 E   |  F03:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> 02:45:29 E   |  05:EXCHANGE [UNPARTITIONED]
> 02:45:29 E   |  |  limit: 2
> 02:45:29 E   |  |  mem-estimate=0B mem-reservation=0B
> 02:45:29 E   |  |  tuple-ids=2 row-size=4B cardinality=2
> 02:45:29 E   |  |
> 02:45:29 E   |  F02:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
> 02:45:29 E   |  01:SCAN HDFS [functional_rc_snap.alltypestiny, RANDOM]
> 02:45:29 E   | partitions=4/4 files=4 size=1.38KB
> 02:45:29 

[jira] [Resolved] (IMPALA-5454) JVM metrics don't show up on /memz sometimes

2017-06-08 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5454.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/88f1f86b3d2f9ed23a94e285bad6bf09f8c80c93

> JVM metrics don't show up on /memz sometimes
> 
>
> Key: IMPALA-5454
> URL: https://issues.apache.org/jira/browse/IMPALA-5454
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 2.9.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: Impala 2.10.0
>
>
> Due to a bug in {{mustache.cc}}, {{memz.tmpl}} template rendering fails if 
> the {{buffer_pool}} JSON entry is not generated.
> The bug is that nested template commands with the same key aren't correctly 
> parsed:
> {code}
> {{?b}} {{#b}}  {{/b}} {{/b}}
> {code}
> If '{{b}}' is not present, the parser tries to skip to the closing 
> {code}{{/b}}{code}, but does not take into account nesting and matches the 
> first closing {code}{{/b}}{code}, not the second. Parsing then fails. 
> This JIRA is to track the workaround - rewriting the templates not to use 
> nesting. The [upstream project|https://github.com/HenryR/cpp-mustache] can 
> fix the underlying issue independently.





[jira] [Created] (IMPALA-5473) Make diagnosing network issues easier

2017-06-08 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5473:
--

 Summary: Make diagnosing network issues easier
 Key: IMPALA-5473
 URL: https://issues.apache.org/jira/browse/IMPALA-5473
 Project: IMPALA
  Issue Type: Improvement
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson


With our current metrics in the profile, it's hard to debug queries that get 
slow throughput from their exchanges. 

The following cases have different causes, but similar symptoms (e.g. a high 
{{InactiveTimer}} in the xchg profile):

1. Downstream sender does not produce rows quickly (perhaps because *its* child 
instances do not produce rows quickly).

2. Downstream sender can not _send_ rows quickly, perhaps because of network 
congestion.

3. Downstream sender does not start producing rows until some time after the 
upstream has started (captured by {{FirstBatchArrivalWaitTime}}).

4. Downstream sender does not close stream until some time after all rows are 
sent.

We should try to improve these metrics so that all the information about who is 
slow, and why, is available clearly in the runtime profile. Distinguishing 
cases 1 and 2 is particularly important.





[jira] [Resolved] (IMPALA-5056) Impala fails to recover from statestore connection loss while waiting for metadata

2017-06-07 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5056.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/54118010590d8605303715bf570cac18e1d5e64e

> Impala fails to recover from statestore connection loss while waiting for 
> metadata
> --
>
> Key: IMPALA-5056
> URL: https://issues.apache.org/jira/browse/IMPALA-5056
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog, Frontend
>Affects Versions: Impala 2.8.0
>Reporter: Balazs Jeszenszky
>Assignee: Henry Robinson
>Priority: Critical
>  Labels: supportability, usability
> Fix For: Impala 2.10.0
>
>
> The following sequence:
> {code:java}
> describe t1;
> shut down statestore
> invalidate metadata t1;
> ***describe t1;
> start ST
> {code}
> makes the marked query hang indefinitely. New queries will work, but this 
> query will be stuck in the planning phase. Trying to cancel the query or open 
> the details will bring down the web UI since it will wait for the lock the 
> query is holding. The only way to kill off these queries is a restart.





[jira] [Resolved] (IMPALA-5135) KRPC : ReportExecStatus RPC can timeout when deserializing large query profiles due to tcmalloc contention

2017-06-07 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5135.

Resolution: Not A Bug

{{ReportExecStatus}} doesn't use KRPC any more. When we port those RPCs, we'll 
revisit performance but the locking will have changed a lot.

> KRPC : ReportExecStatus RPC can timeout when deserializing large query 
> profiles due to tcmalloc contention
> --
>
> Key: IMPALA-5135
> URL: https://issues.apache.org/jira/browse/IMPALA-5135
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Mostafa Mokhtar
>Assignee: Henry Robinson
>Priority: Critical
> Attachments: bottomup.txt, KRPC Performance results - Concurrent 
> TPC-DS Q17 CDH5.12 Vs. KRPC.pdf, top-down.txt.zip
>
>
> Queries with a larger number of fragments can fail with {code}Timed out: 
> ReportExecStatus RPC to 10.17.187.36:22000 timed out after 10.000s 
> (SENT){code}
> Vtune shows that while deserializing the query profile the thread can get 
> stuck in tcmalloc
> {code}
> impalad ! tcmalloc::CentralFreeList::FetchFromOneSpans - [unknown source file]
> impalad ! tcmalloc::CentralFreeList::RemoveRange + 0xc0 - [unknown source 
> file]
> impalad ! tcmalloc::ThreadCache::FetchFromCentralCache + 0x62 - [unknown 
> source file]
> impalad ! operator new + 0x297 - [unknown source file]
> impalad ! __gnu_cxx::new_allocator::allocate + 0x4 - 
> new_allocator.h:104
> impalad ! std::vector std::allocator>::resize + 0x7f3 - stl_vector.h:676
> impalad ! impala::TRuntimeProfileNode::read + 0xe5a - 
> RuntimeProfile_types.cpp:601
> impalad ! impala::TRuntimeProfileTree::read + 0x8ac - 
> RuntimeProfile_types.cpp:982
> impalad ! impala::TReportExecStatusParams::read + 0x156 - 
> ImpalaInternalService_types.cpp:2956
> impalad ! impala::DeserializeThriftMsg + 
> 0xe5 - thrift-util.h:145
> impalad ! impala::DeserializeFromSidecar + 
> 0xb7 - rpc.h:407
> impalad ! impala::ExecControlService::ReportExecStatus + 0x21e - 
> impala-internal-service.cc:148
> impalad ! std::function google::protobuf::Message*, kudu::rpc::RpcContext*)>::operator() + 0x1c - 
> functional:2439
> impalad ! kudu::rpc::GeneratedServiceIf::Handle + 0x188 - service_if.cc:134
> impalad ! impala::ImpalaServicePool::RunThread + 0x241 - 
> impala-service-pool.cc:130
> impalad ! boost::function0::operator() + 0x1a - 
> function_template.hpp:767
> impalad ! impala::Thread::SuperviseThread + 0x20e - thread.cc:325
> impalad ! operator()&, const 
> std::basic_string&, boost::function, impala::Promise int>*), boost::_bi::list0> + 0x5a - bind.hpp:457
> impalad ! boost::_bi::bind_t const&, boost::function, impala::Promise*), 
> boost::_bi::list4, 
> boost::_bi::value, boost::_bi::value (void)>>, boost::_bi::value*>>>::operator() - 
> bind_template.hpp:20
> impalad ! boost::detail::thread_data (*)(std::string const&, std::string const&, boost::function, 
> impala::Promise*), boost::_bi::list4, 
> boost::_bi::value, boost::_bi::value (void)>>, boost::_bi::value*::run + 0x19 - 
> thread.hpp:116
> impalad ! thread_proxy + 0xd9 - [unknown source file]
> libpthread.so.0 ! start_thread + 0xd0 - [unknown source file]
> libc.so.6 ! clone + 0x6c - [unknown source file]
> {code}





[jira] [Resolved] (IMPALA-5134) KRPC : Query with 2K fragments on un-secure 16 node cluster failed with ReportExecStatus RPC to 10.20.122.112:22000 timed out after 10.000s (ON_OUTBOUND_QUEUE)

2017-06-07 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5134.

Resolution: Not A Bug

{{ReportExecStatus}} doesn't use KRPC any more. When we port those RPCs, we'll 
revisit performance but the locking will have changed a lot.

> KRPC : Query with 2K fragments on un-secure 16 node cluster failed with 
> ReportExecStatus RPC to 10.20.122.112:22000 timed out after 10.000s 
> (ON_OUTBOUND_QUEUE)
> ---
>
> Key: IMPALA-5134
> URL: https://issues.apache.org/jira/browse/IMPALA-5134
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Mostafa Mokhtar
>Assignee: Henry Robinson
>
> Error message varies from run to run 
> EndDataStream RPC to 10.17.193.18:22000 timed out after 10.000s (SENT)
> EndDataStream RPC to 10.17.193.18:22000 timed out after 10.000s 
> (ON_OUTBOUND_QUEUE)
> ExecPlanFragment RPC to 10.20.122.112:22000 timed out after 120.000s (SENT)
> Captured Vtune data while the query was running and noticed that the RPC can 
> spend a significant amount of time in tcmalloc, which eventually spins in the 
> kernel. This behavior can lead to unexpected RPC timeouts. 
> {code}
> CPU Time
> 2 of 101: 9.5% (2.435s of 25.740s)
> libc.so.6 ! madvise - [unknown source file]
> impalad ! TCMalloc_SystemRelease + 0x79 - [unknown source file]
> impalad ! tcmalloc::PageHeap::DecommitSpan + 0x20 - [unknown source file]
> impalad ! tcmalloc::PageHeap::MergeIntoFreeList + 0x212 - [unknown source 
> file]
> impalad ! tcmalloc::PageHeap::Delete + 0x23 - [unknown source file]
> impalad ! operator delete + 0x123 - [unknown source file]
> impalad ! ~faststring + 0x15 - faststring.h:54
> impalad ! ~InboundTransfer - transfer.h:65
> impalad ! kudu::DefaultDeleter::operator() - 
> gscoped_ptr.h:145
> impalad ! ~gscoped_ptr_impl + 0x9 - gscoped_ptr.h:228
> impalad ! ~gscoped_ptr - gscoped_ptr.h:318
> impalad ! kudu::rpc::InboundCall::~InboundCall + 0xe7 - inbound_call.cc:51
> impalad ! kudu::DefaultDeleter::operator() + 0x7 - 
> gscoped_ptr.h:145
> impalad ! ~gscoped_ptr_impl + 0x9 - gscoped_ptr.h:228
> impalad ! ~gscoped_ptr - gscoped_ptr.h:318
> impalad ! ~ResponseTransferCallbacks + 0x30 - connection.cc:368
> impalad ! ~ResponseTransferCallbacks - connection.cc:373
> impalad ! kudu::rpc::ResponseTransferCallbacks::NotifyTransferFinished + 0x1e 
> - connection.cc:376
> impalad ! kudu::rpc::OutboundTransfer::SendBuffer + 0x1b9 - transfer.cc:221
> impalad ! kudu::rpc::Connection::WriteHandler + 0x156 - connection.cc:596
> impalad ! ev_invoke_pending + 0x52 - [unknown source file]
> impalad ! ev_run + 0x9c3 - [unknown source file]
> impalad ! ev::loop_ref::run + 0x12 - ev++.h:211
> impalad ! kudu::rpc::ReactorThread::RunThread + 0x3 - reactor.cc:316
> impalad ! boost::function0::operator() + 0x1a - 
> function_template.hpp:767
> impalad ! kudu::Thread::SuperviseThread + 0x1ee - thread.cc:590
> libpthread.so.0 ! start_thread + 0xd0 - [unknown source file]
> libc.so.6 ! clone + 0x6c - [unknown source file]
> {code}
> Query
> {code}
> select /* +straight_join */ count(*),a.c_nationkey, max(b.c_comment) from   
> customer A join /* +shuffle */  customer B on A.c_custkey = B.c_custkey join 
> /* +shuffle */   customer C on c.c_custkey = B.c_custkey join /* +shuffle */  
>  customer D on d.c_custkey = B.c_custkey join /* +shuffle */   customer E on 
> e.c_custkey = B.c_custkey join /* +shuffle */   customer F on f.c_custkey = 
> B.c_custkey join /* +shuffle */   customer G on g.c_custkey = B.c_custkey 
> join /* +shuffle */   customer H on h.c_custkey = B.c_custkey join /* 
> +shuffle */   customer I on i.c_custkey = B.c_custkey join /* +shuffle */   
> customer J on j.c_custkey = B.c_custkey join /* +shuffle */   customer K on 
> k.c_custkey = B.c_custkey join /* +shuffle */   customer L on l.c_custkey = 
> B.c_custkey join /* +shuffle */   customer M on m.c_custkey = B.c_custkey 
> join /* +shuffle */   customer N on n.c_custkey = B.c_custkey join /* 
> +shuffle */   customer O on o.c_custkey = B.c_custkey join /* +shuffle */   
> customer P on p.c_custkey = B.c_custkey join /* +shuffle */   customer R on 
> R.c_custkey = B.c_custkey join /* +shuffle */   customer S on S.c_custkey = 
> B.c_custkey join /* +shuffle */   customer T on T.c_custkey = B.c_custkey 
> join /* +shuffle */   customer U on U.c_custkey = B.c_custkey join /* 
> +shuffle */   customer V on V.c_custkey = B.c_custkey join /* +shuffle */   
> customer W on W.c_custkey = B.c_custkey join /* +shuffle */   customer X on 
> X.c_custkey = B.c_custkey join /* +shuffle */   customer Y on Y.c_custkey = 
> B.c_custkey join /* +shuffle */   customer Z on Z.c_custkey = B.c_custkey 
> join /* +shuffle */   customer

[jira] [Created] (IMPALA-5454) JVM metrics don't show up on /memz sometimes

2017-06-07 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5454:
--

 Summary: JVM metrics don't show up on /memz sometimes
 Key: IMPALA-5454
 URL: https://issues.apache.org/jira/browse/IMPALA-5454
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 2.9.0
Reporter: Henry Robinson
Assignee: Henry Robinson


Due to a bug in {{mustache.cc}}, {{memz.tmpl}} template rendering fails if the 
{{buffer_pool}} JSON entry is not generated.

The bug is that nested template commands with the same key aren't correctly 
parsed:

{code}
{{?b}} {{#b}}  {{/b}} {{/b}}
{code}

If '{{b}}' is not present, the parser tries to skip to the closing 
{code}{{/b}}{code}, but does not take into account nesting and matches the 
first closing {code}{{/b}}{code}, not the second. Parsing then fails. 

This JIRA is to track the workaround - rewriting the templates not to use 
nesting. The [upstream project|https://github.com/HenryR/cpp-mustache] can fix 
the underlying issue independently.
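
For reference, a nesting-aware skip could look something like the Python sketch below. This is one possible approach and not the upstream parser's code; the regular expression and the function name are assumptions made for the example.

```python
import re

# Matches an opening ({{#key}} or {{?key}}) or closing ({{/key}}) tag.
_TAG = re.compile(r"\{\{([#?/])\s*(\w+)\s*\}\}")


def find_matching_close(template, key, start):
    """Returns the index just past the {{/key}} that closes the section
    whose body begins at `start`, or -1 if it is never closed.

    Nested sections with the same key increase the depth, so the first
    {{/key}} seen is not necessarily the matching one.
    """
    depth = 1
    for m in _TAG.finditer(template, start):
        kind, name = m.group(1), m.group(2)
        if name != key:
            continue
        if kind == "/":
            depth -= 1
            if depth == 0:
                return m.end()
        else:
            depth += 1
    return -1


template = "{{?b}} {{#b}}  {{/b}} {{/b}}tail"
body_start = len("{{?b}}")
end = find_matching_close(template, "b", body_start)
```

Here {{end}} points just past the second closing tag, so whatever follows the outer section is still parsed. A skip that stops at the first closing tag would leave the second one unmatched.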





[jira] [Resolved] (IMPALA-5450) Impala web UI /varz?raw returns HTML content

2017-06-07 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5450.

Resolution: Not A Bug

This is working as intended - the {{raw}} argument really just serves to change 
the content-type. I think this did change around Impala 2.7 to make the 
behaviour more consistent, and several web pages were changed to use templates 
({{/varz}} amongst them). 

> Impala web UI /varz?raw returns HTML content
> 
>
> Key: IMPALA-5450
> URL: https://issues.apache.org/jira/browse/IMPALA-5450
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Jim Halfpenny
>Priority: Minor
>
> In previous versions of Impala, HTTP requests to http://impalad:25000/varz?raw 
> returned the command line options used for impalad. On 2.7 it instead returns 
> the same HTML as /varz, but with Content-Type set to text/plain. It's 
> possible the command line options were removed for security reasons. If so, 
> returning a blank document would be preferable to returning the normal page 
> HTML.





[jira] [Resolved] (IMPALA-5377) IMPALAD Crashed With the impala starting large number of JDBC accessing

2017-06-05 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5377.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/f3fdd4d4a8025bab1d2babe9772252f1703a60ee

> IMPALAD Crashed With the impala starting large number of JDBC accessing
> --
>
> Key: IMPALA-5377
> URL: https://issues.apache.org/jira/browse/IMPALA-5377
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.9.0
> Environment: Apache impala branch  
> eb54287fb4c635c8fc6c96872e87ad5a98b16339 
>Reporter: yyzzjj
>Assignee: Henry Robinson
>Priority: Critical
> Fix For: Impala 2.10.0
>
>
> From the symptoms, it looks like this: a query was accepted before 
> ExecEnv::StartServices(), which initializes mem_tracker_, had run.
> (gdb) bt
> #0  impala::MemTracker::CheckLimitExceeded (this=0x0) at 
> /export/ldb/online/impala_master/be/src/runtime/mem-tracker.h:331
> #1  impala::MemTracker::LimitExceeded (this=0x0) at 
> /export/ldb/online/impala_master/be/src/runtime/mem-tracker.h:234
> #2  impala::QueryState::Init (this=this@entry=0x973e400, rpc_params=...) at 
> /export/ldb/online/impala_master/be/src/runtime/query-state.cc:98
> #3  0x00cf550a in impala::QueryExecMgr::StartQuery (this=0xb137aa0, 
> params=...) at 
> /export/ldb/online/impala_master/be/src/runtime/query-exec-mgr.cc:51
> #4  0x00d7e020 in impala::ImpalaInternalService::ExecQueryFInstances 
> (this=0xa07bf00, return_val=..., params=...) at 
> /export/ldb/online/impala_master/be/src/service/impala-internal-service.cc:50
> #5  0x00fcbfb4 in 
> impala::ImpalaInternalServiceProcessor::process_ExecQueryFInstances 
> (this=0xbced020, seqid=1, iprot=, oprot=0x96c8e40, 
> connectionContext=)
> at 
> /export/ldb/online/impala_master/be/generated-sources/gen-cpp/ImpalaInternalService.cpp:1433
> #6  0x00fcb326 in 
> impala::ImpalaInternalServiceProcessor::dispatchCall (this=0xbced020, 
> iprot=0x96c8e70, oprot=0x96c8e40, fname=..., seqid=1, 
> connectionContext=0xbf138d0)
> at 
> /export/ldb/online/impala_master/be/generated-sources/gen-cpp/ImpalaInternalService.cpp:1403
> #7  0x008c52cc in apache::thrift::TDispatchProcessor::process 
> (this=0xbced020, in=..., out=..., connectionContext=0xbf138d0)
> at 
> /export/ldb/online/impala_master/thirdparty/fbthrift-2016.12.19.00/build/include/thrift/lib/cpp/TDispatchProcessor.h:124
> #8  0x7f05f6c1901f in apache::thrift::server::TThreadedServer::Task::run 
> (this=0xbf13880) at server/TThreadedServer.cpp:65
> #9  0x7f05f6c25594 in 
> apache::thrift::concurrency::PthreadThread::threadMain (arg=) 
> at concurrency/PosixThreadFactory.cpp:194
> #10 0x003cd3c079d1 in start_thread () from /lib64/libpthread.so.0
> #11 0x003cd34e886d in clone () from /lib64/libc.so.6
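
The race in the backtrace above (a request handled while the tracker is still null, {{this=0x0}}) can be modelled with a small Python sketch. It only illustrates the ordering problem and one conceivable guard; the class and method names are hypothetical, and this is not the change made in the linked commit.

```python
class ExecEnvSketch:
    """Toy model of a backend whose tracker is set up late in startup."""
    def __init__(self):
        self.mem_tracker = None  # set by start_services()

    def start_services(self):
        self.mem_tracker = {"consumption": 0}

    def start_query(self, query_id):
        # Reject the request rather than dereference a missing tracker.
        if self.mem_tracker is None:
            raise RuntimeError(
                "backend not ready: services have not started (query %s)"
                % query_id)
        return "started %s" % query_id
```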





[jira] [Resolved] (IMPALA-5433) Mark Status c'tors as explicit

2017-06-05 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5433.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/f0065d376f9a0cfbefc20164c87137df56363166

> Mark Status c'tors as explicit
> --
>
> Key: IMPALA-5433
> URL: https://issues.apache.org/jira/browse/IMPALA-5433
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>Priority: Minor
> Fix For: Impala 2.10.0
>
>
> {{Status}} has lots of constructors. Marking them as explicit will help avoid 
> unexpected programming errors.





[jira] [Created] (IMPALA-5441) Send larger row batches over the wire

2017-06-05 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5441:
--

 Summary: Send larger row batches over the wire
 Key: IMPALA-5441
 URL: https://issues.apache.org/jira/browse/IMPALA-5441
 Project: IMPALA
  Issue Type: Improvement
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson


Our on-the-wire row batch size is the same as the in-memory size (1024 rows by 
default). It might make sense to increase the wire-size to reduce the 
RPC-per-row overhead, and decrease context-switching in the receiver. 

KRPC makes it quite natural to do that: each row batch can be serialized as a 
sidecar in memory-size batches. The receiver can then read each batch in turn 
as though it were sent individually, without any need to stitch together (or 
split up) serialized batches. 
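
A rough Python sketch of the idea, with each serialized batch modelled as an opaque bytes object and an RPC payload as a list of sidecars. The function names and the fixed batches-per-RPC parameter are assumptions for illustration; the KRPC sidecar API itself is not shown.

```python
def pack_for_wire(batches, batches_per_rpc):
    """Groups serialized in-memory batches into RPC payloads.

    Each payload is a list of sidecars, one per original batch, so the
    batches are never merged or split.
    """
    if batches_per_rpc <= 0:
        raise ValueError("batches_per_rpc must be positive")
    return [batches[i:i + batches_per_rpc]
            for i in range(0, len(batches), batches_per_rpc)]


def receive(payloads):
    """Yields each batch in turn, as though it had been sent on its own."""
    for sidecars in payloads:
        for batch in sidecars:
            yield batch
```

Five batches sent two per RPC need three RPCs instead of five, and the receiver still sees the original five batches in order.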





[jira] [Created] (IMPALA-5433) Mark Status c'tors as explicit

2017-06-04 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5433:
--

 Summary: Mark Status c'tors as explicit
 Key: IMPALA-5433
 URL: https://issues.apache.org/jira/browse/IMPALA-5433
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Reporter: Henry Robinson
Assignee: Henry Robinson
Priority: Minor


{{Status}} has lots of constructors. Marking them as explicit will help avoid 
unexpected programming errors.





[jira] [Resolved] (IMPALA-5350) Build threads should include fragment ID in their names

2017-06-01 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5350.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/9caea9bfad025274762642a03cb5483625d86a09

> Build threads should include fragment ID in their names
> ---
>
> Key: IMPALA-5350
> URL: https://issues.apache.org/jira/browse/IMPALA-5350
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>Priority: Minor
> Fix For: Impala 2.10.0
>
>
> Just like fragment executor threads do, the build threads should include the 
> fragment instance ID in their name so that it's easy to map entries in 
> {{/threadz}} back onto the query and fragment they belong to without a 
> debugger.
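
As an illustration of the naming scheme, here is a Python sketch using the standard threading module. The "build-thread (finst:...)" format is a made-up example, not the format Impala uses.

```python
import threading


def start_build_thread(instance_id, target):
    """Starts a worker whose name embeds the fragment instance id."""
    name = "build-thread (finst:%s)" % instance_id
    t = threading.Thread(target=target, name=name)
    t.start()
    return t
```

Any listing of live threads then shows which fragment instance each build thread belongs to, without attaching a debugger.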



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (IMPALA-5364) Number of fragments reported in the web-ui is incorrect

2017-06-01 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5364.

   Resolution: Fixed
Fix Version/s: Impala 2.9.0

https://github.com/apache/incubator-impala/commit/64e8538ab2a45794821fcf7b84160fdc334e6505

> Number of fragments reported in the web-ui is incorrect 
> 
>
> Key: IMPALA-5364
> URL: https://issues.apache.org/jira/browse/IMPALA-5364
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 2.9.0
>Reporter: Mostafa Mokhtar
>Assignee: Henry Robinson
>Priority: Minor
>  Labels: supportability, web-ui
> Fix For: Impala 2.9.0
>
> Attachments: Screen Shot 2017-05-24 at 5.07.33 PM.png
>
>
> The number of fragments per backend reported in the web-ui is incorrect. 
> It appears to display the number of queries instead; during that 
> snapshot there were more than 1K fragments running on the cluster. 
> {code}
> Query Locations
> Location              Number of Fragments
> s-11.foo.com:22000    15
> s-12.foo.com:22000    15
> s-10.foo.com:22000    15
> s-14.foo.com:22000    15
> s-16.foo.com:22000    15
> s-13.foo.com:22000    15
> s-09.foo.com:22000    15
> {code}
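
To make the difference between the two counts concrete, here is a minimal Python sketch. The record layout, the function names and the sample data are all hypothetical and are not taken from Impala; the sketch only shows how a per-backend query count can be much smaller than a per-backend fragment count for the same workload.

```python
from collections import Counter

# Hypothetical bookkeeping: (query_id, backend, fragment_instance_id).
# These names are illustrative only and do not come from Impala.
running_instances = [
    ("q1", "s-11.foo.com:22000", "f1"),
    ("q1", "s-11.foo.com:22000", "f2"),
    ("q1", "s-12.foo.com:22000", "f3"),
    ("q2", "s-11.foo.com:22000", "f4"),
]


def fragments_per_backend(instances):
    """Count every fragment instance running on each backend."""
    return Counter(backend for _, backend, _ in instances)


def queries_per_backend(instances):
    """Count distinct queries with at least one instance on each backend."""
    distinct = {(query_id, backend) for query_id, backend, _ in instances}
    return Counter(backend for _, backend in distinct)
```

With the sample data, the first backend runs three fragment instances that belong to only two queries, which is the kind of gap described in the report.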





[jira] [Created] (IMPALA-5405) Catalog will not send full update of catalog topic when statestore restarts

2017-05-31 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5405:
--

 Summary: Catalog will not send full update of catalog topic when 
statestore restarts
 Key: IMPALA-5405
 URL: https://issues.apache.org/jira/browse/IMPALA-5405
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Reporter: Henry Robinson


If:

* No DDL operations have happened since the last cluster restart
* The statestore is restarted

The catalog will not re-publish its metadata topic. Any new Impala daemons 
won't get updates, and won't be able to accept queries.

For a minimal repro, start a cluster. Wait for metadata to be loaded (i.e. you 
can run a query), and then restart the statestore. After 30s or so, check 
{{/topics}} on the statestore's UI - the {{catalog-update}} topic will exist, 
but will have 0 entries.

The bug appears to be in [this 
code|https://github.com/apache/incubator-impala/blob/master/be/src/catalog/catalog-server.cc#L230]
 in the catalog:

{code}
if (delta.from_version == 0 && delta.to_version == 0 &&
    catalog_objects_min_version_ != 0) {
  catalog_topic_entry_keys_.clear();
  last_sent_catalog_version_ = 0L;
} else {
  // .. publish intermediate update
}
{code}

When the statestore restarts and sends the first topic update for the catalog 
topic, {{catalog_objects_min_version_}} may be {{0}}, so the first branch, which 
publishes the complete metadata topic, is not taken. Once any DDL operation has 
happened on the cluster, {{catalog_objects_min_version_}} becomes non-zero, and 
the bug is no longer hit.
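
The branch condition can be modeled in a few lines of Python. This is only a sketch of the predicate quoted above, not the catalog server code. The function name is made up, and it assumes that the first update after a statestore restart arrives with both versions equal to 0.

```python
def takes_full_update_branch(from_version, to_version,
                             catalog_objects_min_version):
    """Mirror of the quoted condition; True means 'republish everything'."""
    return (from_version == 0 and to_version == 0
            and catalog_objects_min_version != 0)


# Assumed shape of the first update after a statestore restart: 0 -> 0.
no_ddl_yet = takes_full_update_branch(0, 0, 0)   # minimum version still 0
after_ddl = takes_full_update_branch(0, 0, 7)    # 7 is an arbitrary non-zero value
```

In this model `no_ddl_yet` is `False`, so the full-republish branch is skipped, while `after_ddl` is `True`.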

*Workaround* Either a) trigger metadata publication by running {{INVALIDATE 
METADATA}} or {{REFRESH}} on a table, or b) restart {{catalogd}}.







[jira] [Resolved] (IMPALA-5367) PlannerTest failure when upgrading from Ubuntu 14.04 to 16.04

2017-05-24 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5367.

Resolution: Duplicate

This is probably a duplicate of IMPALA-5358. Feel free to reopen if not 
addressed by the patch already in flight for that jira.

> PlannerTest failure when upgrading from Ubuntu 14.04 to 16.04
> -
>
> Key: IMPALA-5367
> URL: https://issues.apache.org/jira/browse/IMPALA-5367
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Jim Apple
>Assignee: Alexander Behm
>
> The following test fails when upgrading from Ubuntu 14.04 to 16.04, even with 
> a fresh data load:
> {noformat}
> testTableSample(org.apache.impala.planner.PlannerTest)  Time elapsed: 0.055 
> sec  <<< FAILURE!
> java.lang.AssertionError: 
> Section PLAN of query:
> select * from functional.alltypes tablesample system(50) repeatable(1234)
> where year = 2009
> Actual does not match expected result:
> F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> PLAN-ROOT SINK
> |  mem-estimate=0B mem-reservation=0B
> |
> 00:SCAN HDFS [functional.alltypes]
>partitions=7/24 files=7 size=138.28KB
> 
>table stats: 7300 rows total
>column stats: all
>mem-estimate=48.00MB mem-reservation=0B
>tuple-ids=0 row-size=97B cardinality=1825
> Expected:
> F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> PLAN-ROOT SINK
> |  mem-estimate=0B mem-reservation=0B
> |
> 00:SCAN HDFS [functional.alltypes]
>partitions=6/24 files=6 size=119.04KB
>table stats: 7300 rows total
>column stats: all
>mem-estimate=48.00MB mem-reservation=0B
>tuple-ids=0 row-size=97B cardinality=1825
> {noformat}





[jira] [Created] (IMPALA-5358) Off-by-one error in testTableSample

2017-05-24 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5358:
--

 Summary: Off-by-one error in testTableSample
 Key: IMPALA-5358
 URL: https://issues.apache.org/jira/browse/IMPALA-5358
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 2.9.0
Reporter: Henry Robinson
Assignee: Alexander Behm
Priority: Critical


The number of partitions scanned is 13, but 12 were expected. This was in a build 
with legacy aggs and joins enabled, but it's not obvious whether that is related.
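
When comparing plans like the ones in this report, it can help to pull the partition counts out mechanically. The Python sketch below is a hypothetical helper, not part of the planner test framework. It extracts the `partitions=N/M` pair from a plan line, using the 'Actual' and 'Expected' scan lines from the test output as input.

```python
import re

# Matches the "partitions=<scanned>/<total>" token of a SCAN HDFS plan line.
PARTITIONS_RE = re.compile(r"partitions=(\d+)/(\d+)")


def scanned_partitions(plan_text):
    """Return (scanned, total) from the first match in plan_text, or None."""
    match = PARTITIONS_RE.search(plan_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


actual = scanned_partitions("partitions=13/24 files=13 size=258.44KB")
expected = scanned_partitions("partitions=12/24 files=12 size=240.27KB")
difference = actual[0] - expected[0]
```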

{code}
FAILED:  org.apache.impala.planner.PlannerTest.testTableSample

Error Message:

Section PLAN of query:
select * from functional.alltypes tablesample system(50) repeatable(1234)

Actual does not match expected result:
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
PLAN-ROOT SINK
|  mem-estimate=0B mem-reservation=0B
|
00:SCAN HDFS [functional.alltypes]
   partitions=13/24 files=13 size=258.44KB
^^
   table stats: 7300 rows total
   column stats: all
   mem-estimate=96.00MB mem-reservation=0B
   tuple-ids=0 row-size=97B cardinality=3650

Expected:
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
PLAN-ROOT SINK
|  mem-estimate=0B mem-reservation=0B
|
00:SCAN HDFS [functional.alltypes]
   partitions=12/24 files=12 size=240.27KB
   table stats: 7300 rows total
   column stats: all
   mem-estimate=80.00MB mem-reservation=0B
   tuple-ids=0 row-size=97B cardinality=3650

Verbose plan:
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
  PLAN-ROOT SINK
  |  mem-estimate=0B mem-reservation=0B
  |
  00:SCAN HDFS [functional.alltypes]
 partitions=13/24 files=13 size=258.44KB
 table stats: 7300 rows total
 column stats: all
 mem-estimate=96.00MB mem-reservation=0B
 tuple-ids=0 row-size=97B cardinality=3650

Section PLAN of query:
select * from functional.alltypes tablesample system(50) repeatable(1234)
where id < 10

Actual does not match expected result:
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
PLAN-ROOT SINK
|  mem-estimate=0B mem-reservation=0B
|
00:SCAN HDFS [functional.alltypes]
   partitions=13/24 files=13 size=258.44KB
^^
   predicates: id < 10
   table stats: 7300 rows total
   column stats: all
   parquet dictionary predicates: id < 10
   mem-estimate=96.00MB mem-reservation=0B
   tuple-ids=0 row-size=97B cardinality=365

Expected:
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
PLAN-ROOT SINK
|  mem-estimate=0B mem-reservation=0B
|
00:SCAN HDFS [functional.alltypes]
   partitions=12/24 files=12 size=239.26KB
   predicates: id < 10
   table stats: 7300 rows total
   column stats: all
   parquet dictionary predicates: id < 10
   mem-estimate=80.00MB mem-reservation=0B
   tuple-ids=0 row-size=97B cardinality=365

Verbose plan:
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
  PLAN-ROOT SINK
  |  mem-estimate=0B mem-reservation=0B
  |
  00:SCAN HDFS [functional.alltypes]
 partitions=13/24 files=13 size=258.44KB
 predicates: id < 10
 table stats: 7300 rows total
 column stats: all
 parquet dictionary predicates: id < 10
 mem-estimate=96.00MB mem-reservation=0B
 tuple-ids=0 row-size=97B cardinality=365


Stack Trace:
java.lang.AssertionError:
Section PLAN of query:
select * from functional.alltypes tablesample system(50) repeatable(1234)

Actual does not match expected result:
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
PLAN-ROOT SINK
|  mem-estimate=0B mem-reservation=0B
|
00:SCAN HDFS [functional.alltypes]
   partitions=13/24 files=13 size=258.44KB
^^
   table stats: 7300 rows total
   column stats: all
   mem-estimate=96.00MB mem-reservation=0B
   tuple-ids=0 row-size=97B cardinality=3650

Expected:
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
PLAN-ROOT SINK
|  mem-estimate=0B mem-reservation=0B
|
00:SCAN HDFS [functional.alltypes]
   partitions=12/24 files=12 size=240.27KB
   table stats: 7300 rows total
   column stats: all
   mem-estimate=80.00MB mem-reservation=0B
   tuple-ids=0 row-size=97B cardinality=3650

Verbose plan:
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
  PLAN-ROOT SINK
  |  mem-estimate=0B mem-reservation=0B
  |
  00:SCAN HDFS [functional.alltypes]
 partitions=13/24 files=13 size=258.44KB
 table stats: 7300 rows total
 column stats: all
 mem-estimate=96.00MB mem-reservation=0B
 tuple-ids=0 row-size=97B cardinality=3650

Section PLAN of query:
select * from functional.alltypes tablesample system(50) repeatable(1234)
where id < 10

Actual does not match expected result:
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
PLAN-ROOT SINK
|  mem-estimate=0B mem-reservation=0B
|
00:SCAN HDFS [functional.alltypes]
   partitions=13/24 files=13 size=258.44KB
^^
   predicates: id < 10
   

[jira] [Created] (IMPALA-5350) Build threads should include fragment ID in their names

2017-05-22 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5350:
--

 Summary: Build threads should include fragment ID in their names
 Key: IMPALA-5350
 URL: https://issues.apache.org/jira/browse/IMPALA-5350
 Project: IMPALA
  Issue Type: Improvement
Reporter: Henry Robinson
Assignee: Henry Robinson
Priority: Minor


Just like fragment executor threads do, the build threads should include the 
fragment instance ID in their name so that it's easy to map entries in 
{{/threadz}} back onto the query and fragment they belong to without a debugger.
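
As an illustration of the idea, here is a small Python sketch of a naming scheme that embeds an instance ID in a thread name and recovers it again. The `finst:` tag, the `hi:lo` hex rendering and the function names are assumptions made for this example, not the format Impala uses.

```python
MASK_64 = 0xFFFFFFFFFFFFFFFF


def format_instance_id(hi, lo):
    """Render an ID as two 64-bit hex halves, 'hi:lo' (assumed format)."""
    return "{:x}:{:x}".format(hi & MASK_64, lo & MASK_64)


def build_thread_name(prefix, hi, lo):
    """Hypothetical scheme: '<prefix> (finst:<hi>:<lo>)'."""
    return "{} (finst:{})".format(prefix, format_instance_id(hi, lo))


def instance_id_from_thread_name(name):
    """Recover the ID text from a name produced by build_thread_name."""
    tag = "(finst:"
    start = name.index(tag) + len(tag)
    return name[start:name.rindex(")")]
```

With a scheme like this, a thread listing entry can be matched to its fragment instance by string search alone.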





[jira] [Created] (IMPALA-5349) BufferedBlockMgrTest.NoDirsAllocationError failed to write earlier than expected

2017-05-22 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5349:
--

 Summary: BufferedBlockMgrTest.NoDirsAllocationError failed to 
write earlier than expected
 Key: IMPALA-5349
 URL: https://issues.apache.org/jira/browse/IMPALA-5349
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 2.9.0
Reporter: Henry Robinson
Assignee: Tim Armstrong


This is an ASAN build, which may affect timing:

{code}
02:48:15 [ RUN  ] BufferedBlockMgrTest.NoDirsAllocationError
02:48:15 
/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/runtime/buffered-block-mgr-test.cc:1239:
 Failure
02:48:15 Value of: status_.ok()
02:48:15   Actual: false
02:48:15 Expected: true
02:48:15 Error: Could not create files in any configured scratch directories 
(--scratch_dirs). See logs for previous errors that may have prevented creating 
or writing scratch files.
02:48:15 Opening 
'/tmp/buffered-block-mgr-test.0/impala-scratch/0:0_c0071fba-577c-4c59-95be-a42db2c34721'
 for write failed with errno=13 description=Error(13): Permission denied
02:48:15 Opening 
'/tmp/buffered-block-mgr-test.1/impala-scratch/0:0_0d6b4e50-fb7f-44e8-8b59-76a6fd8d56dd'
 for write failed with errno=13 description=Error(13): Permission denied
02:48:15 
02:48:15 *** Check failure stack trace: ***
02:48:15 @  0x2c3cc26  google::DumpStackTraceAndExit()
02:48:15 @  0x2c3361d  google::LogMessage::Fail()
02:48:15 @  0x2c34ec2  google::LogMessage::SendToLog()
02:48:15 @  0x2c32ff7  google::LogMessage::Flush()
02:48:15 @  0x2c365be  google::LogMessageFatal::~LogMessageFatal()
02:48:15 @  0x11a4999  impala::BufferedBlockMgr::~BufferedBlockMgr()
02:48:15 @  0x11ae5ca  std::_Sp_counted_ptr<>::_M_dispose()
02:48:15 @  0x10bc8b5  std::_Sp_counted_base<>::_M_release()
02:48:15 @  0x10b0414  std::__shared_ptr<>::reset()
02:48:15 @  0x127fcad  impala::RuntimeState::ReleaseResources()
02:48:15 @  0x1238348  impala::TestEnv::TearDownQueries()
02:48:15 @  0x10b4f11  impala::BufferedBlockMgrTest::TearDown()
02:48:15 @  0x2caf923  
testing::internal::HandleExceptionsInMethodIfSupported<>()
02:48:15 @  0x2ca7249  testing::Test::Run()
02:48:15 @  0x2ca73c8  testing::TestInfo::Run()
02:48:15 @  0x2ca74a5  testing::TestCase::Run()
02:48:15 @  0x2ca8728  
testing::internal::UnitTestImpl::RunAllTests()
02:48:15 @  0x2ca8a03  testing::UnitTest::Run()
02:48:15 @  0x10a8827  main
02:48:15 @   0x35d0e1ecdd  (unknown)
02:48:15 @   0xfb8d45  (unknown)
{code}





[jira] [Created] (IMPALA-5345) Under stress, some TransmitData() RPCs are not responded to

2017-05-20 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5345:
--

 Summary: Under stress, some TransmitData() RPCs are not responded 
to
 Key: IMPALA-5345
 URL: https://issues.apache.org/jira/browse/IMPALA-5345
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 2.10.0
Reporter: Henry Robinson
Assignee: Henry Robinson
Priority: Critical


Under stress conditions on two separate clusters (one secure, one not), I've 
seen some {{TransmitData()}} RPCs never receive a response, blocking the 
query's completion. The RPCs are seen by the recipient, but are not in the 
pending sender list.

Need to test further to see if this is related to the fix for IMPALA-5093 or if 
a response is dropped on some path when a row batch is 'retried' from the 
pending sender list. 





[jira] [Resolved] (IMPALA-5138) Running 32 concurrent queries from TPC-DS Q31 caused a crash in "impala::BufferedTupleStream::CopyStrings (this=0x7f182c9b4440, tuple=0x7f15aa008000, string_slots=...)

2017-05-20 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5138.

   Resolution: Duplicate
Fix Version/s: Impala 2.10.0

Seems to be the same root cause as IMPALA-5093.

> Running 32 concurrent queries from TPC-DS Q31 caused a crash in 
> "impala::BufferedTupleStream::CopyStrings (this=0x7f182c9b4440, 
> tuple=0x7f15aa008000, string_slots=...)   buffered-tuple-stream.cc:840"
> ---
>
> Key: IMPALA-5138
> URL: https://issues.apache.org/jira/browse/IMPALA-5138
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Mostafa Mokhtar
>Assignee: Henry Robinson
> Fix For: Impala 2.10.0
>
>
> 32 concurrent queries from TPC-DS Q31 against a 138 node cluster caused a 
> crash on the coordinator node 
> {code}
> (gdb) bt
> #0  0x0037dce32625 in raise () from /lib64/libc.so.6
> #1  0x0037dce33e05 in abort () from /lib64/libc.so.6
> #2  0x7f1b61179a55 in os::abort(bool) () from 
> /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
> #3  0x7f1b612f9f87 in VMError::report_and_die() () from 
> /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
> #4  0x7f1b6117e96f in JVM_handle_linux_signal () from 
> /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
> #5  
> #6  0x0037dce89a97 in memcpy () from /lib64/libc.so.6
> #7  0x00f4f68c in impala::BufferedTupleStream::CopyStrings 
> (this=0x7f182c9b4440, tuple=0x7f15aa008000, string_slots=...)
> at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/buffered-tuple-stream.cc:840
> #8  0x00f4fd75 in DeepCopyInternal (this=0x7f182c9b4440, 
> row=0x7f17d7ec4b08)
> at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/buffered-tuple-stream.cc:815
> #9  impala::BufferedTupleStream::DeepCopy (this=0x7f182c9b4440, 
> row=0x7f17d7ec4b08)
> at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/buffered-tuple-stream.cc:752
> #10 0x00e751af in AddRow (this=0x7f171fcdc1c0, stream=0x7f182c9b4440, 
> row=0x7f17d7ec4b08, status=0x7f1521a9e780)
> at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/buffered-tuple-stream.inline.h:30
> #11 impala::PhjBuilder::AppendRowStreamFull (this=0x7f171fcdc1c0, 
> stream=0x7f182c9b4440, row=0x7f17d7ec4b08, status=0x7f1521a9e780)
> at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/exec/partitioned-hash-join-builder.cc:281
> #12 0x7f18e5422316 in impala::PhjBuilder::ProcessBuildBatch ()
> #13 0x00e76351 in impala::PhjBuilder::Send (this=0x7f171fcdc1c0, 
> state=Unhandled dwarf expression opcode 0xf3
> )
> at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/exec/partitioned-hash-join-builder.cc:174
> #14 0x00e5f673 in 
> impala::BlockingJoinNode::SendBuildInputToSink (this=0x7f1580c44700, 
> state=0x7f159c82f100, build_sink=
> 0x7f171fcdc1c0) at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/exec/blocking-join-node.cc:287
> #15 0x00e5def7 in impala::BlockingJoinNode::ProcessBuildInputAsync 
> (this=0x7f1580c44700, state=0x7f159c82f100, build_sink=0x7f171fcdc1c0,
> status=0x7f1524ba0b50) at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/exec/blocking-join-node.cc:154
> #16 0x00d59509 in operator() (name=Unhandled dwarf expression opcode 
> 0xf3
> )
> at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/boost-1.57.0-p1/include/boost/function/function_template.hpp:767
> #17 impala::Thread::SuperviseThread (name=Unhandled dwarf expression opcode 
> 0xf3
> ) at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/util/thread.cc:325
> #18 0x00d59f54 in operator()&, 
> const std::basic_string&, boost::function, impala::Promise int>*), boost::_bi::list0> (this=0x7f15aa6f3a00)
> at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/boost-1.57.0-p1/include/boost/bind/bind.hpp:457
> #19 operator() (this=0x7f15aa6f3a00)
> at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/boost-1.57.0-p1/include/boost/bind/bind_template.hpp:20
> #20 boost::detail::thread_data std::basic_string, std::allocator >&, 
> const std::basic_string, std::allocator 
> >&, boost::function, impala::Promise*), 
> boost::_bi::list4 std::char_traits, std::allocator > >, 
> boost::_bi::value, 
> std::allocator > >, boost

[jira] [Resolved] (IMPALA-5136) Running 48 concurrent Q17 queries against TPC-DS 1TB queries fail with Cannot process row that is bigger than the IO size (row_size=1.55 GB, null_indicators_size=0)

2017-05-20 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5136.

   Resolution: Duplicate
Fix Version/s: Impala 2.10.0

Seems to be the same root cause as IMPALA-5093.

> Running 48 concurrent Q17 queries against TPC-DS 1TB queries fail with Cannot 
> process row that is bigger than the IO size (row_size=1.55 GB, 
> null_indicators_size=0)
> 
>
> Key: IMPALA-5136
> URL: https://issues.apache.org/jira/browse/IMPALA-5136
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Affects Versions: Impala 2.9.0
>Reporter: Mostafa Mokhtar
>Assignee: Henry Robinson
> Fix For: Impala 2.10.0
>
>
> When running 48 concurrent queries from TPC-DS Q17 against a 16-node cluster, 
> queries failed with:
>  Cannot process row that is bigger than the IO size (row_size=1.02 GB, 
> null_indicators_size=0). To run this query, increase the IO size (--read_size 
> option).
> Cannot process row that is bigger than the IO size (row_size=1.55 GB, 
> null_indicators_size=0). To run this query, increase the IO size (--read_size 
> option).
> Other iterations of the query failed with 
> {code}
> Remote error: Service unavailable: ReportExecStatus request on 
> impala.ExecControlService from 10.17.193.20:55530 dropped due to 
> backpressure. The service queue is full; it has 1024 items.
> Timed out: ReportExecStatus RPC to 10.17.193.10:22000 timed out after 10.000s 
> (SENT)
> {code}
> {code}
> select i_item_id ,i_item_desc ,s_state ,count(ss_quantity) as 
> store_sales_quantitycount ,avg(ss_quantity) as store_sales_quantityave 
> ,stddev_samp(ss_quantity) as store_sales_quantitystdev 
> ,stddev_samp(ss_quantity)/avg(ss_quantity) as store_sales_quantitycov 
> ,count(sr_return_quantity) as store_returns_quantitycount 
> ,avg(sr_return_quantity) as store_returns_quantityave 
> ,stddev_samp(sr_return_quantity) as store_returns_quantitystdev 
> ,stddev_samp(sr_return_quantity)/avg(sr_return_quantity) as 
> store_returns_quantitycov ,count(cs_quantity) as catalog_sales_quantitycount 
> ,avg(cs_quantity) as catalog_sales_quantityave ,stddev_samp(cs_quantity) as 
> catalog_sales_quantitystdev ,stddev_samp(cs_quantity)/avg(cs_quantity) as 
> catalog_sales_quantitycov from store_sales ,store_returns ,catalog_sales 
> ,date_dim d1 ,date_dim d2 ,date_dim d3 ,store ,item where d1.d_quarter_name = 
> '2000Q1' and d1.d_date_sk = ss_sold_date_sk and i_item_sk = ss_item_sk and 
> s_store_sk = ss_store_sk and ss_customer_sk = sr_customer_sk and ss_item_sk = 
> sr_item_sk and ss_ticket_number = sr_ticket_number and sr_returned_date_sk = 
> d2.d_date_sk and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3') and 
> sr_customer_sk = cs_bill_customer_sk and sr_item_sk = cs_item_sk and 
> cs_sold_date_sk = d3.d_date_sk and d3.d_quarter_name in 
> ('2000Q1','2000Q2','2000Q3') group by i_item_id ,i_item_desc ,s_state order 
> by i_item_id ,i_item_desc ,s_state limit 100
> {code}





[jira] [Resolved] (IMPALA-5093) Rare failure to decode LZ4 batch

2017-05-20 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5093.

   Resolution: Fixed
Fix Version/s: Impala 2.10.0

We tracked this down to a lifecycle problem: outgoing sidecars were destroyed 
by {{Close()}} before the RPC layer had a chance to finish sending them. The 
fix (for now, while we work on the larger issue of buffer lifetimes in 
KUDU-2011) is to share ownership of the buffer through the {{RpcSidecar}}.

With this fix, we were able to run a stress test on 7 nodes for over 24 hours 
with no crashes, where before the test would fail within a few minutes.
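
The lifetime problem can be pictured with a loose Python analogy. This is not the KRPC or Impala code; the class and method names are invented, and Python's reference semantics stand in for shared ownership. One close path wipes the storage that an in-flight "sidecar" still refers to; the other only drops the owner's reference, so the data survives until the sender is done.

```python
class OutboundBatch:
    """Hypothetical owner of a serialized row batch buffer."""

    def __init__(self, payload):
        self.buffer = bytearray(payload)

    def make_sidecar(self):
        # The "RPC layer" keeps a reference to the same storage.
        return self.buffer

    def close_destroying(self):
        # Problem model: the owner wipes the storage in place.
        self.buffer.clear()

    def close_releasing(self):
        # Fix model: the owner only gives up its own reference.
        self.buffer = None


broken = OutboundBatch(b"row data")
broken_sidecar = broken.make_sidecar()
broken.close_destroying()      # sidecar now sees an empty buffer

fixed = OutboundBatch(b"row data")
fixed_sidecar = fixed.make_sidecar()
fixed.close_releasing()        # sidecar still holds the bytes
```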

> Rare failure to decode LZ4 batch
> 
>
> Key: IMPALA-5093
> URL: https://issues.apache.org/jira/browse/IMPALA-5093
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Affects Versions: Impala 2.9.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>Priority: Critical
> Fix For: Impala 2.10.0
>
>
> KRPC sometimes hits this {{DCHECK}}
> https://github.com/henryr/Impala/blob/krpc/be/src/runtime/row-batch.cc#L108
> which indicates that {{Lz4Compress::ProcessBlock}} has failed to decompress 
> the incoming row batch. Not much clarity about how this happens yet.
> Stack trace:
> {code}
> #6  0x02c7598e in google::LogMessageFatal::~LogMessageFatal() ()
> #7  0x017914ba in impala::RowBatch::RowBatch (this=0x3d8af3c0, 
> row_desc=..., input_batch=..., mem_tracker=0x13ad1c80) at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/row-batch.cc:108
> #8  0x0174c655 in impala::DataStreamRecvr::SenderQueue::AddBatch 
> (this=0xc962800, payload=...) at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/data-stream-recvr.cc:210
> #9  0x0174e13a in impala::DataStreamRecvr::AddBatch (this=0xcdda580, 
> payload=...) at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/data-stream-recvr.cc:352
> #10 0x0173f076 in impala::DataStreamMgr::AddData (this=0xe4a0b20, 
> fragment_instance_id=..., payload=...) at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/data-stream-mgr.cc:190
> #11 0x018e8c63 in impala::DataStreamService::TransmitData 
> (this=0xdb357c0, request=0x4338c3f0, response=0xd802c00, context=0x11d27b60) 
> at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/service/impala-internal-service.cc:77
> #12 0x018ed74e in 
> _ZZN6impala19DataStreamServiceIfC4ERK13scoped_refptrIN4kudu12MetricEntityEERKS1_INS2_3rpc13ResultTrackerEEENKUlPKN6google8protobuf7MessageEPSE_PNS7_10RpcContextEE0_clESG_SH_SJ_
>  ()
> at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/service/impala_internal_service.service.cc:157
> #13 0x018eff3b in std::_Function_handler google::protobuf::Message*, google::protobuf::Message*, 
> kudu::rpc::RpcContext*), 
> impala::DataStreamServiceIf::DataStreamServiceIf(const 
> scoped_refptr&, const 
> scoped_refptr&):: google::protobuf::Message*, google::protobuf::Message*, 
> kudu::rpc::RpcContext*)> >::_M_invoke(const std::_Any_data &, const 
> google::protobuf::Message *, google::protobuf::Message *, 
> kudu::rpc::RpcContext *) (__functor=..., __args#0=0x4338c3f0,
> __args#1=0xd802c00, __args#2=0x11d27b60) at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/gcc-4.9.2/include/c++/4.9.2/functional:2039
> #14 0x01d9fcb4 in std::function google::protobuf::Message*, google::protobuf::Message*, 
> kudu::rpc::RpcContext*)>::operator()(const google::protobuf::Message *, 
> google::protobuf::Message *, kudu::rpc::RpcContext *) const (this=0xeb320b8,
> __args#0=0x4338c3f0, __args#1=0xd802c00, __args#2=0x11d27b60) at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/gcc-4.9.2/include/c++/4.9.2/functional:2439
> #15 0x01d9f6b7 in kudu::rpc::GeneratedServiceIf::Handle 
> (this=0xdb357c0, call=0xcf37480) at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/kudu/rpc/service_if.cc:134
> #16 0x016abfb8 in impala::ImpalaServicePool::RunThread 
> (this=0xe85ac80) at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/rpc/impala-service-pool.cc:130
> #17 0x016ab5db in 
> impala::ImpalaServicePooloperator()(void) const 
> (__closure=0x7f5e11a86be8) at 
> /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/rpc/impala-service-pool.cc:68
> {code}





[jira] [Resolved] (IMPALA-5174) Suppress kudu flags that aren't relevant to Impala

2017-05-17 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5174.

   Resolution: Fixed
Fix Version/s: Impala 2.9.0

Fixed in 
https://github.com/apache/incubator-impala/commit/d1910a39fcc50ce211b95c3552c0c90b4bc37bbd

which brings in this gflags commit:

https://github.com/henryr/gflags/commit/9ae8eae9a1b6162026854a5266d4ee1427c6d168



> Suppress kudu flags that aren't relevant to Impala
> --
>
> Key: IMPALA-5174
> URL: https://issues.apache.org/jira/browse/IMPALA-5174
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 2.9.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: Impala 2.9.0
>
>
> Kudu's util libraries declare quite a few flags, some of which are irrelevant 
> to Impala (as they exist in code that isn't actually used). If possible, we 
> should figure out a way to suppress them from {{--help}} and {{/varz}} to 
> avoid user confusion.





[jira] [Resolved] (IMPALA-5228) test_coordinators custom cluster test fails after rebase

2017-04-20 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-5228.

Resolution: Fixed

This is fixed in the latest KRPC test runs.

> test_coordinators custom cluster test fails after rebase
> 
>
> Key: IMPALA-5228
> URL: https://issues.apache.org/jira/browse/IMPALA-5228
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Affects Versions: Impala 2.9.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: Impala 2.9.0
>
>
> Need to fix {{test_coordinators}} after rebase of KRPC on top of IMPALA-4041.
> {code}
> 23:13:29 === FAILURES 
> ===
> 23:13:29 _ TestCoordinators.test_multiple_coordinators 
> __
> 23:13:29 
> 23:13:29 self = 
> 23:13:29 
> 23:13:29 @pytest.mark.execute_serially
> 23:13:29 def test_multiple_coordinators(self):
> 23:13:29   """Test a cluster configuration in which not all impalad nodes 
> are coordinators.
> 23:13:29   Verify that only coordinators can accept client 
> connections and that select and DDL
> 23:13:29   queries run successfully."""
> 23:13:29 
> 23:13:29   db_name = "TEST_MUL_COORD_DB"
> 23:13:29 > self._start_impala_cluster([], num_coordinators=2, 
> cluster_size=3)
> 23:13:29 
> 23:13:29 custom_cluster/test_coordinators.py:32: 
> 23:13:29 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ _ _ _ _ _ 
> 23:13:29 common/custom_cluster_test_suite.py:119: in _start_impala_cluster
> 23:13:29 check_call(cmd + options, close_fds=True)
> 23:13:29 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ _ _ _ _ _ 
> 23:13:29 
> 23:13:29 popenargs = 
> (['/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py',
>  
> '--cluster_size=3..._dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests',
>  '--log_level=1'],)
> 23:13:29 kwargs = {'close_fds': True}, retcode = 1
> 23:13:29 cmd = 
> ['/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py',
>  
> '--cluster_size=3'...og_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests',
>  '--log_level=1']
> 23:13:29 
> 23:13:29 def check_call(*popenargs, **kwargs):
> 23:13:29 """Run command with arguments.  Wait for command to 
> complete.  If
> 23:13:29 the exit code was zero then return, otherwise raise
> 23:13:29 CalledProcessError.  The CalledProcessError object will have 
> the
> 23:13:29 return code in the returncode attribute.
> 23:13:29 
> 23:13:29 The arguments are the same as for the Popen constructor.  
> Example:
> 23:13:29 
> 23:13:29 check_call(["ls", "-l"])
> 23:13:29 """
> 23:13:29 retcode = call(*popenargs, **kwargs)
> 23:13:29 cmd = kwargs.get("args")
> 23:13:29 if cmd is None:
> 23:13:29 cmd = popenargs[0]
> 23:13:29 if retcode:
> 23:13:29 >   raise CalledProcessError(retcode, cmd)
> 23:13:29 E   CalledProcessError: Command 
> '['/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py',
>  '--cluster_size=3', '--num_coordinators=2', 
> '--log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests',
>  '--log_level=1']' returned non-zero exit status 1
> {code}





[jira] [Created] (IMPALA-5228) test_coordinators custom cluster test fails after rebase

2017-04-19 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5228:
--

 Summary: test_coordinators custom cluster test fails after rebase
 Key: IMPALA-5228
 URL: https://issues.apache.org/jira/browse/IMPALA-5228
 Project: IMPALA
  Issue Type: Sub-task
  Components: Distributed Exec
Affects Versions: Impala 2.9.0
Reporter: Henry Robinson
Assignee: Henry Robinson
 Fix For: Impala 2.9.0


Need to fix {{test_coordinators}} after rebase of KRPC on top of IMPALA-4041.

{code}
23:13:29 === FAILURES 
===
23:13:29 _ TestCoordinators.test_multiple_coordinators 
__
23:13:29 
23:13:29 self = 
23:13:29 
23:13:29 @pytest.mark.execute_serially
23:13:29 def test_multiple_coordinators(self):
23:13:29   """Test a cluster configuration in which not all impalad nodes 
are coordinators.
23:13:29   Verify that only coordinators can accept client connections 
and that select and DDL
23:13:29   queries run successfully."""
23:13:29 
23:13:29   db_name = "TEST_MUL_COORD_DB"
23:13:29 > self._start_impala_cluster([], num_coordinators=2, 
cluster_size=3)
23:13:29 
23:13:29 custom_cluster/test_coordinators.py:32: 
23:13:29 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ 
23:13:29 common/custom_cluster_test_suite.py:119: in _start_impala_cluster
23:13:29 check_call(cmd + options, close_fds=True)
23:13:29 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ 
23:13:29 
23:13:29 popenargs = (['/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py', '--cluster_size=3..._dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests', '--log_level=1'],)
23:13:29 kwargs = {'close_fds': True}, retcode = 1
23:13:29 cmd = ['/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py', '--cluster_size=3'...og_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests', '--log_level=1']
23:13:29 
23:13:29 def check_call(*popenargs, **kwargs):
23:13:29     """Run command with arguments.  Wait for command to complete.  If
23:13:29     the exit code was zero then return, otherwise raise
23:13:29     CalledProcessError.  The CalledProcessError object will have the
23:13:29     return code in the returncode attribute.
23:13:29 
23:13:29     The arguments are the same as for the Popen constructor.  Example:
23:13:29 
23:13:29     check_call(["ls", "-l"])
23:13:29     """
23:13:29     retcode = call(*popenargs, **kwargs)
23:13:29     cmd = kwargs.get("args")
23:13:29     if cmd is None:
23:13:29         cmd = popenargs[0]
23:13:29     if retcode:
23:13:29 >       raise CalledProcessError(retcode, cmd)
23:13:29 E       CalledProcessError: Command '['/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/start-impala-cluster.py', '--cluster_size=3', '--num_coordinators=2', '--log_dir=/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests', '--log_level=1']' returned non-zero exit status 1
{code}
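
The trace above only reports that the startup script exited with status 1. {{check_call}} does not capture the child's output, so the underlying cause is not visible in the exception. As a rough sketch of how the failure could be reproduced with more detail, the snippet below contrasts {{check_call}} with a helper that also captures output. The helper name {{run_and_report}} and the stand-in command are hypothetical; the stand-in simply exits with status 1 and does not model what {{start-impala-cluster.py}} actually does.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd and return (exit status, combined stdout/stderr text).

    Hypothetical helper: unlike check_call, it keeps the child's output so
    the reason for a non-zero exit can be inspected.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output, _ = proc.communicate()
    return proc.returncode, output.decode("utf-8", "replace")


# Stand-in for the real startup script (assumption): prints a message and
# exits with status 1.
failing_cmd = [sys.executable, "-c",
               "import sys; print('startup failed'); sys.exit(1)"]

# check_call raises CalledProcessError, which carries only the status and command.
caught_status = None
try:
    subprocess.check_call(failing_cmd)
except subprocess.CalledProcessError as e:
    caught_status = e.returncode

# The helper returns the status together with whatever the child printed.
rc, out = run_and_report(failing_cmd)
```

In practice the daemon logs under the {{--log_dir}} shown in the command line are the next place to look.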





[jira] [Resolved] (IMPALA-3955) Remove Scheduler class and rename SimpleScheduler to Scheduler

2017-04-18 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-3955.

   Resolution: Fixed
Fix Version/s: Impala 2.9.0

Fixed by 
https://github.com/apache/incubator-impala/commit/4743342da1147b09b6bc6cf0322f99026c300952

> Remove Scheduler class and rename SimpleScheduler to Scheduler
> --
>
> Key: IMPALA-3955
> URL: https://issues.apache.org/jira/browse/IMPALA-3955
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.6.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>Priority: Minor
>  Labels: newbie
> Fix For: Impala 2.9.0
>
>
> Just for code cleanliness, it would be good to get rid of the {{Scheduler}} 
> interface class. There's only one implementation of the interface 
> ({{SimpleScheduler}}), and there only ever has been one. If we ever feel it's 
> necessary to have more than one scheduler implementation, we can introduce an 
> appropriate abstraction at that point.





[jira] [Created] (IMPALA-5174) Suppress kudu flags that aren't relevant to Impala

2017-04-05 Thread Henry Robinson (JIRA)
Henry Robinson created IMPALA-5174:
--

 Summary: Suppress kudu flags that aren't relevant to Impala
 Key: IMPALA-5174
 URL: https://issues.apache.org/jira/browse/IMPALA-5174
 Project: IMPALA
  Issue Type: Sub-task
  Components: Backend
Affects Versions: Impala 2.9.0
Reporter: Henry Robinson
Assignee: Henry Robinson


Kudu's util libraries declare quite a few flags, some of which are irrelevant 
to Impala (as they exist in code that isn't actually used). If possible, we 
should figure out a way to suppress them from {{--help}} and {{/varz}} to avoid 
user confusion.
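
As an illustration of the intended behaviour only: Impala and Kudu define their flags with gflags in C++, and this sketch does not use or model that library. It uses Python's {{argparse}} as an analogy, where a flag can stay accepted on the command line while being hidden from the help text. The flag name {{--unused_library_flag}} and the program name are made up for the example.

```python
import argparse


def build_parser():
    """Build a parser with one visible flag and one hidden flag."""
    parser = argparse.ArgumentParser(prog="daemon")
    parser.add_argument("--log_level", type=int, default=1,
                        help="Logging verbosity.")
    # Hypothetical flag declared by a shared library but irrelevant to this
    # program: still parsed, but suppressed from the help output.
    parser.add_argument("--unused_library_flag", default="x",
                        help=argparse.SUPPRESS)
    return parser


parser = build_parser()
help_text = parser.format_help()

# The hidden flag is still accepted if a user passes it explicitly.
args = parser.parse_args(["--unused_library_flag=y"])
```

Whether an equivalent hook exists for {{--help}} and {{/varz}} in the real flag library is the open question of this issue.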





[jira] [Resolved] (IMPALA-4758) Upgrade gutil to recent Kudu version

2017-03-28 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson resolved IMPALA-4758.

   Resolution: Fixed
Fix Version/s: Impala 2.9.0

Fixed by two commits:

https://github.com/apache/incubator-impala/commit/02f3e3fcc1c58bcaf5080ddee939c9081412a553
https://github.com/apache/incubator-impala/commit/23100102c0a9a8f3a8a7ff069cbfaa7a56628238

> Upgrade gutil to recent Kudu version
> 
>
> Key: IMPALA-4758
> URL: https://issues.apache.org/jira/browse/IMPALA-4758
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Affects Versions: Impala 2.9.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: Impala 2.9.0
>
>
> The gutil library that we share with Kudu has changed a bit, and needs to be 
> updated before we import the KRPC / util libraries.




