[jira] [Created] (IMPALA-9726) Update boilerplate in the PyPI sidebar for impala-shell supported versions

2020-05-05 Thread David Knupp (Jira)
David Knupp created IMPALA-9726:
---

 Summary: Update boilerplate in the PyPI sidebar for impala-shell 
supported versions
 Key: IMPALA-9726
 URL: https://issues.apache.org/jira/browse/IMPALA-9726
 Project: IMPALA
  Issue Type: Sub-task
  Components: Clients
Affects Versions: Impala 4.0
Reporter: David Knupp


The following lines need to be updated to reflect that the shell now supports 
python 2.7+ and 3+.

https://github.com/apache/impala/blob/master/shell/packaging/setup.py#L164-167
{noformat}
'Programming Language :: Python :: 2 :: Only',
'Programming Language :: Python :: 2.6',
'Programming Language :: Python :: 2.7',
{noformat}
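
A plausible updated set of classifiers (just a sketch; the exact list would be
settled in code review) could look like:
{noformat}
'Programming Language :: Python :: 2',
'Programming Language :: Python :: 2.7',
'Programming Language :: Python :: 3',
{noformat}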

Note that this has no effect on the actual installation. This line is what 
manages that, and its value is correct for both Impala 3.4.0 and Impala 4.0:
https://github.com/apache/impala/blob/master/shell/packaging/setup.py#L138



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-9719) Upgrade sasl-0.1.1 to 0.2.1 in Impala/shell/ext-py

2020-05-05 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-9719.
-
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Upgrade sasl-0.1.1 to 0.2.1 in Impala/shell/ext-py
> --
>
> Key: IMPALA-9719
> URL: https://issues.apache.org/jira/browse/IMPALA-9719
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Clients
>Affects Versions: Impala 4.0
>Reporter: David Knupp
>Priority: Major
> Fix For: Impala 4.0
>
>
> Needed for python 3 compatibility.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-9720) Upgrade bitarray from 0.9.0 to 1.2.1 in Impala/shell/ext-py

2020-05-05 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-9720.
-
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Upgrade bitarray from 0.9.0 to 1.2.1 in Impala/shell/ext-py
> ---
>
> Key: IMPALA-9720
> URL: https://issues.apache.org/jira/browse/IMPALA-9720
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Clients
>Affects Versions: Impala 4.0
>Reporter: David Knupp
>Priority: Major
> Fix For: Impala 4.0
>
>
> This is needed for python 3 compatibility.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-9718) Remove pkg_resources.py from Impala/shell

2020-05-05 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-9718.
-
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Remove pkg_resources.py from Impala/shell
> -
>
> Key: IMPALA-9718
> URL: https://issues.apache.org/jira/browse/IMPALA-9718
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Clients
>Affects Versions: Impala 4.0
>Reporter: David Knupp
>Priority: Major
> Fix For: Impala 4.0
>
>
> pkg_resources is available in the stdlib. There should be no need to bundle 
> it with the shell.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-9721) Fix python 3 compatibility regression in impala-shell

2020-05-04 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-9721.
-
Resolution: Fixed

> Fix python 3 compatibility regression in impala-shell
> -
>
> Key: IMPALA-9721
> URL: https://issues.apache.org/jira/browse/IMPALA-9721
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Major
>
> The fix for IMPALA-9398 introduced a small regression with regard to python 3 
> compatibility. We don't have python 3 tests yet to catch regressions of this 
> type, and it was missed in code review.
> The regression happens in two places. An example is:
> https://github.com/apache/impala/blob/master/shell/impala_shell.py#L248
> The syntax for catching exceptions has changed in python 3 to require the 
> "as" keyword.
> {noformat}
> try:
>   do_stuff()
> except Exception as e:
>   panic()
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9724) Setup python3 compile or else 'pylint --py3k' checking on jenkins.impala.io

2020-05-04 Thread David Knupp (Jira)
David Knupp created IMPALA-9724:
---

 Summary: Setup python3 compile or else 'pylint --py3k' checking on 
jenkins.impala.io
 Key: IMPALA-9724
 URL: https://issues.apache.org/jira/browse/IMPALA-9724
 Project: IMPALA
  Issue Type: Sub-task
Affects Versions: Impala 4.0
Reporter: David Knupp


Until we get python3 testing integrated into the actual mini-cluster stack, we 
should be able to add a {{pylint --py3k}} check to the upstream build pipeline, 
similar to how we used to check for python 2.6 compatibility.
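
As a rough illustration (hypothetical script, not the actual Jenkins job), the
check could be as simple as running the py3k checker over the shell sources and
failing the build on a non-zero exit code:
{noformat}
#!/usr/bin/env python
# Sketch only: assumes pylint (with the --py3k checker) is installed and that
# IMPALA_HOME points at an Impala checkout.
import os
import subprocess
import sys

impala_home = os.environ.get("IMPALA_HOME", ".")
shell_dir = os.path.join(impala_home, "shell")
py_files = [os.path.join(shell_dir, f)
            for f in os.listdir(shell_dir) if f.endswith(".py")]

# pylint exits non-zero if any py3k-incompatible construct is found.
sys.exit(subprocess.call(["pylint", "--py3k"] + py_files))
{noformat}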



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-9648) Exclude or update netty jar

2020-05-02 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-9648.
-
Resolution: Fixed

> Exclude or update netty jar
> ---
>
> Key: IMPALA-9648
> URL: https://issues.apache.org/jira/browse/IMPALA-9648
> Project: IMPALA
>  Issue Type: Task
>Reporter: Abhishek Rawat
>Assignee: David Knupp
>Priority: Major
>
> Add an exclusion for netty if it is not being used, or update it to version 
> 4.1.44 or later.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9721) Fix python 3 compatibility regression

2020-05-01 Thread David Knupp (Jira)
David Knupp created IMPALA-9721:
---

 Summary: Fix python 3 compatibility regression
 Key: IMPALA-9721
 URL: https://issues.apache.org/jira/browse/IMPALA-9721
 Project: IMPALA
  Issue Type: Bug
  Components: Clients
Reporter: David Knupp
Assignee: David Knupp


The fix for IMPALA-9398 introduced a small regression with regard to python 3 
compatibility. We don't have python 3 tests yet to catch regressions of this 
type, and it was missed in code review.

The regression happens in two places. An example is:
https://github.com/apache/impala/blob/master/shell/impala_shell.py#L248

The syntax for catching exceptions has changed in python 3 to require the "as" 
keyword.
{noformat}
try:
  do_stuff()
except Exception as e:
  panic()
{noformat}
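
For context, the python-2-only form versus the form accepted by both
interpreters (generic illustration; not the exact code in impala_shell.py):
{noformat}
# python 2 only: a SyntaxError under python 3
try:
  do_stuff()
except Exception, e:
  panic()

# accepted by python 2.6+ and python 3
try:
  do_stuff()
except Exception as e:
  panic()
{noformat}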




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-9649) Exclude shiro-crypto-core and shiro-core jars from maven download

2020-05-01 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-9649.
-
Resolution: Fixed

> Exclude shiro-crypto-core and shiro-core jars from maven download
> -
>
> Key: IMPALA-9649
> URL: https://issues.apache.org/jira/browse/IMPALA-9649
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Major
> Fix For: Impala 4.0
>
>
> These jars have known security vulnerabilities. They are included as part of 
> Sentry, and are not used by Impala directly. 
> There's currently a plan to remove Sentry altogether, but since that will 
> require non-trivial effort, let's exclude these items from the maven download 
> until then.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9720) Upgrade bitarray from 0.9.0 to 1.2.1 in Impala/shell/ext-py

2020-05-01 Thread David Knupp (Jira)
David Knupp created IMPALA-9720:
---

 Summary: Upgrade bitarray from 0.9.0 to 1.2.1 in 
Impala/shell/ext-py
 Key: IMPALA-9720
 URL: https://issues.apache.org/jira/browse/IMPALA-9720
 Project: IMPALA
  Issue Type: Sub-task
  Components: Clients
Affects Versions: Impala 4.0
Reporter: David Knupp


This is needed for python 3 compatibility.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9719) Upgrade sasl-0.1.1 to 0.2.1 in Impala/shell/ext-py

2020-05-01 Thread David Knupp (Jira)
David Knupp created IMPALA-9719:
---

 Summary: Upgrade sasl-0.1.1 to 0.2.1 in Impala/shell/ext-py
 Key: IMPALA-9719
 URL: https://issues.apache.org/jira/browse/IMPALA-9719
 Project: IMPALA
  Issue Type: Sub-task
  Components: Clients
Affects Versions: Impala 4.0
Reporter: David Knupp


Needed for python 3 compatibility.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9718) Remove pkg_resources.py from Impala/shell

2020-05-01 Thread David Knupp (Jira)
David Knupp created IMPALA-9718:
---

 Summary: Remove pkg_resources.py from Impala/shell
 Key: IMPALA-9718
 URL: https://issues.apache.org/jira/browse/IMPALA-9718
 Project: IMPALA
  Issue Type: Sub-task
  Components: Clients
Affects Versions: Impala 4.0
Reporter: David Knupp


pkg_resources is available in the stdlib. There should be no need to bundle it 
with the shell.
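
One quick way to confirm that the interpreter resolves pkg_resources on its own
(an illustrative check, not part of the change itself):
{noformat}
import pkg_resources
# Should print a system/site-packages path, not Impala/shell/pkg_resources.py
print(pkg_resources.__file__)
{noformat}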



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-8608) Impala needs to support Python 3

2020-05-01 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-8608.
-
Fix Version/s: Not Applicable
   Resolution: Duplicate

> Impala needs to support Python 3
> 
>
> Key: IMPALA-8608
> URL: https://issues.apache.org/jira/browse/IMPALA-8608
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.3.0
>Reporter: Lars Volker
>Priority: Critical
> Fix For: Not Applicable
>
>
> [The End of Python 2.7|https://pythonclock.org/] support is getting closer 
> and as such we need to be able to move to Python 3.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-9648) Exclude or update netty jar

2020-04-24 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-9648.
-
Target Version: Impala 4.0
Resolution: Fixed

> Exclude or update netty jar
> ---
>
> Key: IMPALA-9648
> URL: https://issues.apache.org/jira/browse/IMPALA-9648
> Project: IMPALA
>  Issue Type: Task
>Reporter: Abhishek Rawat
>Assignee: David Knupp
>Priority: Major
>
> Add an exclusion for netty if it is not being used, or update it to version 
> 4.1.44 or later.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-9647) Exclude or update fluent-hc jar

2020-04-24 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-9647.
-
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Exclude or update fluent-hc jar
> ---
>
> Key: IMPALA-9647
> URL: https://issues.apache.org/jira/browse/IMPALA-9647
> Project: IMPALA
>  Issue Type: Task
>Reporter: Abhishek Rawat
>Assignee: David Knupp
>Priority: Blocker
> Fix For: Impala 4.0
>
>
> Add an exclusion for fluent-hc-4.3.2.jar, or upgrade it to version 4.3.6 or later.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-9489) Setup impala-shell.sh env separately, and use thrift-0.11.0 by default

2020-04-17 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-9489.
-
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Setup impala-shell.sh env separately, and use thrift-0.11.0 by default
> --
>
> Key: IMPALA-9489
> URL: https://issues.apache.org/jira/browse/IMPALA-9489
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.4.0
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Major
> Fix For: Impala 4.0
>
>
> [Note: this JIRA was filed in relation to the ongoing effort to make the 
> impala-shell compatible with python 3]
> The impala python development environment is a fairly convoluted affair -- a 
> number of packages are installed in the infra/python/env, some come from the 
> toolchain, and some are generated and live in the shell directory. 
> Generally speaking, if you launch impala-python and import a module, it's not 
> necessarily easy to predict where the module might live.
> {noformat}
> $ python
> Python 2.7.10 (default, Aug 17 2018, 19:45:58)
> [GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.0.42)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sasl
> >>> sasl
> <module 'sasl' from '/home/systest/Impala/shell/ext-py/sasl-0.1.1/dist/sasl-0.1.1-py2.7-linux-x86_64.egg/sasl/__init__.pyc'>
> >>> import requests
> >>> requests
> <module 'requests' from '/home/systest/Impala/infra/python/env/local/lib/python2.7/site-packages/requests/__init__.pyc'>
> >>> import Logging
> >>> Logging
> <module 'Logging' from '/home/systest/Impala/shell/gen-py/Logging/__init__.pyc'>
> >>> import thrift
> >>> thrift
> <module 'thrift' from '/home/systest/Impala/toolchain/thrift-0.9.3-p7/python/lib/python2.7/site-packages/thrift/__init__.pyc'>
> {noformat}
> Really, there is no one coherent environment -- there's just whatever 
> collection of modules happens to be available at a given time for a given 
> type of invocation, all of which is accomplished behind the scenes by calling 
> scripts like {{bin/set-pythonpath.sh}} and {{bin/impala-python-common.sh}} 
> that are responsible for cobbling together a PYTHONPATH based on known 
> locations and current env variables.
> As far as I can tell, there are three important contexts where python comes 
> into play...
> * during the build process (used during data load, e.g., 
> testdata/bin/load_nested.py)
> * when running the py.test bases e2e tests
> * whenever the impala-shell is invoked
> As noted by IMPALA-7825 (and also in a conversation I had with 
> [~stakiar_impala_496e]), we're dependent on thrift 0.9.3 during the build 
> process. This seems to come into play during the loading of test data 
> (specifically, when calling testdata/bin/load_nested.py) mainly because at 
> one point there was some well-intentioned but probably misguided attempt at 
> code reuse from the test framework. The test code that gets re-used involves 
> impyla and/or thrift-sasl, which currently still relies on thrift 0.9.3. So 
> our test framework, and by extension the build, both inherit the same 
> limitation.
> The impala-shell, on the other hand, luckily doesn't directly reuse any of 
> the same test modules, and there really is no need to keep it pinned to 
> 0.9.3. However, since calling the impala-shell.sh winds up invoking 
> {{set-pythonpath.sh}}, the same script that sets up the environment 
> during building or testing, thrift 0.9.3 just kind of leaks over by default.
> As it turns out, thrift 0.9.3 is also one of the many limitations restricting 
> the impala-shell to python 2. Luckily, with IMPALA-7924 resolved, 
> thrift-0.11.0 is available -- we just have to use it. And the way to 
> accomplish that is by decoupling the impala-shell from relying on either 
> {{set-pythonpath.sh}} or {{impala-python-common.sh}}. 
> As a first pass, we can address the dev environment by just having 
> {{impala-shell.sh}} itself do whatever is required to find python 
> dependencies, and we can specify thrift-0.11.0 there. Also, thrift 0.11.0 
> should be used by both of the scripts used to create the tarballs that 
> package the impala-shell for customer environments. Neither of these changes 
> should adversely affect building Impala or running the py.test test framework.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-9501) Upgrade sqlparse to a version that supports python 3.0

2020-04-16 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-9501.
-
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Upgrade sqlparse to a version that supports python 3.0
> --
>
> Key: IMPALA-9501
> URL: https://issues.apache.org/jira/browse/IMPALA-9501
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Major
> Fix For: Impala 4.0
>
>
> The current version (0.1.19) was selected, per IMPALA-6999, because it's the 
> last version to be compatible with python 2.6. However, it's not compatible 
> with python 3.x.
> {noformat}
> Traceback (most recent call last):
>   File 
> "/home/dknupp/Impala/shell/build/impala-shell-3.4.0-SNAPSHOT/impala_shell.py",
>  line 37, in <module>
> import sqlparse
>   File "<frozen importlib._bootstrap>", line 983, in _find_and_load
>   File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
>   File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
>   File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
>   File 
> "/home/dknupp/Impala/shell/build/impala-shell-3.4.0-SNAPSHOT/ext-py/sqlparse-0.1.19-py2.7.egg/sqlparse/__init__.py",
>  line 13, in <module>
>   File "<frozen importlib._bootstrap>", line 983, in _find_and_load
>   File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
>   File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
>   File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
>   File 
> "/home/dknupp/Impala/shell/build/impala-shell-3.4.0-SNAPSHOT/ext-py/sqlparse-0.1.19-py2.7.egg/sqlparse/engine/__init__.py",
>  line 8, in <module>
>   File "<frozen importlib._bootstrap>", line 983, in _find_and_load
>   File "<frozen importlib._bootstrap>", line 963, in _find_and_load_unlocked
>   File "<frozen importlib._bootstrap>", line 906, in _find_spec
>   File "<frozen importlib._bootstrap_external>", line 1280, in find_spec
>   File "<frozen importlib._bootstrap_external>", line 1254, in _get_spec
>   File "<frozen importlib._bootstrap_external>", line 1235, in _legacy_get_spec
>   File "<frozen importlib._bootstrap>", line 441, in spec_from_loader
>   File "<frozen importlib._bootstrap_external>", line 594, in spec_from_file_location
>   File 
> "/home/dknupp/Impala/shell/build/impala-shell-3.4.0-SNAPSHOT/ext-py/sqlparse-0.1.19-py2.7.egg/sqlparse/lexer.py",
>  line 84
> except Exception, err:
> ^
> SyntaxError: invalid syntax
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-9362) Update sqlparse used by impala-shell from version 0.1.19 to latest

2020-04-16 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-9362.
-
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Update sqlparse used by impala-shell from version 0.1.19 to latest
> --
>
> Key: IMPALA-9362
> URL: https://issues.apache.org/jira/browse/IMPALA-9362
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Clients
>Affects Versions: Impala 3.4.0
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Major
> Fix For: Impala 4.0
>
>
> The fix for IMPALA-6337 involved correcting the way that sqlparse, an 
> upstream 3rd party python library used by the impala-shell, parses queries 
> that contain line breaks embedded inside of double quotes. Initially, 
> Impala's internally bundled version of sqlparse (based on 0.1.19) was 
> patched; meanwhile, a pull request to get the fix into an official release 
> was submitted.
> That pull-request was finally included in the 0.3.0 version of sqlparse. 
> However, there were other changes to the library in the interim, in terms of 
> APIs and also in some of the parsing logic, that break the impala-shell in 
> other ways, so simply migrating to the newer release is not straightforward.
> We need to find and fix all the places that the newer sqlparse breaks the 
> impala-shell, so that we can stop relying on sqlparse 0.1.19 (which, in some 
> places, is not python 3 compatible).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-9649) Exclude shiro-crypto-core and shiro-core jars from maven download

2020-04-16 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-9649.
-
Resolution: Fixed

> Exclude shiro-crypto-core and shiro-core jars from maven download
> -
>
> Key: IMPALA-9649
> URL: https://issues.apache.org/jira/browse/IMPALA-9649
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Major
> Fix For: Impala 4.0
>
>
> These jars have known security vulnerabilities. They are included as part of 
> Sentry, and are not used by Impala directly. 
> There's currently a plan to remove Sentry altogether, but since that will 
> require non-trivial effort, let's exclude these items from the maven download 
> until then.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9649) Exclude shiro-crypto-core and shiro-core jars from maven download

2020-04-13 Thread David Knupp (Jira)
David Knupp created IMPALA-9649:
---

 Summary: Exclude shiro-crypto-core and shiro-core jars from maven 
download
 Key: IMPALA-9649
 URL: https://issues.apache.org/jira/browse/IMPALA-9649
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 4.0
Reporter: David Knupp
 Fix For: Impala 4.0


These jars have known security vulnerabilities. They are included as part of 
Sentry, and are not used by Impala directly. 

There's currently a plan to remove Sentry altogether, but since that will require 
non-trivial effort, let's exclude these items from the maven download until 
then.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-9582) Update thift_sasl 0.4.1 --> 0.4.2 for impala-shell

2020-04-01 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-9582.
-
Resolution: Fixed

> Update thift_sasl 0.4.1 --> 0.4.2 for impala-shell
> --
>
> Key: IMPALA-9582
> URL: https://issues.apache.org/jira/browse/IMPALA-9582
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 3.4.0
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Blocker
> Fix For: Impala 3.4.0
>
>
> thrift_sasl 0.4.1 introduced a regression whereby the Thrift transport was 
> not reading all data, causing clients to hang.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9582) Update thift_sasl 0.4.1 --> 0.4.2 for impala-shell

2020-03-30 Thread David Knupp (Jira)
David Knupp created IMPALA-9582:
---

 Summary: Update thift_sasl 0.4.1 --> 0.4.2 for impala-shell
 Key: IMPALA-9582
 URL: https://issues.apache.org/jira/browse/IMPALA-9582
 Project: IMPALA
  Issue Type: Improvement
  Components: Clients
Affects Versions: Impala 3.4.0
Reporter: David Knupp
 Fix For: Impala 3.4.0


thrift_sasl 0.4.1 introduced a regression whereby the Thrift transport was not 
reading all data, causing clients to hang.





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9501) Upgrade sqlparse to a version that supports python 3.0

2020-03-13 Thread David Knupp (Jira)
David Knupp created IMPALA-9501:
---

 Summary: Upgrade sqlparse to a version that supports python 3.0
 Key: IMPALA-9501
 URL: https://issues.apache.org/jira/browse/IMPALA-9501
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Reporter: David Knupp


The current version (0.1.19) was selected, per IMPALA-6999, because it's the 
last version to be compatible with python 2.6. However, it's not compatible 
with python 3.x.
{noformat}
Traceback (most recent call last):
  File 
"/home/dknupp/Impala/shell/build/impala-shell-3.4.0-SNAPSHOT/impala_shell.py", 
line 37, in <module>
import sqlparse
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
  File 
"/home/dknupp/Impala/shell/build/impala-shell-3.4.0-SNAPSHOT/ext-py/sqlparse-0.1.19-py2.7.egg/sqlparse/__init__.py",
 line 13, in <module>
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
  File 
"/home/dknupp/Impala/shell/build/impala-shell-3.4.0-SNAPSHOT/ext-py/sqlparse-0.1.19-py2.7.egg/sqlparse/engine/__init__.py",
 line 8, in <module>
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 963, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 906, in _find_spec
  File "<frozen importlib._bootstrap_external>", line 1280, in find_spec
  File "<frozen importlib._bootstrap_external>", line 1254, in _get_spec
  File "<frozen importlib._bootstrap_external>", line 1235, in _legacy_get_spec
  File "<frozen importlib._bootstrap>", line 441, in spec_from_loader
  File "<frozen importlib._bootstrap_external>", line 594, in spec_from_file_location
  File 
"/home/dknupp/Impala/shell/build/impala-shell-3.4.0-SNAPSHOT/ext-py/sqlparse-0.1.19-py2.7.egg/sqlparse/lexer.py",
 line 84
except Exception, err:
^
SyntaxError: invalid syntax
{noformat}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9489) Have impala-python env and impala-shell use available thrift-0.11.0 files

2020-03-11 Thread David Knupp (Jira)
David Knupp created IMPALA-9489:
---

 Summary: Have impala-python env and impala-shell use available 
thrift-0.11.0 files
 Key: IMPALA-9489
 URL: https://issues.apache.org/jira/browse/IMPALA-9489
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Affects Versions: Impala 3.4.0
Reporter: David Knupp
Assignee: David Knupp


Apparently, we can't simply kick thrift-0.9.3 to the curb. Our build process 
needs some attention before that can happen, per IMPALA-7825, and also a 
conversation I had with [~stakiar_impala_496e]. 

However, with IMPALA-7924 resolved, we do have access to thrift-0.11.0 python 
files, and we should use those by default. It turns out that being stuck with 
thrift-0.9.3 is a major impediment to achieving python 3 compatibility for our 
python stack.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-9424) Add six python library to shell/ext-py

2020-02-27 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-9424.
-
Resolution: Fixed

> Add six python library to shell/ext-py
> --
>
> Key: IMPALA-9424
> URL: https://issues.apache.org/jira/browse/IMPALA-9424
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.4.0
>Reporter: David Knupp
>Priority: Major
>
> A couple of impala-shell changes that are coming in the near future 
> (thrift_sasl update, possible changes to THttpClient, python 3 support) will 
> require the six python library.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9424) Add six python library to shell/ext-py

2020-02-25 Thread David Knupp (Jira)
David Knupp created IMPALA-9424:
---

 Summary: Add six python library to shell/ext-py
 Key: IMPALA-9424
 URL: https://issues.apache.org/jira/browse/IMPALA-9424
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Affects Versions: Impala 3.4.0
Reporter: David Knupp


A couple of impala-shell changes that are coming in the near future 
(thrift_sasl update, possible changes to THttpClient, python 3 support) will 
require the six python library.
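
For reference, the sort of python 2/3 bridging that six provides (generic
examples, not the specific impala-shell changes):
{noformat}
import six

value = u"select 1"
if isinstance(value, six.string_types):   # str/unicode on py2, str on py3
    print("treated as a string on both interpreters")
print(six.PY2, six.PY3)                   # flags for the running interpreter
{noformat}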



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9362) Update sqlparse used by impala-shell from version 0.1.19 to latest

2020-02-06 Thread David Knupp (Jira)
David Knupp created IMPALA-9362:
---

 Summary: Update sqlparse used by impala-shell from version 0.1.19 
to latest
 Key: IMPALA-9362
 URL: https://issues.apache.org/jira/browse/IMPALA-9362
 Project: IMPALA
  Issue Type: Improvement
  Components: Clients
Affects Versions: Impala 3.4.0
Reporter: David Knupp


The fix for IMPALA-6337 involved correcting the way that sqlparse, an upstream 
3rd party python library used by the impala-shell, parses queries that contain 
line breaks embedded inside of double quotes. Initially, Impala's internally 
bundled version of sqlparse (based on 0.1.19) was patched; meanwhile, a pull 
request to get the fix into an official release was submitted.

That pull-request was finally included in the 0.3.0 version of sqlparse. 
However, there were other changes to the library in the interim, in terms of 
APIs and also in some of the parsing logic, that break the impala-shell in 
other ways, so simply migrating to the newer release is not straightforward.

We need to find and fix all the places that the newer sqlparse breaks the 
impala-shell, so that we can stop relying on sqlparse 0.1.19 (which, in some 
places, is not python 3 compatible).
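
To give a sense of the surface area involved, these are the kinds of sqlparse
calls whose behavior shifted between 0.1.19 and newer releases (illustrative
usage only, not the exact impala-shell call sites):
{noformat}
import sqlparse

sql = 'select 1; -- a comment\nselect "a\nb";'
statements = sqlparse.split(sql)                               # statement splitting
cleaned = sqlparse.format(statements[0], strip_comments=True)  # comment stripping
print(statements)
print(cleaned)
{noformat}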






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-9157) TestAuthorizationProvider.test_invalid_provider_flag fails due to Python 2.6 incompatible code

2019-11-26 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-9157.
-
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> TestAuthorizationProvider.test_invalid_provider_flag fails due to Python 2.6 
> incompatible code
> --
>
> Key: IMPALA-9157
> URL: https://issues.apache.org/jira/browse/IMPALA-9157
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.4.0
>Reporter: Joe McDonnell
>Assignee: David Knupp
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 3.4.0
>
>
> Our Centos 6 builds use Python 2.6, which means that it doesn't have 
> check_output (added in Python 2.7). This causes test failures in 
> test_provider.py:
>  
> {noformat}
> authorization/test_provider.py:70: in setup_method
> self.pre_test_cores = set([f for f in possible_cores if is_core_dump(f)])
> ../lib/python/impala_py_lib/helpers.py:64: in is_core_dump
> file_std_out = exec_local_command("file %s" % file_path)
> ../lib/python/impala_py_lib/helpers.py:34: in exec_local_command
> return subprocess.check_output(cmd.split())
> E   AttributeError: 'module' object has no attribute 'check_output'{noformat}
> This comes from the new code to handle intentional core dumps:
>  
> [https://github.com/apache/impala/blob/master/lib/python/impala_py_lib/helpers.py#L34]
> {noformat}
> def exec_local_command(cmd):
>   """  Executes a command for the local bash shell and return stdout as a 
> string.
>   Args:
>     cmd: command as a string
>   Return:
>     STDOUT
>   """
>   return subprocess.check_output(cmd.split())
> {noformat}
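
For reference, a python 2.6-compatible equivalent of the failing call (a sketch
of the usual workaround pattern, not necessarily the fix that was committed):
{noformat}
import subprocess

def exec_local_command_py26(cmd):
  """Runs cmd locally and returns its stdout; avoids check_output (2.7+ only)."""
  proc = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
  stdout, _ = proc.communicate()
  return stdout
{noformat}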



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-9129) Provide a way for negative tests to remove intentionally generated core dumps

2019-11-08 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-9129.
-
Resolution: Fixed

> Provide a way for negative tests to remove intentionally generated core dumps
> -
>
> Key: IMPALA-9129
> URL: https://issues.apache.org/jira/browse/IMPALA-9129
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Major
>
> Occasionally, tests (esp. custom cluster tests) will inject an error or set 
> some invalid config, expecting Impala to generate a core dump.
> We should have a general way for such tests to delete the bogus core dumps; 
> otherwise they can complicate/confuse later triaging of legitimate 
> test failures.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9129) Provide a way for negative tests to remove intentionally generated core dumps

2019-11-05 Thread David Knupp (Jira)
David Knupp created IMPALA-9129:
---

 Summary: Provide a way for negative tests to remove intentionally 
generated core dumps
 Key: IMPALA-9129
 URL: https://issues.apache.org/jira/browse/IMPALA-9129
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Reporter: David Knupp


Occasionally, tests (esp. custom cluster tests) will perform some action, 
expecting Impala to generate a core dump.

We should have a general way for such tests to delete the bogus core dumps; 
otherwise they can complicate/confuse later test triaging efforts.
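
A minimal sketch of what such a helper might look like (hypothetical names; the
real mechanism would live in the shared test infrastructure):
{noformat}
import glob
import os

def remove_new_core_dumps(pre_test_cores, post_test_cores):
  """Deletes core files that only appeared while the test was running."""
  for core in set(post_test_cores) - set(pre_test_cores):
    os.remove(core)

# usage: snapshot core files before the negative test, run it, then clean up
pre = set(glob.glob("/tmp/core.*"))
# ... run the test that intentionally makes Impala dump core ...
post = set(glob.glob("/tmp/core.*"))
remove_new_core_dumps(pre, post)
{noformat}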



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-1071) Install impala-shell from PyPI

2019-10-21 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-1071.
-
Resolution: Fixed

> Install impala-shell from PyPI
> --
>
> Key: IMPALA-1071
> URL: https://issues.apache.org/jira/browse/IMPALA-1071
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Clients
>Affects Versions: Impala 1.3.1
>Reporter: Shinya Okano
>Assignee: David Knupp
>Priority: Minor
>  Labels: shell
>
> I want to install impala-shell from PyPI (Python Package Index). 
> impala-shell appears to be made up mostly of Python modules, plus a small 
> shell script.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-8877) CatalogException: Table modified while operation was in progress, aborting execution.

2019-08-19 Thread David Knupp (Jira)
David Knupp created IMPALA-8877:
---

 Summary: CatalogException: Table  modified while operation 
was in progress, aborting execution.
 Key: IMPALA-8877
 URL: https://issues.apache.org/jira/browse/IMPALA-8877
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Affects Versions: Impala 3.3.0
Reporter: David Knupp
 Attachments: catalogd.INFO.tar.gz, impalad.INFO.tar.gz

This was hit while running the stress tests to get a baseline on a deployed 
cluster.

/* Mem: 12850 MB. Coordinator: quasar-mzmnbe-6.vpc.cloudera.com. */
COMPUTE STATS catalog_sales

{noformat}
Query (id=924a50178a5a6146:29d58a73)
  Summary
Session ID: 5543fb9029e2b71f:f446381b1f59ed81
Session Type: HIVESERVER2
HiveServer2 Protocol Version: V6
Start Time: 2019-08-19 01:26:07.292866000
End Time: 2019-08-19 01:26:27.248053000
Query Type: DDL
Query State: EXCEPTION
Query Status: CatalogException: Table 
'tpcds_300_decimal_parquet.catalog_sales' was modified while operation was in 
progress, aborting execution.
Impala Version: impalad version 3.3.0-SNAPSHOT RELEASE (build 
df3e7c051e2641524fc53a0cd07c2a14decd55f7)
User: syst...@vpc.cloudera.com
Connected User: syst...@vpc.cloudera.com
Delegated User: 
Network Address: :::10.65.6.19:39174
Default Db: tpcds_300_decimal_parquet
Sql Statement: /* Mem: 12850 MB. Coordinator: 
quasar-mzmnbe-6.vpc.cloudera.com. */
COMPUTE STATS catalog_sales
Coordinator: quasar-mzmnbe-6.vpc.cloudera.com:22000
Query Options (set by configuration): 
ABORT_ON_ERROR=1,MEM_LIMIT=13474201600,MT_DOP=4,EXEC_TIME_LIMIT_S=2147483647,TIMEZONE=America/Los_Angeles,DEFAULT_FILE_FORMAT=4,DEFAULT_TRANSACTIONAL_TYPE=1
Query Options (set by configuration and planner): 
ABORT_ON_ERROR=1,MEM_LIMIT=13474201600,MT_DOP=4,EXEC_TIME_LIMIT_S=2147483647,TIMEZONE=America/Los_Angeles,DEFAULT_FILE_FORMAT=4,DEFAULT_TRANSACTIONAL_TYPE=1
DDL Type: COMPUTE_STATS
Query Compilation
  Metadata of all 1 tables cached: 5.62s (5622372318)
  Analysis finished: 5.62s (5622560027)
  Authorization finished (noop): 5.62s (5622568284)
  Retried query planning due to inconsistent metadata 7 of 40 times: 
Catalog object TCatalogObject(type:TABLE, catalog_version:94204, 
table:TTable(db_name:tpcds_300_decimal_parquet, tbl_name:catalog_sales)) 
changed version between accesses.: 5.95s (5949859598)
  Planning finished: 5.95s (5949861145)
Query Timeline
  Query submitted: 0ns (0)
  Planning finished: 5.95s (5950024020)
  Child queries finished: 17.85s (17849072057)
  Rows available: 19.82s (19825080035)
  Unregister query: 19.95s (19955080560)
Frontend
  - CatalogFetch.ColumnStats.Misses: 34 (34)
  - CatalogFetch.ColumnStats.Requests: 34 (34)
  - CatalogFetch.ColumnStats.Time: 0 (0)
  - CatalogFetch.Config.Hits: 1 (1)
  - CatalogFetch.Config.Requests: 1 (1)
  - CatalogFetch.Config.Time: 0 (0)
  - CatalogFetch.DatabaseList.Hits: 8 (8)
  - CatalogFetch.DatabaseList.Requests: 8 (8)
  - CatalogFetch.DatabaseList.Time: 0 (0)
  - CatalogFetch.PartitionLists.Misses: 1 (1)
  - CatalogFetch.PartitionLists.Requests: 1 (1)
  - CatalogFetch.PartitionLists.Time: 7 (7)
  - CatalogFetch.Partitions.Hits: 1837 (1837)
  - CatalogFetch.Partitions.Misses: 1837 (1837)
  - CatalogFetch.Partitions.Requests: 3674 (3674)
  - CatalogFetch.Partitions.Time: 325 (325)
  - CatalogFetch.RPCs.Bytes: 4.7 MiB (4936030)
  - CatalogFetch.RPCs.Requests: 22 (22)
  - CatalogFetch.RPCs.Time: 343 (343)
  - CatalogFetch.TableNames.Hits: 4 (4)
  - CatalogFetch.TableNames.Misses: 4 (4)
  - CatalogFetch.TableNames.Requests: 8 (8)
  - CatalogFetch.TableNames.Time: 0 (0)
  - CatalogFetch.Tables.Misses: 8 (8)
  - CatalogFetch.Tables.Requests: 8 (8)
  - CatalogFetch.Tables.Time: 74 (74)
  - InactiveTotalTime: 0ns (0)
  - TotalTime: 0ns (0)
  ImpalaServer
- CatalogOpExecTimer: 1.97s (1972007962)
- ClientFetchWaitTimer: 0ns (0)
- InactiveTotalTime: 0ns (0)
- RowMaterializationTimer: 0ns (0)
- TotalTime: 0ns (0)
  Child Queries
Table Stats Query (id=db4821e4aa5bb04d:d4a5ae45)
Column Stats Query (id=0444367557e3496d:f9435111)
{noformat}




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (IMPALA-8558) test_char_format failing on deployed clusters because chars_formats table does not exist

2019-05-15 Thread David Knupp (JIRA)
David Knupp created IMPALA-8558:
---

 Summary: test_char_format failing on deployed clusters because 
chars_formats table does not exist
 Key: IMPALA-8558
 URL: https://issues.apache.org/jira/browse/IMPALA-8558
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.3.0
Reporter: David Knupp


This issue is showing up for the functional, functional_parquet, and 
functional_orc_def databases. It's not immediately clear why this table wasn't created 
during data load. But for example:

*Stacktrace*
{noformat}
query_test/test_chars.py:75: in test_char_format
self.run_test_case('QueryTest/chars-formats', vector)
common/impala_test_suite.py:512: in run_test_case
result = self.__execute_query(target_impalad_client, query, user=user)
common/impala_test_suite.py:746: in __execute_query
return impalad_client.execute(query, user=user)
common/impala_connection.py:180: in execute
return self.__beeswax_client.execute(sql_stmt, user=user)
beeswax/impala_beeswax.py:187: in execute
handle = self.__execute_query(query_string.strip(), user=user)
beeswax/impala_beeswax.py:362: in __execute_query
handle = self.execute_query_async(query_string, user=user)
beeswax/impala_beeswax.py:356: in execute_query_async
handle = self.__do_rpc(lambda: self.imp_service.query(query,))
beeswax/impala_beeswax.py:516: in __do_rpc
raise ImpalaBeeswaxException(self.__build_error_message(b), b)
E   ImpalaBeeswaxException: ImpalaBeeswaxException:
EINNER EXCEPTION: 
EMESSAGE: AnalysisException: Could not resolve table reference: 
'chars_formats'
{noformat}

*Standard Error*
{noformat}
SET 
client_identifier=query_test/test_chars.py::TestCharFormats::()::test_char_format[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table_format:text;
-- connecting to: quasar-nsgjqi-4.vpc.cloudera.com:21000
-- connecting to quasar-nsgjqi-4.vpc.cloudera.com:21050 with impyla
-- 2019-05-15 00:05:55,759 INFO MainThread: Closing active operation
SET 
client_identifier=query_test/test_chars.py::TestCharFormats::()::test_char_format[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table_format:text;
-- executing against quasar-nsgjqi-4.vpc.cloudera.com:21000
use functional;

-- 2019-05-15 00:05:55,788 INFO MainThread: Started query 
f845123ce97ab647:0560abac
SET 
client_identifier=query_test/test_chars.py::TestCharFormats::()::test_char_format[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table_format:text;
SET batch_size=0;
SET num_nodes=0;
SET disable_codegen_rows_threshold=0;
SET disable_codegen=False;
SET abort_on_error=1;
SET exec_single_node_rows_threshold=0;
-- executing against quasar-nsgjqi-4.vpc.cloudera.com:21000
select * from chars_formats order by vc;

{noformat}






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-8556) test_large_num_partitions failing on deployed clusters because scale_db was not created

2019-05-15 Thread David Knupp (JIRA)
David Knupp created IMPALA-8556:
---

 Summary: test_large_num_partitions failing on deployed clusters 
because scale_db was not created
 Key: IMPALA-8556
 URL: https://issues.apache.org/jira/browse/IMPALA-8556
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.3.0
Reporter: David Knupp


It's not immediately clear why scale_db is not being created during data load.

catalog_service/test_large_num_partitions.py:41: in test_list_partitions

*Stacktrace*
{noformat}
catalog_service/test_large_num_partitions.py:41: in test_list_partitions
result = self.client.execute("show table stats %s" % full_tbl_name)
common/impala_connection.py:180: in execute
return self.__beeswax_client.execute(sql_stmt, user=user)
beeswax/impala_beeswax.py:187: in execute
handle = self.__execute_query(query_string.strip(), user=user)
beeswax/impala_beeswax.py:362: in __execute_query
handle = self.execute_query_async(query_string, user=user)
beeswax/impala_beeswax.py:356: in execute_query_async
handle = self.__do_rpc(lambda: self.imp_service.query(query,))
beeswax/impala_beeswax.py:516: in __do_rpc
raise ImpalaBeeswaxException(self.__build_error_message(b), b)
E   ImpalaBeeswaxException: ImpalaBeeswaxException:
EINNER EXCEPTION: 
EMESSAGE: AnalysisException: Database does not exist: scale_db
{noformat}

catalog_service/test_large_num_partitions.py:62: in 
test_predicates_on_partition_attributes

*Stacktrace*
{noformat}
catalog_service/test_large_num_partitions.py:62: in 
test_predicates_on_partition_attributes
result = self.client.execute("select * from %s where j = 1" % full_tbl_name)
common/impala_connection.py:180: in execute
return self.__beeswax_client.execute(sql_stmt, user=user)
beeswax/impala_beeswax.py:187: in execute
handle = self.__execute_query(query_string.strip(), user=user)
beeswax/impala_beeswax.py:362: in __execute_query
handle = self.execute_query_async(query_string, user=user)
beeswax/impala_beeswax.py:356: in execute_query_async
handle = self.__do_rpc(lambda: self.imp_service.query(query,))
beeswax/impala_beeswax.py:516: in __do_rpc
raise ImpalaBeeswaxException(self.__build_error_message(b), b)
E   ImpalaBeeswaxException: ImpalaBeeswaxException:
EINNER EXCEPTION: 
EMESSAGE: AnalysisException: Could not resolve table reference: 
'scale_db.num_partitions_1234_blocks_per_partition_1'
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-8553) Several tests failing with connection errors on deployed clusters

2019-05-15 Thread David Knupp (JIRA)
David Knupp created IMPALA-8553:
---

 Summary: Several tests failing with connection errors on deployed 
clusters
 Key: IMPALA-8553
 URL: https://issues.apache.org/jira/browse/IMPALA-8553
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.3.0
Reporter: David Knupp
Assignee: Tim Armstrong


The errors look fairly similar. I suspect this commit introduced a regression:
https://github.com/apache/impala/commit/79c5f875

*Stacktrace*
{noformat}
metadata/test_hms_integration.py:66: in test_sanity
if IMPALA_TEST_CLUSTER_PROPERTIES.is_catalog_v2_cluster():
common/environ.py:307: in is_catalog_v2_cluster
flags = self._get_flags_from_web_ui(web_ui_url)
common/environ.py:295: in _get_flags_from_web_ui
response = requests.get(impala_url + "/varz?json")
../infra/python/env/lib/python2.7/site-packages/requests/api.py:69: in get
return request('get', url, params=params, **kwargs)
../infra/python/env/lib/python2.7/site-packages/requests/api.py:50: in request
response = session.request(method=method, url=url, **kwargs)
../infra/python/env/lib/python2.7/site-packages/requests/sessions.py:465: in 
request
resp = self.send(prep, **send_kwargs)
../infra/python/env/lib/python2.7/site-packages/requests/sessions.py:573: in 
send
r = adapter.send(request, **kwargs)
../infra/python/env/lib/python2.7/site-packages/requests/adapters.py:415: in 
send
raise ConnectionError(err, request=request)
E   ConnectionError: ('Connection aborted.', error(111, 'Connection refused'))
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-8552) impala-shell tests break on remote clusters if IMPALA_LOCAL_BUILD_VERSION is None

2019-05-15 Thread David Knupp (JIRA)
David Knupp created IMPALA-8552:
---

 Summary: impala-shell tests break on remote clusters if 
IMPALA_LOCAL_BUILD_VERSION is None
 Key: IMPALA-8552
 URL: https://issues.apache.org/jira/browse/IMPALA-8552
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.3.0
Reporter: David Knupp
Assignee: Tim Armstrong


This is a regression introduced by the commit: 
https://github.com/apache/impala/commit/b55d905

*Stacktrace*
{noformat}
shell/test_shell_commandline.py:33: in <module>
from util import (get_impalad_host_port, assert_var_substitution, 
run_impala_shell_cmd,
shell/util.py:42: in <module>
IMPALA_HOME, "shell/build", "impala-shell-" + IMPALA_LOCAL_BUILD_VERSION,
E   TypeError: cannot concatenate 'str' and 'NoneType' objects
{noformat}
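
One possible defensive guard (purely illustrative; the variable names are
hypothetical and this is not the committed fix) would be to fall back when the
build version is unset, e.g. on remote clusters:
{noformat}
import os

IMPALA_HOME = os.environ.get("IMPALA_HOME", "")
IMPALA_LOCAL_BUILD_VERSION = os.environ.get("IMPALA_LOCAL_BUILD_VERSION")

if IMPALA_LOCAL_BUILD_VERSION is not None:
  SHELL_BUILD_DIR = os.path.join(
      IMPALA_HOME, "shell/build", "impala-shell-" + IMPALA_LOCAL_BUILD_VERSION)
else:
  # Remote cluster: no local shell build; use whatever impala-shell is installed.
  SHELL_BUILD_DIR = None
{noformat}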




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IMPALA-8481) test_hbase_col_filter failing on deployed clusters due to permissions error

2019-05-02 Thread David Knupp (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-8481.
-
Resolution: Fixed

> test_hbase_col_filter failing on deployed clusters due to permissions error
> ---
>
> Key: IMPALA-8481
> URL: https://issues.apache.org/jira/browse/IMPALA-8481
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: David Knupp
>Priority: Critical
> Fix For: Impala 3.3.0
>
>
> When running test_hbase_queries against a deployed cluster, the default user 
> on the machine running the tests may not have the correct access permission 
> on the cluster, which causes this test to fail.
> {noformat}
> query_test/test_hbase_queries.py:89: in test_hbase_col_filter
> self.run_stmt_in_hive(add_data)
> common/impala_test_suite.py:800: in run_stmt_in_hive
> raise RuntimeError(stderr)
> [...]
> E   INFO  : Query ID = 
> hive_20190501001622_fa3a9f39-7d32-49da-ba1d-084911730a2f
> E   INFO  : Total jobs = 1
> E   INFO  : Starting task [Stage-0:DDL] in serial mode
> E   INFO  : Launching Job 1 out of 1
> E   INFO  : Starting task [Stage-1:MAPRED] in serial mode
> E   INFO  : Number of reduce tasks is set to 0 since there's no reduce 
> operator
> E   ERROR : Job Submission failed with exception 
> 'org.apache.hadoop.security.AccessControlException(Permission denied: 
> user=jenkins, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x...)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-8481) test_hbase_col_filter failing on deployed clusters due to permissions error

2019-05-01 Thread David Knupp (JIRA)
David Knupp created IMPALA-8481:
---

 Summary: test_hbase_col_filter failing on deployed clusters due to 
permissions error
 Key: IMPALA-8481
 URL: https://issues.apache.org/jira/browse/IMPALA-8481
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Reporter: David Knupp
 Fix For: Impala 3.3.0


When running test_hbase_queries against a deployed cluster, the default user on 
the machine running the tests may not have the correct access permission on the 
cluster, which causes this test to fail.
{noformat}
query_test/test_hbase_queries.py:89: in test_hbase_col_filter
self.run_stmt_in_hive(add_data)
common/impala_test_suite.py:800: in run_stmt_in_hive
raise RuntimeError(stderr)
[...]
E   INFO  : Query ID = hive_20190501001622_fa3a9f39-7d32-49da-ba1d-084911730a2f
E   INFO  : Total jobs = 1
E   INFO  : Starting task [Stage-0:DDL] in serial mode
E   INFO  : Launching Job 1 out of 1
E   INFO  : Starting task [Stage-1:MAPRED] in serial mode
E   INFO  : Number of reduce tasks is set to 0 since there's no reduce operator
E   ERROR : Job Submission failed with exception 
'org.apache.hadoop.security.AccessControlException(Permission denied: 
user=jenkins, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x...)
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-8465) hs2.test_json_endpoints.TestJsonEndpoints fails on deployed clusters

2019-04-26 Thread David Knupp (JIRA)
David Knupp created IMPALA-8465:
---

 Summary: hs2.test_json_endpoints.TestJsonEndpoints fails on 
deployed clusters
 Key: IMPALA-8465
 URL: https://issues.apache.org/jira/browse/IMPALA-8465
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.3.0
Reporter: David Knupp


Changes to tests.common.ImpalaCluster in [commit 
2ca7f8e7|https://github.com/apache/impala/commit/2ca7f8e7c0781a1914275b3506cf8a7748c44c85#diff-6fea89ad0e6c440b0373bb136d7510b5]
 introduced a regression in this test.
{noformat}
hs2/test_json_endpoints.py:51: in test_waiting_in_flight_queries
queries_json = self._get_json_queries(http_addr)
hs2/test_json_endpoints.py:33: in _get_json_queries
return cluster.impalads[0].service.get_debug_webpage_json("/queries")
E   IndexError: list index out of range
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-8393) setup-ranger step in create-load-data.sh breaks data load to real clusters

2019-04-05 Thread David Knupp (JIRA)
David Knupp created IMPALA-8393:
---

 Summary: setup-ranger step in create-load-data.sh breaks data load 
to real clusters
 Key: IMPALA-8393
 URL: https://issues.apache.org/jira/browse/IMPALA-8393
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.2.0, Impala 3.3.0
Reporter: David Knupp


{{localhost}} is hard-coded into the setup-ranger function that was recently 
added to create-load-data.sh, e.g.:

https://github.com/apache/impala/blame/master/testdata/bin/create-load-data.sh#L325

This works when testing on a mini-cluster, but breaks data load if setting up 
to run the functional test suite against an actual cluster. In that scenario, 
the host that runs the script is simply a test runner, with no locally running 
services.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-8346) Testcase builder for query planner introduced a data load regression on CDH clusters

2019-03-25 Thread David Knupp (JIRA)
David Knupp created IMPALA-8346:
---

 Summary: Testcase builder for query planner introduced a data load 
regression on CDH clusters
 Key: IMPALA-8346
 URL: https://issues.apache.org/jira/browse/IMPALA-8346
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 3.2.0, Impala 3.3.0
Reporter: David Knupp
Assignee: bharath v


The patch to address IMPALA-5872 introduced a new script into our data load 
process. This script has been tested against the single-node mini-cluster, but 
doesn't appear to run against actual (remote) clusters.
{noformat}
Starting Impala Shell without Kerberos authentication
Opened TCP connection to remote-coordinator-node.mycompany.com:21000
Connected to remote-coordinator-node.mycompany.com:21000
Server version: impalad version 3.2.0-cdh6.x-SNAPSHOT RELEASE (build 
2e55383eb86de20499e2f9327cd5bcbda6788e50)
Query: use `tpcds`
Query: use `tpcds`
Query: COPY TESTCASE TO '/test-warehouse/tpcds-testcase-data' -- start query 1 
in stream 0 using template query11.tpl
with year_total as (
 select c_customer_id customer_id
   ,c_first_name customer_first_name
   ,c_last_name customer_last_name
   ,c_preferred_cust_flag customer_preferred_cust_flag
   ,c_birth_country customer_birth_country
   ,c_login customer_login
   ,c_email_address customer_email_address
   ,d_year dyear
   ,sum(ss_ext_list_price-ss_ext_discount_amt) year_total
   ,'s' sale_type
 from customer
 ,store_sales
 ,date_dim
 where c_customer_sk = ss_customer_sk
   and ss_sold_date_sk = d_date_sk
 group by c_customer_id
 ,c_first_name
 ,c_last_name
 ,c_preferred_cust_flag
 ,c_birth_country
 ,c_login
 ,c_email_address
 ,d_year
 union all
 select c_customer_id customer_id
   ,c_first_name customer_first_name
   ,c_last_name customer_last_name
   ,c_preferred_cust_flag customer_preferred_cust_flag
   ,c_birth_country customer_birth_country
   ,c_login customer_login
   ,c_email_address customer_email_address
   ,d_year dyear
   ,sum(ws_ext_list_price-ws_ext_discount_amt) year_total
   ,'w' sale_type
 from customer
 ,web_sales
 ,date_dim
 where c_customer_sk = ws_bill_customer_sk
   and ws_sold_date_sk = d_date_sk
 group by c_customer_id
 ,c_first_name
 ,c_last_name
 ,c_preferred_cust_flag
 ,c_birth_country
 ,c_login
 ,c_email_address
 ,d_year
 )
  select
  t_s_secyear.customer_id
 ,t_s_secyear.customer_first_name
 ,t_s_secyear.customer_last_name
 ,t_s_secyear.customer_email_address
 from year_total t_s_firstyear
 ,year_total t_s_secyear
 ,year_total t_w_firstyear
 ,year_total t_w_secyear
 where t_s_secyear.customer_id = t_s_firstyear.customer_id
 and t_s_firstyear.customer_id = t_w_secyear.customer_id
 and t_s_firstyear.customer_id = t_w_firstyear.customer_id
 and t_s_firstyear.sale_type = 's'
 and t_w_firstyear.sale_type = 'w'
 and t_s_secyear.sale_type = 's'
 and t_w_secyear.sale_type = 'w'
 and t_s_firstyear.dyear = 2001
 and t_s_secyear.dyear = 2001+1
 and t_w_firstyear.dyear = 2001
 and t_w_secyear.dyear = 2001+1
 and t_s_firstyear.year_total > 0
 and t_w_firstyear.year_total > 0
 and case when t_w_firstyear.year_total > 0 then t_w_secyear.year_total 
/ t_w_firstyear.year_total else 0.0 end
 > case when t_s_firstyear.year_total > 0 then 
t_s_secyear.year_total / t_s_firstyear.year_total else 0.0 end
 order by t_s_secyear.customer_id
 ,t_s_secyear.customer_first_name
 ,t_s_secyear.customer_last_name
 ,t_s_secyear.customer_email_address
limit 100
Query submitted at: 2019-03-23 23:40:12 (Coordinator: 
http://remote-coordinator-node.mycompany.com:25000)
ERROR: ImpalaRuntimeException: Error writing test case output to file: 
hdfs://namenode.mycompany.com:8020/test-warehouse/tpcds-testcase-data/impala-testcase-data-6430bc87-5337-4e65-b6aa-d059088f3a4b
CAUSED BY: AccessControlException: Permission denied: user=impala, 
access=WRITE, inode="/test-warehouse/tpcds-testcase-data":hdfs:hdfs:drwxr-xr-x
[...]
Could not execute command: COPY TESTCASE TO 
'/test-warehouse/tpcds-testcase-data' -- start query 1 in stream 0 using 
template query11.tpl
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-8091) Kudu SIGABRT'ed during dataload on Impala release build

2019-01-17 Thread David Knupp (JIRA)
David Knupp created IMPALA-8091:
---

 Summary: Kudu SIGABRT'ed during dataload on Impala release build
 Key: IMPALA-8091
 URL: https://issues.apache.org/jira/browse/IMPALA-8091
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.2.0
Reporter: David Knupp
Assignee: Thomas Tauber-Marshall


*Console log*:
{noformat}
19:10:05 2019-01-16 19:10:05,461 - archive_core_dumps - INFO - Found core 
files: ['./core.1547693849.30799.kudu-master', 
'./core.1547693849.30783.kudu-tserver', './core.1547693849.30767.kudu-tserver', 
'./core.1547693849.30808.kudu-tserver']
19:10:05 2019-01-16 19:10:05,649 - archive_core_dumps - INFO - [New LWP 30799]
19:10:05 [New LWP 30834]
19:10:05 [New LWP 30835]
19:10:05 [New LWP 30838]
19:10:05 [New LWP 30837]
19:10:05 [New LWP 30836]
19:10:05 Core was generated by 
`/data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/c'.
19:10:05 Program terminated with signal SIGABRT, Aborted.
19:10:05 #0  0x7f9a2cc611f7 in ?? ()
19:10:05 
19:10:05 2019-01-16 19:10:05,650 - archive_core_dumps - INFO - Found binary 
path through GDB: 
/data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/c
19:10:05 2019-01-16 19:10:05,893 - archive_core_dumps - WARNING - Failed to 
determine binary because multiple candidate binaries were found and none of 
their paths contained 'latest' to disambiguate:
19:10:05 Core:./core.1547693849.30799.kudu-master
19:10:05 
Binaries:['./testdata/cluster/node_templates/common/etc/init.d/kudu-master', 
'./testdata/cluster/cdh6/node-1/etc/init.d/kudu-master']
19:10:05 
19:10:05 2019-01-16 19:10:05,917 - archive_core_dumps - INFO - [New LWP 30783]
19:10:05 [New LWP 30810]
19:10:05 [New LWP 30812]
19:10:05 [New LWP 30811]
19:10:05 [New LWP 30824]
19:10:05 [New LWP 30820]
19:10:05 Core was generated by 
`/data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/c'.
19:10:05 Program terminated with signal SIGABRT, Aborted.
19:10:05 #0  0x7f0b81fb11f7 in ?? ()
{noformat}
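
For context, the WARNING above comes from a heuristic that matches a core file to a binary by name and falls back to preferring a path containing 'latest'. A minimal sketch of that idea (hypothetical helper, not the actual archive_core_dumps code):
{noformat}
# Hypothetical sketch of the "prefer a path containing 'latest'" disambiguation
# described in the log above; not the actual archive_core_dumps implementation.
import os


def pick_binary(core_path, candidate_binaries):
    """Match a core named core.<ts>.<pid>.<binary> to one candidate binary."""
    binary_name = os.path.basename(core_path).split('.')[-1]
    matches = [b for b in candidate_binaries
               if os.path.basename(b) == binary_name]
    if len(matches) == 1:
        return matches[0]
    # Ambiguous: prefer a path containing 'latest', otherwise give up.
    preferred = [b for b in matches if 'latest' in b]
    return preferred[0] if len(preferred) == 1 else None


print(pick_binary(
    './core.1547693849.30799.kudu-master',
    ['./testdata/cluster/node_templates/common/etc/init.d/kudu-master',
     './testdata/cluster/cdh6/node-1/etc/init.d/kudu-master']))
# -> None, mirroring the WARNING in the console log above
{noformat}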

*Backtraces*:
{noformat}
CORE: ./core.1547693849.30799.kudu-master
BINARY: ./be/build/latest/service/impalad
Core was generated by 
`/data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/c'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7f9a2cc611f7 in ?? ()
#0  0x7f9a2cc611f7 in ?? ()
#1  0x7f9a2cc628e8 in ?? ()
#2  0x0020 in ?? ()
#3  0x in ?? ()

CORE: ./core.1547693849.30783.kudu-tserver
BINARY: ./be/build/latest/service/impalad
Core was generated by 
`/data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/c'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7f0b81fb11f7 in ?? ()
#0  0x7f0b81fb11f7 in ?? ()
#1  0x7f0b81fb28e8 in ?? ()
#2  0x0020 in ?? ()
#3  0x in ?? ()

CORE: ./core.1547693849.30767.kudu-tserver
BINARY: ./be/build/latest/service/impalad
Core was generated by 
`/data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/c'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7f60b1e2f1f7 in ?? ()
#0  0x7f60b1e2f1f7 in ?? ()
#1  0x7f60b1e308e8 in ?? ()
#2  0x0020 in ?? ()
#3  0x in ?? ()

CORE: ./core.1547693849.30808.kudu-tserver
BINARY: ./be/build/latest/service/impalad
Core was generated by 
`/data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/c'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7fa3cb5591f7 in ?? ()
#0  0x7fa3cb5591f7 in ?? ()
#1  0x7fa3cb55a8e8 in ?? ()
#2  0x0020 in ?? ()
#3  0x in ?? ()
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-8090) disk-io-mgr-test SIGABRT'ed on centos6 exhaustive test run

2019-01-17 Thread David Knupp (JIRA)
David Knupp created IMPALA-8090:
---

 Summary: disk-io-mgr-test SIGABRT'ed on centos6 exhaustive test run
 Key: IMPALA-8090
 URL: https://issues.apache.org/jira/browse/IMPALA-8090
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.2.0
Reporter: David Knupp
Assignee: Tim Armstrong


*Test output*:
{noformat}
45/99 Test #45: disk-io-mgr-test .***Exception: Other 43.29 sec
Turning perftools heap leak checking off
[==] Running 25 tests from 1 test case.
[--] Global test environment set-up.
[--] 25 tests from DiskIoMgrTest
[ RUN  ] DiskIoMgrTest.SingleWriter
19/01/16 15:57:09 INFO util.JvmPauseMonitor: Starting JVM pause monitor
[   OK ] DiskIoMgrTest.SingleWriter (3407 ms)
[ RUN  ] DiskIoMgrTest.InvalidWrite
[   OK ] DiskIoMgrTest.InvalidWrite (281 ms)
[ RUN  ] DiskIoMgrTest.WriteErrors
[   OK ] DiskIoMgrTest.WriteErrors (235 ms)
[ RUN  ] DiskIoMgrTest.SingleWriterCancel
[   OK ] DiskIoMgrTest.SingleWriterCancel (1165 ms)
[ RUN  ] DiskIoMgrTest.SingleReader
[   OK ] DiskIoMgrTest.SingleReader (5835 ms)
[ RUN  ] DiskIoMgrTest.SingleReaderSubRanges
[   OK ] DiskIoMgrTest.SingleReaderSubRanges (16404 ms)
[ RUN  ] DiskIoMgrTest.AddScanRangeTest
[   OK ] DiskIoMgrTest.AddScanRangeTest (1210 ms)
[ RUN  ] DiskIoMgrTest.SyncReadTest
*** Check failure stack trace: ***
@  0x4825dcc
@  0x4827671
@  0x48257a6
@  0x4828d6d
@  0x1af39ec
@  0x1ae90a4
@  0x1ac30ea
@  0x1accad3
@  0x1acc660
@  0x1acbf3e
@  0x1acb62d
@  0x1b03671
@  0x1f79988
@  0x1f82b60
@  0x1f82a84
@  0x1f82a47
@  0x3751579
@   0x3ea4807850
@   0x3ea44e894c
Wrote minidump to 
/data/jenkins/workspace/<...>/repos/Impala/logs/be_tests/minidumps/disk-io-mgr-test/5bbf76f7-e5d6-4ac9-bdae9d9b-065c32ec.dmp
{noformat}

*Error*:
{noformat}
Operating system: Linux
  0.0.0 Linux 2.6.32-358.14.1.el6.centos.plus.x86_64 #1 SMP Tue 
Jul 16 21:33:24 UTC 2013 x86_64
CPU: amd64
 family 6 model 45 stepping 7
 8 CPUs

GPU: UNKNOWN

Crash reason:  SIGABRT
Crash address: 0x4522fa1
Process uptime: not available

Thread 205 (crashed)
 0  libc-2.12.so + 0x328e5
rax = 0x   rdx = 0x0006
rcx = 0x   rbx = 0x06adf9c0
rsi = 0x0563   rdi = 0x2fa1
rbp = 0x7f8009b8ffe0   rsp = 0x7f8009b8fc78
 r8 = 0x7f8009b8fd00r9 = 0x0563
r10 = 0x0008   r11 = 0x0202
r12 = 0x06adfa40   r13 = 0x001f
r14 = 0x06ae7384   r15 = 0x06adf9c0
rip = 0x003ea44328e5
Found by: given as instruction pointer in context
 1  libc-2.12.so + 0x340c5
rbp = 0x7f8009b8ffe0   rsp = 0x7f8009b8fc80
rip = 0x003ea44340c5
Found by: stack scanning
 2  disk-io-mgr-test!boost::_bi::bind_t, 
boost::_bi::list2, 
boost::_bi::value > >::operator()() [bind_template.hpp 
: 20 + 0x21]
rbp = 0x7f8009b8ffe0   rsp = 0x7f8009b8fc88
rip = 0x01acbf3e
Found by: stack scanning
 3  disk-io-mgr-test!google::LogMessage::Flush() + 0x157
rbx = 0x0007   rbp = 0x06adf980
rsp = 0x7f8009b8fff0   rip = 0x048257a7
Found by: call frame info
 4  disk-io-mgr-test!google::LogMessageFatal::~LogMessageFatal() + 0xe
rbx = 0x7f8009b90110   rbp = 0x7f8009b903f0
rsp = 0x7f8009b90070   r12 = 0x0001
r13 = 0x06aee8b8   r14 = 0x0c213538
r15 = 0x0007   rip = 0x04828d6e
Found by: call frame info
 5  disk-io-mgr-test!impala::io::LocalFileReader::ReadFromPos(long, unsigned 
char*, long, long*, bool*) [local-file-reader.cc : 67 + 0x10]
rbx = 0x0001   rbp = 0x7f8009b903f0
rsp = 0x7f8009b90090   r12 = 0x0001
r13 = 0x06aee8b8   r14 = 0x0c213538
r15 = 0x0007   rip = 0x01af39ed
Found by: call frame info
 6  disk-io-mgr-test!impala::io::ScanRange::DoRead(int) [scan-range.cc : 219 + 
0x5b]
rbx = 0x0c4f71e0   rbp = 0x7f8009b90620
rsp = 0x7f8009b90400   r12 = 0x01af36e4
r13 = 0x000d   r14 = 0x0c213538
r15 = 0x0007   rip = 0x01ae90a5
Found by: call frame info
 7  
disk-io-mgr-test!impala::io::DiskQueue::DiskThreadLoop(impala::io::DiskIoMgr*) 
[disk-io-mgr.cc : 425 + 0x17]
rbx = 0x0c0e0f00   rbp = 0x7f8009b906c0
rsp = 0x7f8009b90630   r12 = 0x7fff99de21c0
r13 = 0x7fff99de1a90   r14 = 0x0

[jira] [Created] (IMPALA-8089) Sporadic upstream jenkins failures with "ERROR in bin/run-all-tests.sh at line 237: pkill -P $TIMEOUT_PID"

2019-01-15 Thread David Knupp (JIRA)
David Knupp created IMPALA-8089:
---

 Summary: Sporadic upstream jenkins failures with "ERROR in 
bin/run-all-tests.sh at line 237: pkill -P $TIMEOUT_PID"
 Key: IMPALA-8089
 URL: https://issues.apache.org/jira/browse/IMPALA-8089
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.2.0
Reporter: David Knupp
Assignee: Bikramjeet Vig


Example failure at:
https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/4113/consoleFull
{noformat}
04:32:09 ERROR in bin/run-all-tests.sh at line 237: pkill -P $TIMEOUT_PID
04:32:09 Generated: 
/home/ubuntu/Impala/logs/extra_junit_xml_logs/generate_junitxml.buildall.run-all-tests.20190115_04_32_09.xml
04:32:09 + RET_CODE=1
{noformat}
Still looking, but I don't see any other obvious issues right now.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-8078) test_corrupt_stats failing on exhaustive builds

2019-01-14 Thread David Knupp (JIRA)
David Knupp created IMPALA-8078:
---

 Summary: test_corrupt_stats failing on exhaustive builds
 Key: IMPALA-8078
 URL: https://issues.apache.org/jira/browse/IMPALA-8078
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.2.0
Reporter: David Knupp


Stacktrace:
{noformat}
metadata/test_compute_stats.py:222: in test_corrupt_stats
self.run_test_case('QueryTest/corrupt-stats', vector, unique_database)
common/impala_test_suite.py:497: in run_test_case
self.__verify_results_and_errors(vector, test_section, result, use_db)
common/impala_test_suite.py:359: in __verify_results_and_errors
replace_filenames_with_placeholder)
common/test_result_verifier.py:449: in verify_raw_results
VERIFIER_MAP[verifier](expected, actual)
common/test_result_verifier.py:239: in verify_query_result_is_subset
assert expected_literal_strings <= actual_literal_strings
E   assert Items in expected results not found in actual results:
E '   partitions=1/2 files=1 size=24B row-size=0B cardinality=0'
{noformat}
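
For readers unfamiliar with the verifier, the failing {{<=}} is Python's set-subset operator: every expected plan line must appear somewhere in the actual output. A minimal sketch of the pattern (illustrative only, not the actual test_result_verifier code):
{noformat}
# Illustrative subset-style result check; assumes both sides are collections
# of literal row strings. Not the actual verify_query_result_is_subset code.
def verify_is_subset(expected_rows, actual_rows):
    expected, actual = set(expected_rows), set(actual_rows)
    missing = expected - actual
    assert expected <= actual, (
        "Items in expected results not found in actual results:\n" +
        "\n".join(sorted(missing)))


try:
    verify_is_subset(
        ['   partitions=1/2 files=1 size=24B row-size=0B cardinality=0'],
        ['   partitions=2/2 files=2 size=48B row-size=0B cardinality=2'])
except AssertionError as e:
    print(e)  # lists the missing plan line, as in the stacktrace above
{noformat}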




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-7804) Various scanner tests intermittently failing on S3 on different runs

2018-11-02 Thread David Knupp (JIRA)
David Knupp created IMPALA-7804:
---

 Summary: Various scanner tests intermittently failing on S3 on 
different runs
 Key: IMPALA-7804
 URL: https://issues.apache.org/jira/browse/IMPALA-7804
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.1.0
Reporter: David Knupp


The failures have to do with getting AWS client credentials.

*query_test/test_scanners.py:696: in test_decimal_encodings*

_Stacktrace_
{noformat}
query_test/test_scanners.py:696: in test_decimal_encodings
self.run_test_case('QueryTest/parquet-decimal-formats', vector, 
unique_database)
common/impala_test_suite.py:496: in run_test_case
self.__verify_results_and_errors(vector, test_section, result, use_db)
common/impala_test_suite.py:358: in __verify_results_and_errors
replace_filenames_with_placeholder)
common/test_result_verifier.py:438: in verify_raw_results
VERIFIER_MAP[verifier](expected, actual)
common/test_result_verifier.py:260: in verify_query_result_is_equal
assert expected_results == actual_results
E   assert Comparing QueryTestResults (expected vs actual):
E -255.00,-255.00,-255.00 == -255.00,-255.00,-255.00
E -255.00,-255.00,-255.00 != -65535.00,-65535.00,-65535.00
E -65535.00,-65535.00,-65535.00 != -999.99,-999.99,-999.99
E -65535.00,-65535.00,-65535.00 != 
0.00,-.99,-.99
E -999.99,-999.99,-999.99 != 0.00,0.00,0.00
E -999.99,-999.99,-999.99 != 
0.00,.99,.99
E 0.00,-.99,-.99 != 
255.00,255.00,255.00
E 0.00,-.99,-.99 != 
65535.00,65535.00,65535.00
E 0.00,0.00,0.00 != 999.99,999.99,999.99
E 0.00,0.00,0.00 != None
E 0.00,.99,.99 != None
E 0.00,.99,.99 != None
E 255.00,255.00,255.00 != None
E 255.00,255.00,255.00 != None
E 65535.00,65535.00,65535.00 != None
E 65535.00,65535.00,65535.00 != None
E 999.99,999.99,999.99 != None
E 999.99,999.99,999.99 != None
E Number of rows returned (expected vs actual): 18 != 9
{noformat}

_Standard Error_
{noformat}
SET sync_ddl=False;
-- executing against localhost:21000
DROP DATABASE IF EXISTS `test_huge_num_rows_76a09ef1` CASCADE;

-- 2018-11-01 09:42:41,140 INFO MainThread: Started query 
4c4bc0e7b69d7641:130ffe73
SET sync_ddl=False;
-- executing against localhost:21000
CREATE DATABASE `test_huge_num_rows_76a09ef1`;

-- 2018-11-01 09:42:42,402 INFO MainThread: Started query 
e34d714d6a62cba1:2a8544d0
-- 2018-11-01 09:42:42,405 INFO MainThread: Created database 
"test_huge_num_rows_76a09ef1" for test ID 
"query_test/test_scanners.py::TestParquet::()::test_huge_num_rows[protocol: 
beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 'abort_on_error': 
1, 'debug_action': '-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@1.0', 
'exec_single_node_rows_threshold': 0} | table_format: parquet/none]"
18/11/01 09:42:43 DEBUG s3a.S3AFileSystem: Initializing S3AFileSystem for 
impala-test-uswest2-1
18/11/01 09:42:43 DEBUG s3a.S3AUtils: Propagating entries under 
fs.s3a.bucket.impala-test-uswest2-1.
18/11/01 09:42:43 WARN impl.MetricsConfig: Cannot locate configuration: tried 
hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
18/11/01 09:42:43 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period 
at 10 second(s).
18/11/01 09:42:43 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
started
18/11/01 09:42:43 DEBUG s3a.S3AUtils: For URI s3a://impala-test-uswest2-1/, 
using credentials AWSCredentialProviderList: BasicAWSCredentialsProvider 
EnvironmentVariableCredentialsProvider 
com.amazonaws.auth.InstanceProfileCredentialsProvider@15bbf42f
18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.connection.maximum is 1500
18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.attempts.maximum is 20
18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of 
fs.s3a.connection.establish.timeout is 5000
18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.connection.timeout is 
20
18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.socket.send.buffer is 8192
18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.socket.recv.buffer is 8192
18/11/01 09:42:43 DEBUG s3a.S3AFileSystem: Using User-Agent: Hadoop 
3.0.0-cdh6.x-SNAPSHOT
18/11/01 09:42:44 DEBUG s3a.S3AUtils: Value of fs.s3a.paging.maximum is 5000
18/11/01 09:42:44 DEBUG s3a.S3AUtils: Value of fs.s3a.block.size is 33554432
18/11/01 09:42:44 DEBUG s3a.S3AUtils: Value of fs.s3a.readahead.range is 65536
18/11/01 09:42:44 DEBUG

[jira] [Created] (IMPALA-7803) PlannerTest.testHbase failing on centos6 exhaustive test run

2018-11-02 Thread David Knupp (JIRA)
David Knupp created IMPALA-7803:
---

 Summary: PlannerTest.testHbase failing on centos6 exhaustive test 
run
 Key: IMPALA-7803
 URL: https://issues.apache.org/jira/browse/IMPALA-7803
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.1.0
Reporter: David Knupp


*Error Message*
{noformat}
section SCANRANGELOCATIONS of query:
select * from functional_hbase.stringids
where id > '5'
and tinyint_col = 5
Actual does not match expected result:
  HBASE KEYRANGE 5\0:

NODE 0:

Expected:
  HBASE KEYRANGE 5\0:7
  HBASE KEYRANGE 7:
NODE 0:
section SCANRANGELOCATIONS of query:
select * from functional_hbase.stringids
where id >= '5'
and tinyint_col = 5
Actual does not match expected result:
  HBASE KEYRANGE 5:
^^
NODE 0:

Expected:
  HBASE KEYRANGE 5:7
  HBASE KEYRANGE 7:
NODE 0:
section SCANRANGELOCATIONS of query:
select * from functional_hbase.stringids
where id > '4' and id <= '5'
and tinyint_col = 5
Actual does not match expected result:
  HBASE KEYRANGE 4\0:5
^^
  HBASE KEYRANGE 5:5\0
NODE 0:

Expected:
  HBASE KEYRANGE 4\0:5\0
NODE 0:
section SCANRANGELOCATIONS of query:
select * from functional_hbase.stringids
where id >= '4' and id <= '5'
and tinyint_col = 5
Actual does not match expected result:
  HBASE KEYRANGE 4:5

  HBASE KEYRANGE 5:5\0
NODE 0:

Expected:
  HBASE KEYRANGE 4:5\0
NODE 0:
section SCANRANGELOCATIONS of query:
select * from functional_hbase.stringids
where string_col = '4' and tinyint_col = 5 and id >= '4' and id <= '5'
Actual does not match expected result:
  HBASE KEYRANGE 4:5

  HBASE KEYRANGE 5:5\0
NODE 0:

Expected:
  HBASE KEYRANGE 4:5\0
NODE 0:
section SCANRANGELOCATIONS of query:
select * from functional_hbase.stringids
where string_col = '4' and tinyint_col = 5
  and id >= concat('', '4') and id <= concat('5', '')
Actual does not match expected result:
  HBASE KEYRANGE 4:5

  HBASE KEYRANGE 5:5\0
NODE 0:

Expected:
  HBASE KEYRANGE 4:5\0
NODE 0:
section SCANRANGELOCATIONS of query:
select * from functional_hbase.alltypesagg
where bigint_col is not null and bool_col = true
Actual does not match expected result:
  HBASE KEYRANGE 3:5

  HBASE KEYRANGE 5:
  HBASE KEYRANGE :3
NODE 0:

Expected:
  HBASE KEYRANGE 3:7
  HBASE KEYRANGE 7:
  HBASE KEYRANGE :3
NODE 0:
Stacktrace
java.lang.AssertionError: section SCANRANGELOCATIONS of query:
select * from functional_hbase.stringids
where id > '5'
and tinyint_col = 5
Actual does not match expected result:
  HBASE KEYRANGE 5\0:

NODE 0:

Expected:
  HBASE KEYRANGE 5\0:7
  HBASE KEYRANGE 7:
NODE 0:
section SCANRANGELOCATIONS of query:
select * from functional_hbase.stringids
where id >= '5'
and tinyint_col = 5
Actual does not match expected result:
  HBASE KEYRANGE 5:
^^
NODE 0:

Expected:
  HBASE KEYRANGE 5:7
  HBASE KEYRANGE 7:
NODE 0:
section SCANRANGELOCATIONS of query:
select * from functional_hbase.stringids
where id > '4' and id <= '5'
and tinyint_col = 5
Actual does not match expected result:
  HBASE KEYRANGE 4\0:5
^^
  HBASE KEYRANGE 5:5\0
NODE 0:

Expected:
  HBASE KEYRANGE 4\0:5\0
NODE 0:
section SCANRANGELOCATIONS of query:
select * from functional_hbase.stringids
where id >= '4' and id <= '5'
and tinyint_col = 5
Actual does not match expected result:
  HBASE KEYRANGE 4:5

  HBASE KEYRANGE 5:5\0
NODE 0:

Expected:
  HBASE KEYRANGE 4:5\0
NODE 0:
section SCANRANGELOCATIONS of query:
select * from functional_hbase.stringids
where string_col = '4' and tinyint_col = 5 and id >= '4' and id <= '5'
Actual does not match expected result:
  HBASE KEYRANGE 4:5

  HBASE KEYRANGE 5:5\0
NODE 0:

Expected:
  HBASE KEYRANGE 4:5\0
NODE 0:
section SCANRANGELOCATIONS of query:
select * from functional_hbase.stringids
where string_col = '4' and tinyint_col = 5
  and id >= concat('', '4') and id <= concat('5', '')
Actual does not match expected result:
  HBASE KEYRANGE 4:5

  HBASE KEYRANGE 5:5\0
NODE 0:

Expected:
  HBASE KEYRANGE 4:5\0
NODE 0:
section SCANRANGELOCATIONS of query:
select * from functional_hbase.alltypesagg
where bigint_col is not null and bool_col = true
Actual does not match expected result:
  HBASE KEYRANGE 3:5

  HBASE KEYRANGE 5:
  HBASE KEYRANGE :3
NODE 0:

Expected:
  HBASE KEYRANGE 3:7
  HBASE KEYRANGE 7:
  HBASE KEYRANGE :3
NODE 0:

at org.junit.Assert.fail(Assert.java:88)
at 
org.apache.impala.planner.PlannerTestBase.runPlannerTestFile(PlannerTestBase.java:857)
at 
org.apache.impala.planner.PlannerTestBase.runPlannerTestFile(PlannerTestBase.java:862)
at org.apache.impala.planner.PlannerTest.testHbase(PlannerTest.java:126)
{noformat}



[jira] [Created] (IMPALA-7798) session-expiry-test passed with a minidump in an ASAN build

2018-11-01 Thread David Knupp (JIRA)
David Knupp created IMPALA-7798:
---

 Summary: session-expiry-test passed with a minidump in an ASAN 
build
 Key: IMPALA-7798
 URL: https://issues.apache.org/jira/browse/IMPALA-7798
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 3.1.0
Reporter: David Knupp


Noticed this from a recent ASF master ASAN test.
*Standard Error*
{noformat}
Operating system: Linux
  0.0.0 Linux 3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 20 
20:32:50 UTC 2017 x86_64
CPU: amd64
 family 6 model 79 stepping 1
 16 CPUs

GPU: UNKNOWN

Crash reason:  SIGABRT
Crash address: 0x7d10001d333
Process uptime: not available

Thread 54 (crashed)
 0  libc-2.17.so + 0x351f7
rax = 0x   rdx = 0x0006
rcx = 0x   rbx = 0x60c000236818
rsi = 0x0001d36d   rdi = 0x0001d333
rbp = 0x0841aff0   rsp = 0x7f5dba815dd8
 r8 = 0x7f5dba818570r9 = 0x7f5dc8a44000
r10 = 0x0008   r11 = 0x0206
r12 = 0x60c000236740   r13 = 0x6110001801c0
r14 = 0x7f5dba815f80   r15 = 0x0febb7502bf0
rip = 0x7f65ef5b31f7
Found by: given as instruction pointer in context
 1  libc-2.17.so + 0x368e8
rsp = 0x7f5dba815de0   rip = 0x7f65ef5b48e8
Found by: stack scanning
 2  session-expiry-test!__interceptor___tls_get_addr 
[sanitizer_common_interceptors.inc : 4723 + 0xb]
rsp = 0x7f5dba815ed8   rip = 0x0165aa2b
Found by: stack scanning
 3  0x60c000236818
rbx = 0x60c000236818   rbp = 0x0841aff0
rsp = 0x7f5dba815ef8   rip = 0x60c000236818
Found by: call frame info
 4  libstdc++.so.6.0.20 + 0x5fd1d
rsp = 0x7f5dba815f10   rip = 0x7f65f00d5d1d
Found by: stack scanning
 5  session-expiry-test + 0x136f0d0
rsp = 0x7f5dba815f30   rip = 0x0176f0d0
Found by: stack scanning
 6  libstdc++.so.6.0.20 + 0x5dd86
rsp = 0x7f5dba815f40   rip = 0x7f65f00d3d86
Found by: stack scanning
 7  libstdc++.so.6.0.20 + 0x5ddd1
rsp = 0x7f5dba815f50   rip = 0x7f65f00d3dd1
Found by: stack scanning
 8  libstdc++.so.6.0.20 + 0x5dfe8
rsp = 0x7f5dba815f60   rip = 0x7f65f00d3fe8
Found by: stack scanning
 9  session-expiry-test!void 
boost::throw_exception(boost::lock_error const&) 
[throw_exception.hpp : 69 + 0x22]
rsp = 0x7f5dba815f80   rip = 0x0176ef8a
Found by: stack scanning
10  session-expiry-test!_fini + 0x8470
rsp = 0x7f5dba815f90   rip = 0x04ce7670
Found by: stack scanning
11  session-expiry-test + 0x136ee70
rsp = 0x7f5dba815f98   rip = 0x0176ee70
Found by: stack scanning
12  session-expiry-test!_fini + 0x3d2b60
rsp = 0x7f5dba816018   rip = 0x050b1d60
Found by: stack scanning
13  session-expiry-test!__asan_handle_no_return [asan_rtl.cc : 670 + 0xa]
rsp = 0x7f5dba816060   rip = 0x0171fd38
Found by: stack scanning
14  session-expiry-test!boost::mutex::lock() [mutex.hpp : 119 + 0xd]
rbx = 0x7f5dba8160a0   rbp = 0x7f5dba8160c0
rsp = 0x7f5dba8160a0   r12 = 0x7f5dba816190
rip = 0x0176fb73
Found by: call frame info
15  0x60700019a868
rbx = 0x0176fb73   rbp = 0x0759e368
rsp = 0x7f5dba8160d0   r12 = 0x41b58ab3
r13 = 0x04ce76dd   r14 = 0x0176fa90
r15 = 0x01d11846   rip = 0x60700019a868
Found by: call frame info
16  session-expiry-test!_fini + 0x57c5b
rsp = 0x7f5dba8160f0   rip = 0x04d36e5b
Found by: stack scanning
17  session-expiry-test + 0x179c460
rsp = 0x7f5dba8160f8   rip = 0x01b9c460
Found by: stack scanning
18  session-expiry-test!boost::unique_lock::~unique_lock() 
[lock_types.hpp : 329 + 0x5]
rsp = 0x7f5dba816160   rip = 0x0176dece
Found by: stack scanning
19  session-expiry-test!impala::StatsMetric::Update(double const&) 
[collection-metrics.h : 150 + 0x8]
rsp = 0x7f5dba8161a0   rip = 0x01d6d1a2
Found by: stack scanning
20  session-expiry-test!_fini + 0x78173
rsp = 0x7f5dba8161b0   rip = 0x04d57373
Found by: stack scanning
21  session-expiry-test + 0x196d100
rsp = 0x7f5dba8161b8   rip = 0x01d6d100
Found by: stack scanning
22  session-expiry-test!std::vector >::end() [stl_vector.h : 566 + 0xb]
rsp = 0x7f5dba8161c0   rip = 0x01d53c5e
Found by: stack scanning
23  session-expiry-test + 0x1953bc0
rsp = 0x7f5dba8161d8   rip = 0x01d53bc0
Found by: stack scanning
24  
session-expiry-test!impala::Statestore::SendTopicUpdate(impala::Statestore::Subscriber*,
 impala::Statestore::UpdateKind, bool*) [statestore.cc : 753 + 0x9]
{noformat}

Note however:
{noformat}
20:02:50   Start 49: session-expiry-test
20:0

[jira] [Resolved] (IMPALA-7796) TestAutomaticCatalogInvalidation custom cluster suite failing for both local and V1

2018-10-31 Thread David Knupp (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-7796.
-
Resolution: Duplicate

> TestAutomaticCatalogInvalidation custom cluster suite failing for both local 
> and V1
> ---
>
> Key: IMPALA-7796
> URL: https://issues.apache.org/jira/browse/IMPALA-7796
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: David Knupp
>Assignee: Vuk Ercegovac
>Priority: Critical
>
> Both variants exceed max wait time.
> v1 catalog:
> {noformat}
> custom_cluster/test_automatic_invalidation.py:69: in test_v1_catalog
> self._run_test(cursor)
> custom_cluster/test_automatic_invalidation.py:64: in _run_test
> assert time.time() < max_wait_time
> E   assert 1541000646.910642 < 1541000646.673253
> E+  where 1541000646.910642 = ()
> E+where  = time.time
> {noformat}
> Local catalog
> {noformat}
> custom_cluster/test_automatic_invalidation.py:76: in test_local_catalog
> self._run_test(cursor)
> custom_cluster/test_automatic_invalidation.py:64: in _run_test
> assert time.time() < max_wait_time
> E   assert 1541000679.388713 < 1541000679.148656
> E+  where 1541000679.388713 = ()
> E+where  = time.time
> {noformat}
> Additionally, the v1 catalog test seemed to experience some connectivity 
> issues:
> {noformat}
> -- 2018-10-31 08:44:18,118 INFO MainThread: num_known_live_backends has 
> reached value: 3
> -- connecting to: localhost:21000
> -- connecting to localhost:21050 with impyla
> Conn 
> -- 2018-10-31 08:44:18,214 INFO MainThread: Closing active operation
> -- 2018-10-31 08:44:18,215 ERRORMainThread: Failed to open transport 
> (tries_left=3)
> Traceback (most recent call last):
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 940, in _execute
> return func(request)
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py",
>  line 175, in OpenSession
> return self.recv_OpenSession()
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py",
>  line 186, in recv_OpenSession
> (fname, mtype, rseqid) = self._iprot.readMessageBegin()
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py",
>  line 126, in readMessageBegin
> sz = self.readI32()
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py",
>  line 206, in readI32
> buff = self.trans.readAll(4)
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py",
>  line 58, in readAll
> chunk = self.read(sz - have)
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py",
>  line 159, in read
> self.__rbuf = StringIO(self.__trans.read(max(sz, self.__rbuf_size)))
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSocket.py",
>  line 120, in read
> message='TSocket read 0 bytes')
> TTransportException: TSocket read 0 bytes
> -- 2018-10-31 08:44:19,133 INFO MainThread: Starting new HTTP connection 
> (1): impala-ec2-centos74-m5-4xlarge-ondemand-02bb.vpc.cloudera.com
> -- 2018-10-31 08:44:20,150 INFO MainThread: Starting new HTTP connection 
> (1): impala-ec2-centos74-m5-4xlarge-ondemand-02bb.vpc.cloudera.com
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-7796) TestAutomaticCatalogInvalidation custom cluster suite failing for both local and V1

2018-10-31 Thread David Knupp (JIRA)
David Knupp created IMPALA-7796:
---

 Summary: TestAutomaticCatalogInvalidation custom cluster suite 
failing for both local and V1
 Key: IMPALA-7796
 URL: https://issues.apache.org/jira/browse/IMPALA-7796
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.1.0
Reporter: David Knupp
Assignee: Vuk Ercegovac


Both variants exceed max wait time.

v1 catalog:
{noformat}
custom_cluster/test_automatic_invalidation.py:69: in test_v1_catalog
self._run_test(cursor)
custom_cluster/test_automatic_invalidation.py:64: in _run_test
assert time.time() < max_wait_time
E   assert 1541000646.910642 < 1541000646.673253
E+  where 1541000646.910642 = ()
E+where  = time.time
{noformat}

Local catalog
{noformat}
custom_cluster/test_automatic_invalidation.py:76: in test_local_catalog
self._run_test(cursor)
custom_cluster/test_automatic_invalidation.py:64: in _run_test
assert time.time() < max_wait_time
E   assert 1541000679.388713 < 1541000679.148656
E+  where 1541000679.388713 = ()
E+where  = time.time
{noformat}
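
Both failures mean the expected invalidation was still not observed when the deadline derived from {{max_wait_time}} passed. A minimal sketch of that deadline/polling pattern (hypothetical, not the actual _run_test code):
{noformat}
# Hypothetical sketch of a poll-until-deadline helper; the real test computes
# max_wait_time from the catalog invalidation settings.
import time


def wait_until(predicate, max_wait_s, interval_s=0.2):
    deadline = time.time() + max_wait_s
    while not predicate():
        # This is the shape of the failing check: once time.time() passes the
        # deadline, the assert fires and the test fails.
        assert time.time() < deadline, 'condition not met within %ss' % max_wait_s
        time.sleep(interval_s)


state = {'invalidated': False}
try:
    wait_until(lambda: state['invalidated'], max_wait_s=1)
except AssertionError as e:
    print(e)
{noformat}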

Additionally, the v1 catalog test seemed to experience some connectivity issues:
{noformat}
-- 2018-10-31 08:44:18,118 INFO MainThread: num_known_live_backends has 
reached value: 3
-- connecting to: localhost:21000
-- connecting to localhost:21050 with impyla
Conn 
-- 2018-10-31 08:44:18,214 INFO MainThread: Closing active operation
-- 2018-10-31 08:44:18,215 ERRORMainThread: Failed to open transport 
(tries_left=3)
Traceback (most recent call last):
  File 
"/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py",
 line 940, in _execute
return func(request)
  File 
"/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py",
 line 175, in OpenSession
return self.recv_OpenSession()
  File 
"/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py",
 line 186, in recv_OpenSession
(fname, mtype, rseqid) = self._iprot.readMessageBegin()
  File 
"/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py",
 line 126, in readMessageBegin
sz = self.readI32()
  File 
"/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py",
 line 206, in readI32
buff = self.trans.readAll(4)
  File 
"/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py",
 line 58, in readAll
chunk = self.read(sz - have)
  File 
"/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py",
 line 159, in read
self.__rbuf = StringIO(self.__trans.read(max(sz, self.__rbuf_size)))
  File 
"/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSocket.py",
 line 120, in read
message='TSocket read 0 bytes')
TTransportException: TSocket read 0 bytes
-- 2018-10-31 08:44:19,133 INFO MainThread: Starting new HTTP connection 
(1): impala-ec2-centos74-m5-4xlarge-ondemand-02bb.vpc.cloudera.com
-- 2018-10-31 08:44:20,150 INFO MainThread: Starting new HTTP connection 
(1): impala-ec2-centos74-m5-4xlarge-ondemand-02bb.vpc.cloudera.com
{noformat}





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IMPALA-7783) test_default_timezone failing on real cluster

2018-10-30 Thread David Knupp (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-7783.
-
Resolution: Fixed

> test_default_timezone failing on real cluster
> -
>
> Key: IMPALA-7783
> URL: https://issues.apache.org/jira/browse/IMPALA-7783
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: David Knupp
>Priority: Major
>
> shell/test_shell_commandline.py/test_default_timezone is failing due to 
> issues asserting that the zoneinfo file for tzname exists.
> {noformat}
> shell/test_shell_commandline.py:715: in test_default_timezone
> assert os.path.isfile("/usr/share/zoneinfo/" + tzname)
> E   assert (('/usr/share/zoneinfo/' + 
> 'SystemV/PST8PDT'))
> E+  where  =  '/data0/jenkins/workspace/Quasar-Executor/testing/inf...Impala-cdh-cluster-test-runner/infra/python/env/lib64/python2.7/posixpath.pyc'>.isfile
> E+where  '/data0/jenkins/workspace/Quasar-Executor/testing/inf...Impala-cdh-cluster-test-runner/infra/python/env/lib64/python2.7/posixpath.pyc'>
>  = os.path {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-7783) test_default_timezone failing on real cluster

2018-10-30 Thread David Knupp (JIRA)
David Knupp created IMPALA-7783:
---

 Summary: test_default_timezone failing on real cluster
 Key: IMPALA-7783
 URL: https://issues.apache.org/jira/browse/IMPALA-7783
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.1.0
Reporter: David Knupp


shell/test_shell_commandline.py/test_default_timezone is failing due to issues 
asserting that the zoneinfo file for tzname exists.
{noformat}
shell/test_shell_commandline.py:715: in test_default_timezone
assert os.path.isfile("/usr/share/zoneinfo/" + tzname)
E   assert (('/usr/share/zoneinfo/' + 
'SystemV/PST8PDT'))
E+  where  = .isfile
E+where 
 = os.path {noformat}
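
The failing assert reduces to checking that the detected timezone name maps to a file under /usr/share/zoneinfo, which breaks for names like SystemV/PST8PDT on some hosts. A minimal sketch of a more tolerant check (hypothetical; the candidate roots are an assumption, not the shell test's actual logic):
{noformat}
# Hypothetical, more tolerant variant of the failing check. The list of
# candidate roots is an assumption for illustration; real hosts may lay out
# tzdata differently (e.g. SystemV/* names on older RHEL images).
import os


def zoneinfo_file_exists(tzname,
                         roots=('/usr/share/zoneinfo',
                                '/usr/share/zoneinfo/posix')):
    return any(os.path.isfile(os.path.join(root, tzname)) for root in roots)


print(zoneinfo_file_exists('America/Los_Angeles'))
print(zoneinfo_file_exists('SystemV/PST8PDT'))  # False on hosts without SystemV zones
{noformat}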



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IMPALA-7758) chars_formats dependent tables are created using the wrong LOCATION

2018-10-29 Thread David Knupp (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-7758.
-
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> chars_formats dependent tables are created using the wrong LOCATION
> ---
>
> Key: IMPALA-7758
> URL: https://issues.apache.org/jira/browse/IMPALA-7758
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> In testdata/bin/load-dependent-tables.sql, the LOCATION clause when creating 
> the various chars_formats tables (e.g. text) uses:
> {noformat}
> LOCATION '${hiveconf:hive.metastore.warehouse.dir}/chars_formats_text'
> {noformat}
> ...which resolves to {{/user/hive/warehouse/chars_formats_text}}
> However, the actual test warehouse root dir is {{/test-warehouse}}, not 
> {{/user/hive/warehouse}}.
> {noformat}
> $ hdfs dfs -cat /test-warehouse/chars_formats_text/chars-formats.txt
> abcde,88db79c70974e02deb3f01cfdcc5daae2078f21517d1021994f12685c0144addae3ce0dbd6a540b55b88af68486251fa6f0c8f9f94b3b1b4bc64c69714e281f388db79c70974,variable
>  length
> abc 
> ,8d3fffddf79e9a232ffd19f9ccaa4d6b37a6a243dbe0f23137b108a043d9da13121a9b505c804956b22e93c7f93969f4a7ba8ddea45bf4aab0bebc8f814e09918d3fffddf79e,abc
> abcdef,68f8c4575da360c32abb46689e58193a0eeaa905ae6f4a5e6c702a6ae1db35a6f86f8222b7a5489d96eb0466c755b677a64160d074617096a8c6279038bc720468f8c4575da3,b2fe9d4638503a57f93396098f24103a20588631727d0f0b5016715a3f6f2616628f09b1f63b23e484396edf949d9a1c307dbe11f23b971afd75b0f639d8a3f1
> {noformat}
> versus...
> {noformat}
> $ hdfs dfs -cat /user/hive/warehouse/chars_formats_text/chars-formats.txt
> cat: `/user/hive/warehouse/chars_formats_text/chars-formats.txt': No such 
> file or directory
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IMPALA-1780) Catch exceptions thrown by UDFs

2018-10-26 Thread David Knupp (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-1780.
-
Resolution: Won't Fix

> Catch exceptions thrown by UDFs
> ---
>
> Key: IMPALA-1780
> URL: https://issues.apache.org/jira/browse/IMPALA-1780
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.1.1, Impala 2.3.0
>Reporter: Henry Robinson
>Priority: Major
>  Labels: crash, downgraded
>
> Catch exceptions thrown by UDFs so Impala doesn't crash.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IMPALA-3959) data loading jenkins jobs don't save test logs

2018-10-26 Thread David Knupp (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-3959.
-
Resolution: Invalid

> data loading jenkins jobs don't save test logs
> --
>
> Key: IMPALA-3959
> URL: https://issues.apache.org/jira/browse/IMPALA-3959
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.7.0
>Reporter: Michael Brown
>Priority: Critical
>
> Even though data loading jobs run BE, FE, and core EE tests, the logs for 
> these tests are not saved. (The only artifacts saved are those used by 
> snapshot consumers: the snapshot, metastore snapshot, and git hash). This is 
> a problem when there are flaky tests that fail there that we haven't seen 
> fail elsewhere: we have no forensic evidence to search through for clues.
> Example:
> http://sandbox.jenkins.cloudera.com/job/impala-asf-master-core-data-load/29/
> {noformat}
> 22:55:09 99% tests passed, 1 tests failed out of 78
> 22:55:09 
> 22:55:09 The following tests FAILED:
> 22:55:09   13 - kudu-scan-node-test (OTHER_FAULT)
> 22:55:09 Errors while running CTest
> 22:55:09 make: *** [test] Error 8
> {noformat}
> This kudu scan node test failed, but we have no other info on it, because we 
> have no artifacts.
> Part of the problem is that the data load job has a separate entry point, so 
> everything built up in {{Impala-aux/jenkins/build.sh}} to handle archiving 
> doesn't exist for {{Impala-aux/jenkins/build-data-load.sh}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (IMPALA-6814) query_test.test_queries.TestQueriesTextTables.test_strict_mode failing on remote clusters

2018-10-26 Thread David Knupp (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp closed IMPALA-6814.
---
Resolution: Cannot Reproduce

Apparently not. The actual expected line was {{row_regex: .*Error parsing 
row: file: $NAMENODE/.* before offset: \d+}}, and $NAMENODE was resolving to 
"localhost." Some other change must have fixed that, because it now resolves 
properly.
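
For reference, the row_regex verification amounts to substituting the resolved namenode into the pattern and regex-matching the actual error line; a minimal sketch (the hostname below is a hypothetical stand-in, and this is not the actual verifier code):
{noformat}
# Sketch of the row_regex idea: substitute the resolved $NAMENODE value and
# regex-match the actual error line. Hostname is a hypothetical stand-in.
import re

namenode = 'hdfs://nn-host.example.com:8020'
pattern = r'.*Error parsing row: file: %s/.* before offset: \d+' % re.escape(namenode)
actual = ('Error parsing row: file: '
          'hdfs://nn-host.example.com:8020/test-warehouse/overflow/overflow.txt, '
          'before offset: 454')

print(bool(re.match(pattern, actual)))  # True once $NAMENODE resolves correctly
{noformat}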

> query_test.test_queries.TestQueriesTextTables.test_strict_mode failing on 
> remote clusters
> -
>
> Key: IMPALA-6814
> URL: https://issues.apache.org/jira/browse/IMPALA-6814
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.12.0
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Critical
>
> It looks like {{localhost}} is hardcoded in the test verification.
>  
> *Stacktrace*
> query_test/test_queries.py:161: in test_strict_mode
>  self.run_test_case('QueryTest/strict-mode', vector)
>  common/impala_test_suite.py:427: in run_test_case
>  self.__verify_results_and_errors(vector, test_section, result, use_db)
>  common/impala_test_suite.py:300: in __verify_results_and_errors
>  replace_filenames_with_placeholder)
>  common/test_result_verifier.py:317: in verify_raw_results
>  verify_errors(expected_errors, actual_errors)
>  common/test_result_verifier.py:274: in verify_errors
>  VERIFIER_MAP['VERIFY_IS_EQUAL'](expected, actual)
>  common/test_result_verifier.py:231: in verify_query_result_is_equal
>  assert expected_results == actual_results
>  E assert Comparing QueryTestResults (expected vs actual):
>  [...]
>  E row_regex: .*Error parsing row: file: 
> hdfs://{color:#ff}*localhost*{color}:20500/.* before offset: \d+ != 
> 'Error parsing row: file: 
> hdfs://**:8020/test-warehouse/overflow/overflow.txt, before offset: 
> 454'
>  E row_regex: .*Error parsing row: file: 
> hdfs://{color:#ff}*localhost*{color}:20500/.* before offset: \d+ != 
> 'Error parsing row: file: 
> hdfs://**:8020/test-warehouse/overflow/overflow.txt, before offset: 
> 454'



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-7758) chars_formats dependent tables are created using the wrong LOCATION

2018-10-25 Thread David Knupp (JIRA)
David Knupp created IMPALA-7758:
---

 Summary: chars_formats dependent tables are created using the 
wrong LOCATION
 Key: IMPALA-7758
 URL: https://issues.apache.org/jira/browse/IMPALA-7758
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.1.0
Reporter: David Knupp


In testdata/bin/load-dependent-tables.sql, the LOCATION clause when creating 
the various chars_formats tables (e.g. text) uses:
{noformat}
LOCATION '${hiveconf:hive.metastore.warehouse.dir}/chars_formats_text'
{noformat}

...which resolves to {{/user/hive/warehouse/chars_formats_text}}

However, the actual test warehouse root dir is {{/test-warehouse}}, not 
{{/user/hive/warehouse}}.

{noformat}
$ hdfs dfs -cat /test-warehouse/chars_formats_text/chars-formats.txt
abcde,88db79c70974e02deb3f01cfdcc5daae2078f21517d1021994f12685c0144addae3ce0dbd6a540b55b88af68486251fa6f0c8f9f94b3b1b4bc64c69714e281f388db79c70974,variable
 length
abc 
,8d3fffddf79e9a232ffd19f9ccaa4d6b37a6a243dbe0f23137b108a043d9da13121a9b505c804956b22e93c7f93969f4a7ba8ddea45bf4aab0bebc8f814e09918d3fffddf79e,abc
abcdef,68f8c4575da360c32abb46689e58193a0eeaa905ae6f4a5e6c702a6ae1db35a6f86f8222b7a5489d96eb0466c755b677a64160d074617096a8c6279038bc720468f8c4575da3,b2fe9d4638503a57f93396098f24103a20588631727d0f0b5016715a3f6f2616628f09b1f63b23e484396edf949d9a1c307dbe11f23b971afd75b0f639d8a3f1
{noformat}

versus...
{noformat}
$ hdfs dfs -cat /user/hive/warehouse/chars_formats_text/chars-formats.txt
cat: `/user/hive/warehouse/chars_formats_text/chars-formats.txt': No such file 
or directory
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (IMPALA-7584) test_set fails when run against external cluster

2018-09-21 Thread David Knupp (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp closed IMPALA-7584.
---
Resolution: Fixed

Change submitted.
https://gerrit.cloudera.org/c/11476/

> test_set fails when run against external cluster
> 
>
> Key: IMPALA-7584
> URL: https://issues.apache.org/jira/browse/IMPALA-7584
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Tim Armstrong
>Assignee: David Knupp
>Priority: Major
>
> Similar to IMPALA-6810, test_set fails:
> {noformat}
> E   AssertionError: Unexpected exception string. Expected: Rejected query 
> from pool default-pool: minimum memory reservation
> E   Not found in actual: ImpalaBeeswaxException: Query aborted:Rejected query 
> from pool root.jenkins: minimum memory reservation is greater than memory 
> available to the query for buffer reservations.
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IMPALA-7399) Add a script for generating junit XML type output for arbitrary build steps

2018-08-23 Thread David Knupp (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-7399.
-
Resolution: Fixed

> Add a script for generating junit XML type output for arbitrary build steps
> ---
>
> Key: IMPALA-7399
> URL: https://issues.apache.org/jira/browse/IMPALA-7399
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Infrastructure
>Affects Versions: Not Applicable
>Reporter: David Knupp
>Priority: Major
>
> Junit XML has become a de facto standard for outputting automated test 
> results. Jenkins consumes junit XML output to generate final test reports. 
> This makes triaging failed builds much easier.
> While Impala's test frameworks already produce junit XML, it would be nice to 
> produce junit XML for earlier build steps that aren't formally tests, 
> e.g., compilation and data loading. This will make it easier to diagnose 
> these failures on jobs that run on jenkins.impala.io (as opposed to requiring 
> users to read through raw console output.)
> This script is also being used as a starting point for setting up a formal 
> internal python library for Impala development that can be installed via 
> {{pip install -e }}. We expect other packages to 
> be added to this library over time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-7399) Add a script for generating junit XML type output for arbitrary build steps

2018-08-06 Thread David Knupp (JIRA)
David Knupp created IMPALA-7399:
---

 Summary: Add a script for generating junit XML type output for 
arbitrary build steps
 Key: IMPALA-7399
 URL: https://issues.apache.org/jira/browse/IMPALA-7399
 Project: IMPALA
  Issue Type: New Feature
  Components: Infrastructure
Affects Versions: Not Applicable
Reporter: David Knupp


Junit XML has become a de facto standard for outputting automated test results. 
Jenkins consumes junit XML output to generate final test reports. This makes 
triaging failed builds much easier.

While Impala's test frameworks already produce junit XML, it would be nice to 
produce junit XML for earlier build steps that aren't formally tests, 
e.g., compilation and data loading. This will make it easier to diagnose these 
failures on jobs that run on jenkins.impala.io (as opposed to requiring users 
to read through raw console output.)
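
As a rough illustration, a build step's outcome can be wrapped in a one-testcase junit XML report that Jenkins will pick up; a minimal sketch (hypothetical helper, not the actual generate_junitxml script):
{noformat}
# Minimal sketch of emitting junit XML for an arbitrary build step; file name
# and element layout are illustrative, not the actual generate_junitxml output.
import xml.etree.ElementTree as ET


def write_junitxml(step_name, elapsed_s, error_text=None,
                   outfile='junitxml_result.xml'):
    suite = ET.Element('testsuite', name=step_name, tests='1',
                       failures='1' if error_text else '0', time=str(elapsed_s))
    case = ET.SubElement(suite, 'testcase', classname='build', name=step_name,
                         time=str(elapsed_s))
    if error_text:
        ET.SubElement(case, 'failure', message='step failed').text = error_text
    ET.ElementTree(suite).write(outfile, encoding='utf-8', xml_declaration=True)


# e.g. record a failed data loading step
write_junitxml('dataload', 812.4, error_text='load-data.py exited with 1')
{noformat}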

This script is also being used as a starting point for setting up a formal 
internal python library for Impala development that can be installed via {{pip 
install -e }}. We expect other packages to be added 
to this library over time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-7379) test_random_rpc_timeout failed on exhaustive build: Debug webpage did not become available in expected time

2018-07-31 Thread David Knupp (JIRA)
David Knupp created IMPALA-7379:
---

 Summary: test_random_rpc_timeout failed on exhaustive build: Debug 
webpage did not become available in expected time
 Key: IMPALA-7379
 URL: https://issues.apache.org/jira/browse/IMPALA-7379
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.1.0
Reporter: David Knupp
Assignee: Michael Ho


*Stacktrace*
{noformat}
custom_cluster/test_rpc_timeout.py:131: in test_random_rpc_timeout
self.execute_query_verify_metrics(self.TEST_QUERY, 10)
custom_cluster/test_rpc_timeout.py:51: in execute_query_verify_metrics
v.wait_for_metric("impala-server.num-fragments-in-flight", 0)
verifiers/metric_verifier.py:62: in wait_for_metric
self.impalad_service.wait_for_metric_value(metric_name, expected_value, 
timeout)
common/impala_service.py:132: in wait_for_metric_value
json.dumps(self.read_debug_webpage('memz?json')),
common/impala_service.py:63: in read_debug_webpage
return self.open_debug_webpage(page_name, timeout=timeout, 
interval=interval).read()
common/impala_service.py:60: in open_debug_webpage
assert 0, 'Debug webpage did not become available in expected time.'
E   AssertionError: Debug webpage did not become available in expected time.
{noformat}
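
The assertion means the impalad debug webpage stopped answering while the test was polling for the in-flight fragment count to drop to zero. The polling pattern looks roughly like the following sketch (hypothetical; the metric-fetching callable is assumed, and this is not the actual ImpaladService code):
{noformat}
# Hypothetical sketch of polling a debug-webpage metric until it reaches a
# value or the timeout expires; read_metric is a caller-supplied callable
# (e.g. one that fetches and parses the impalad metrics debug page).
import time


def wait_for_metric_value(read_metric, metric_name, expected_value,
                          timeout_s=60, interval_s=1):
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        try:
            if read_metric(metric_name) == expected_value:
                return
        except IOError:
            pass  # debug webpage not reachable yet; keep retrying
        time.sleep(interval_s)
    raise AssertionError('Metric %s did not reach %r within %ss'
                         % (metric_name, expected_value, timeout_s))


# A fake metric source that never drains in-flight fragments:
try:
    wait_for_metric_value(lambda name: 1,
                          'impala-server.num-fragments-in-flight', 0, timeout_s=2)
except AssertionError as e:
    print(e)
{noformat}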

*Standard Error*
{noformat}
10:27:58 MainThread: Starting State Store logging to 
/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/statestored.INFO
10:27:58 MainThread: Starting Catalog Service logging to 
/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
10:27:59 MainThread: Starting Impala Daemon logging to 
/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad.INFO
10:28:00 MainThread: Starting Impala Daemon logging to 
/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
10:28:01 MainThread: Starting Impala Daemon logging to 
/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
10:28:04 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
10:28:04 MainThread: Getting num_known_live_backends from 
impala-ec2-centos74-m5-4xlarge-ondemand-0458.vpc.cloudera.com:25000
10:28:04 MainThread: Waiting for num_known_live_backends=3. Current value: 1
10:28:05 MainThread: Getting num_known_live_backends from 
impala-ec2-centos74-m5-4xlarge-ondemand-0458.vpc.cloudera.com:25000
10:28:05 MainThread: Waiting for num_known_live_backends=3. Current value: 2
10:28:06 MainThread: Getting num_known_live_backends from 
impala-ec2-centos74-m5-4xlarge-ondemand-0458.vpc.cloudera.com:25000
10:28:06 MainThread: num_known_live_backends has reached value: 3
10:28:06 MainThread: Getting num_known_live_backends from 
impala-ec2-centos74-m5-4xlarge-ondemand-0458.vpc.cloudera.com:25001
10:28:06 MainThread: num_known_live_backends has reached value: 3
10:28:06 MainThread: Getting num_known_live_backends from 
impala-ec2-centos74-m5-4xlarge-ondemand-0458.vpc.cloudera.com:25002
10:28:06 MainThread: num_known_live_backends has reached value: 3
10:28:06 MainThread: Impala Cluster Running with 3 nodes (3 coordinators, 3 
executors).
MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
MainThread: Getting metric: statestore.live-backends from 
impala-ec2-centos74-m5-4xlarge-ondemand-0458.vpc.cloudera.com:25010
MainThread: Metric 'statestore.live-backends' has reached desired value: 4
MainThread: Getting num_known_live_backends from 
impala-ec2-centos74-m5-4xlarge-ondemand-0458.vpc.cloudera.com:25000
MainThread: num_known_live_backends has reached value: 3
MainThread: Getting num_known_live_backends from 
impala-ec2-centos74-m5-4xlarge-ondemand-0458.vpc.cloudera.com:25001
MainThread: num_known_live_backends has reached value: 3
MainThread: Getting num_known_live_backends from 
impala-ec2-centos74-m5-4xlarge-ondemand-0458.vpc.cloudera.com:25002
MainThread: num_known_live_backends has reached value: 3
-- connecting to: localhost:21000
-- executing against localhost:21000
select count(c2.string_col) from  functional.alltypestiny join 
functional.alltypessmall c2;

-- executing against localhost:21000
select count(c2.string_col) from  functional.alltypestiny join 
functional.alltypessmall c2;

-- executing against localhost:21000
select count(c2.string_col) from  functional.alltypestiny join 
functional.alltypessmall c2;

-- executing against localhost:21000
select count(c2.string_col) from  functional.alltypestiny join 
functional.alltypessmall c2;

-- executing against localhost:21000
select count(c2.string_col) from  functional.alltypestiny join 
functional.alltypessmall c2;

-- executing against localhost:21000
select count(c2.string_col) from  functional.alltypestiny join 
functiona

[jira] [Created] (IMPALA-7378) test_strict_mode failed on an ASAN build: expected "Error converting column: 5 to DOUBLE"

2018-07-31 Thread David Knupp (JIRA)
David Knupp created IMPALA-7378:
---

 Summary: test_strict_mode failed on an ASAN build: expected "Error 
converting column: 5 to DOUBLE"
 Key: IMPALA-7378
 URL: https://issues.apache.org/jira/browse/IMPALA-7378
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.1.0
Reporter: David Knupp
Assignee: Tim Armstrong


*Stacktrace*
{noformat}
query_test/test_queries.py:159: in test_strict_mode
self.run_test_case('QueryTest/strict-mode-abort', vector)
common/impala_test_suite.py:420: in run_test_case
assert False, "Expected exception: %s" % expected_str
E   AssertionError: Expected exception: Error converting column: 5 to DOUBLE
{noformat}

*Standard Error*
{noformat}
-- executing against localhost:21000
use functional;

SET strict_mode=1;
SET batch_size=0;
SET num_nodes=0;
SET disable_codegen_rows_threshold=0;
SET disable_codegen=False;
SET abort_on_error=0;
SET exec_single_node_rows_threshold=0;
-- executing against localhost:21000
select * from overflow;

-- executing against localhost:21000
use functional;

SET strict_mode=1;
SET batch_size=0;
SET num_nodes=0;
SET disable_codegen_rows_threshold=0;
SET disable_codegen=False;
SET abort_on_error=1;
SET exec_single_node_rows_threshold=0;
-- executing against localhost:21000
select tinyint_col from overflow;

-- executing against localhost:21000
select smallint_col from overflow;

-- executing against localhost:21000
select int_col from overflow;

-- executing against localhost:21000
select bigint_col from overflow;

-- executing against localhost:21000
select float_col from overflow;

-- executing against localhost:21000
select double_col from overflow;
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-7371) TestInsertQueries.test_insert fails on S3 with 0 rows returned

2018-07-31 Thread David Knupp (JIRA)
David Knupp created IMPALA-7371:
---

 Summary: TestInsertQueries.test_insert fails on S3 with 0 rows 
returned
 Key: IMPALA-7371
 URL: https://issues.apache.org/jira/browse/IMPALA-7371
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.1.0
Reporter: David Knupp


Stacktrace
{noformat}
query_test/test_insert.py:118: in test_insert
multiple_impalad=vector.get_value('exec_option')['sync_ddl'] == 1)
/data/jenkins/workspace/impala-cdh6.0.x-core-s3/repos/Impala/tests/common/impala_test_suite.py:426:
 in run_test_case
self.__verify_results_and_errors(vector, test_section, result, use_db)
/data/jenkins/workspace/impala-cdh6.0.x-core-s3/repos/Impala/tests/common/impala_test_suite.py:299:
 in __verify_results_and_errors
replace_filenames_with_placeholder)
/data/jenkins/workspace/impala-cdh6.0.x-core-s3/repos/Impala/tests/common/test_result_verifier.py:434:
 in verify_raw_results
VERIFIER_MAP[verifier](expected, actual)
/data/jenkins/workspace/impala-cdh6.0.x-core-s3/repos/Impala/tests/common/test_result_verifier.py:261:
 in verify_query_result_is_equal
assert expected_results == actual_results
E   assert Comparing QueryTestResults (expected vs actual):
E 75,false,0,0,0,0,0,0,'04/01/09','0' != None
E 76,true,1,1,1,10,1.10023841858,10.1,'04/01/09','1' != None
E 77,false,2,2,2,20,2.20047683716,20.2,'04/01/09','2' != None
E 78,true,3,3,3,30,3.29952316284,30.3,'04/01/09','3' != None
E 79,false,4,4,4,40,4.40095367432,40.4,'04/01/09','4' != None
E 80,true,5,5,5,50,5.5,50.5,'04/01/09','5' != None
E 81,false,6,6,6,60,6.59904632568,60.6,'04/01/09','6' != None
E 82,true,7,7,7,70,7.69809265137,70.7,'04/01/09','7' != None
E 83,false,8,8,8,80,8.80190734863,80.8,'04/01/09','8' != None
E 84,true,9,9,9,90,9.89618530273,90.91,'04/01/09','9' != 
None
E 85,false,0,0,0,0,0,0,'04/02/09','0' != None
E 86,true,1,1,1,10,1.10023841858,10.1,'04/02/09','1' != None
E 87,false,2,2,2,20,2.20047683716,20.2,'04/02/09','2' != None
E 88,true,3,3,3,30,3.29952316284,30.3,'04/02/09','3' != None
E 89,false,4,4,4,40,4.40095367432,40.4,'04/02/09','4' != None
E 90,true,5,5,5,50,5.5,50.5,'04/02/09','5' != None
E 91,false,6,6,6,60,6.59904632568,60.6,'04/02/09','6' != None
E 92,true,7,7,7,70,7.69809265137,70.7,'04/02/09','7' != None
E 93,false,8,8,8,80,8.80190734863,80.8,'04/02/09','8' != None
E 94,true,9,9,9,90,9.89618530273,90.91,'04/02/09','9' != 
None
E 95,false,0,0,0,0,0,0,'04/03/09','0' != None
E 96,true,1,1,1,10,1.10023841858,10.1,'04/03/09','1' != None
E 97,false,2,2,2,20,2.20047683716,20.2,'04/03/09','2' != None
E 98,true,3,3,3,30,3.29952316284,30.3,'04/03/09','3' != None
E 99,false,4,4,4,40,4.40095367432,40.4,'04/03/09','4' != None
E Number of rows returned (expected vs actual): 25 != 0
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-7155) Make it possible to designate a large number of arbitrary tests for targeted test runs

2018-06-08 Thread David Knupp (JIRA)
David Knupp created IMPALA-7155:
---

 Summary: Make it possible to designate a large number of arbitrary 
tests for targeted test runs
 Key: IMPALA-7155
 URL: https://issues.apache.org/jira/browse/IMPALA-7155
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Affects Versions: Impala 3.0
Reporter: David Knupp


It's already possible to specify an arbitrary list of test modules, test 
classes, and/or test functions as command line parameters when running the 
Impala mini-cluster tests. It's also possible to opt-out of running specific 
tests by applying any of a variety of skipif markers.

What we don't have is a comprehensive way for tests to be opted in to a 
targeted test run, other than by naming each one as a command line parameter. This 
becomes extremely unwieldy beyond a certain number of tests. In fact, we don't 
have a general concept of targeted test runs at all. The approach to date has 
been to always run as many tests as possible, except for those tests 
specifically marked for skipping. This is an OK way to make sure tests don't get 
overlooked, but it also results in many tests frequently being run in contexts 
in which they don't necessarily apply, e.g. against S3, or against actual 
deployed clusters, which can lead to false negatives.

There are different ways that we could group together a disparate array of 
tests into a targeted run. We could come up with a permanent series of new 
pytest markers/decorators for opting in to, as opposed to out of, a given 
test run. An initial pass would then need to be made to apply the new 
decorators as needed to all of the existing tests. One could then invoke 
something like "impala-pytest -m cluster_tests" as needed.
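
To make the marker option concrete, an opt-in scheme would look roughly like the sketch below (the marker name and conftest wiring are hypothetical):
{noformat}
# conftest.py -- register the hypothetical opt-in marker so pytest knows it.
def pytest_configure(config):
    config.addinivalue_line(
        "markers",
        "cluster_tests: tests that are meaningful against a deployed cluster")


# test_example.py -- opt an individual test in to the targeted run.
import pytest


@pytest.mark.cluster_tests
def test_runs_against_deployed_cluster():
    assert True

# Targeted run, selecting only opted-in tests:
#   impala-pytest -m cluster_tests
{noformat}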

Another approach might be to define test runs in special files (probably yaml). 
The file would include a list of which tests to run, possibly along with other 
test parameters, e.g. "run this list of tests, but only on parquet, and skip 
tests that require LZO compression."



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-7088) Parallel data load breaks load-data.py if loading data on a real cluster

2018-05-29 Thread David Knupp (JIRA)
David Knupp created IMPALA-7088:
---

 Summary: Parallel data load breaks load-data.py if loading data on 
a real cluster
 Key: IMPALA-7088
 URL: https://issues.apache.org/jira/browse/IMPALA-7088
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.0
Reporter: David Knupp


Impala/bin/load-data.py is most commonly used to load test data onto a 
simulated standalone cluster running on the local host. However, with the 
correct inputs, it can also be used to load data onto an actual remote cluster.

A recent enhancement in the load-data.py script to parallelize parts of the 
data loading process -- https://github.com/apache/impala/commit/d481cd48 -- has 
introduced a regression in the latter use case:

From *$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log*:
{noformat}
Created table functional_hbase.widetable_1000_cols
Took 0.7121 seconds
09:48:01 Beginning execution of hive SQL: 
/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql
Traceback (most recent call last):
  File 
"/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py",
 line 494, in <module>
if __name__ == "__main__": main()
  File 
"/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py",
 line 468, in main
hive_exec_query_files_parallel(thread_pool, hive_load_text_files)
  File 
"/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py",
 line 299, in hive_exec_query_files_parallel
exec_query_files_parallel(thread_pool, query_files, 'hive')
  File 
"/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py",
 line 290, in exec_query_files_parallel
for result in thread_pool.imap_unordered(execution_function, query_files):
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next
raise value
TypeError: coercing to Unicode: need string or buffer, NoneType found
{noformat}
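
For context on the TypeError: under Python 2, passing None where a path string 
is expected (e.g. to open()) raises exactly "coercing to Unicode: need string or 
buffer, NoneType found", and multiprocessing re-raises the worker's exception 
from imap_unordered in the parent, which matches the traceback above. A minimal, 
self-contained sketch of the mechanism and an explicit guard (names are 
illustrative, not the actual load-data.py code):
{noformat}
import tempfile
from multiprocessing.pool import ThreadPool

def run_sql_file(sql_file):
    # A None path is what produces "coercing to Unicode: need string or
    # buffer, NoneType found" from open() under Python 2; guard explicitly.
    if sql_file is None:
        return "skipped: no query file generated for this workload"
    with open(sql_file) as f:
        return "loaded %d bytes from %s" % (len(f.read()), sql_file)

# Throwaway .sql file so the example is self-contained.
tmp = tempfile.NamedTemporaryFile(suffix=".sql", delete=False)
tmp.write(b"CREATE TABLE t (i INT);")
tmp.close()

pool = ThreadPool(2)
for result in pool.imap_unordered(run_sql_file, [tmp.name, None]):
    print(result)
{noformat}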




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IMPALA-6317) Expose -cmake_only flag to buildall.sh

2018-05-29 Thread David Knupp (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-6317.
-
Resolution: Fixed

> Expose -cmake_only flag to buildall.sh
> --
>
> Key: IMPALA-6317
> URL: https://issues.apache.org/jira/browse/IMPALA-6317
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 2.11.0
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Minor
>
> Impala/bin/make_impala.sh has a {{-cmake_only}} command line option:
> {noformat}
> -cmake_only)
>   CMAKE_ONLY=1
> {noformat}
> Passing this flag means that only makefiles will be generated during the 
> build. However, this flag is not provided in buildall.sh (the caller of 
> make_impala.sh), which effectively renders it useless.
> It turns out that if one has no intention of running the Impala cluster 
> locally (e.g., when trying to build just enough of the toolchain and dev 
> environment to run the data load scripts for loading data onto a remote 
> cluster), then being able to only generate makefiles is a useful thing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IMPALA-6600) py.test error "Replacing crashed slave gw1" in test_spilling

2018-05-24 Thread David Knupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-6600.
-
Resolution: Duplicate

> py.test error "Replacing crashed slave gw1" in test_spilling
> 
>
> Key: IMPALA-6600
> URL: https://issues.apache.org/jira/browse/IMPALA-6600
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Lars Volker
>Assignee: David Knupp
>Priority: Major
>  Labels: broken-build, flaky
>
> I saw a build fail with "Replacing crashed slave gw1". Here's the failing 
> part of the log:
> {noformat}
> 12:18:33 [gw0] PASSED 
> query_test/test_tablesample.py::TestTableSample::test_tablesample[repeatable: 
> False | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
> 0} | table_format: parquet/none] 
> 12:18:33 [gw1] node down: Not properly terminated
> 12:18:33 [gw1] FAILED 
> query_test/test_spilling.py::TestSpillingDebugActionDimensions::test_spilling[exec_option:
>  {'debug_action': None, 'default_spillable_buffer_size': '256k'} | 
> table_format: parquet/none] 
> 12:18:33 Replacing crashed slave gw1
> 12:18:34 
> 12:18:34 unittests/test_file_parser.py::TestTestFileParser::test_valid_parse 
> {noformat}
> Here is the summary:
> {noformat}
> 12:44:14 === FAILURES 
> ===
> 12:44:14 _ query_test/test_spilling.py 
> __
> 12:44:14 [gw1] linux2 -- Python 2.6.6 
> /data/jenkins/workspace/impala-asf-master-core-local/repos/Impala/bin/../infra/python/env/bin/python
> 12:44:14 Slave 'gw1' crashed while running 
> "query_test/test_spilling.py::TestSpillingDebugActionDimensions::()::test_spilling[exec_option:
>  {'debug_action': None, 'default_spillable_buffer_size': '256k'} | 
> table_format: parquet/none]"
> 12:44:14 == 1 failed, 1494 passed, 404 skipped, 9 xfailed in 10374.68 
> seconds ===
> {noformat}
> [~dknupp] - I’m assigning this to you thinking you might have an idea what’s 
> going on here; feel free to find another person or assign back to me if 
> you're swamped.
> I’ve seen this happen in a private Jenkins run. Please ping me if you would 
> like access to the build artifacts.
> I've also seen a similar error message in IMPALA-5724 and in [this GRPC issue 
> on Github|https://github.com/grpc/grpc/issues/3577].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-7055) test_avro_writer failing on upstream Jenkins:

2018-05-21 Thread David Knupp (JIRA)
David Knupp created IMPALA-7055:
---

 Summary: test_avro_writer failing on upstream Jenkins: 
 Key: IMPALA-7055
 URL: https://issues.apache.org/jira/browse/IMPALA-7055
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.0
Reporter: David Knupp


This failure occurred while verifying https://gerrit.cloudera.org/c/10455/, but 
it is not related to that patch. The failing build is 
https://jenkins.impala.io/job/gerrit-verify-dryrun/2511/. 

Test appears to be (from 
[avro-writer.test|https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/avro-writer.test]):
{noformat}
---- QUERY
SET ALLOW_UNSUPPORTED_FORMATS=0;
insert into __avro_write select 1, "b", 2.2;
---- CATCH
Writing to table format AVRO is not supported. Use query option 
ALLOW_UNSUPPORTED_FORMATS
{noformat}

Error output:
{noformat}
01:50:18 ] FAIL 
query_test/test_compressed_formats.py::TestTableWriters::()::test_avro_writer[exec_option:
 {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 
'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
'exec_single_node_rows_threshold': 0} | table_format: text/none]
01:50:18 ] === FAILURES 
===
01:50:18 ]  TestTableWriters.test_avro_writer[exec_option: {'batch_size': 0, 
'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen': 
False, 'abort_on_error': 1, 'debug_action': None, 
'exec_single_node_rows_threshold': 0} | table_format: text/none] 
01:50:18 ] [gw9] linux2 -- Python 2.7.12 
/home/ubuntu/Impala/bin/../infra/python/env/bin/python
01:50:18 ] query_test/test_compressed_formats.py:189: in test_avro_writer
01:50:18 ] self.run_test_case('QueryTest/avro-writer', vector)
01:50:18 ] common/impala_test_suite.py:420: in run_test_case
01:50:18 ] assert False, "Expected exception: %s" % expected_str
01:50:18 ] E   AssertionError: Expected exception: Writing to table format AVRO 
is not supported. Use query option ALLOW_UNSUPPORTED_FORMATS
01:50:18 ]  Captured stderr setup 
-
01:50:18 ] -- connecting to: localhost:21000
01:50:18 ] - Captured stderr call 
-
01:50:18 ] -- executing against localhost:21000
01:50:18 ] use functional;
01:50:18 ] 
01:50:18 ] SET batch_size=0;
01:50:18 ] SET num_nodes=0;
01:50:18 ] SET disable_codegen_rows_threshold=5000;
01:50:18 ] SET disable_codegen=False;
01:50:18 ] SET abort_on_error=1;
01:50:18 ] SET exec_single_node_rows_threshold=0;
01:50:18 ] -- executing against localhost:21000
01:50:18 ] drop table if exists __avro_write;
01:50:18 ] 
01:50:18 ] -- executing against localhost:21000
01:50:18 ] SET COMPRESSION_CODEC=NONE;
01:50:18 ] 
01:50:18 ] -- executing against localhost:21000
01:50:18 ] 
01:50:18 ] create table __avro_write (i int, s string, d double)
01:50:18 ] stored as AVRO
01:50:18 ] TBLPROPERTIES ('avro.schema.literal'='{
01:50:18 ]   "name": "my_record",
01:50:18 ]   "type": "record",
01:50:18 ]   "fields": [
01:50:18 ]   {"name":"i", "type":["int", "null"]},
01:50:18 ]   {"name":"s", "type":["string", "null"]},
01:50:18 ]   {"name":"d", "type":["double", "null"]}]}');
01:50:18 ] 
01:50:18 ] -- executing against localhost:21000
01:50:18 ] SET COMPRESSION_CODEC="";
01:50:18 ] 
01:50:18 ] -- executing against localhost:21000
01:50:18 ] SET COMPRESSION_CODEC=NONE;
01:50:18 ] 
01:50:18 ] -- executing against localhost:21000
01:50:18 ] 
01:50:18 ] SET ALLOW_UNSUPPORTED_FORMATS=1;
01:50:18 ] 
01:50:18 ] -- executing against localhost:21000
01:50:18 ] 
01:50:18 ] insert into __avro_write select 0, "a", 1.1;
01:50:18 ] 
01:50:18 ] -- executing against localhost:21000
01:50:18 ] SET COMPRESSION_CODEC="";
01:50:18 ] 
01:50:18 ] -- executing against localhost:21000
01:50:18 ] SET ALLOW_UNSUPPORTED_FORMATS="0";
01:50:18 ] 
01:50:18 ] -- executing against localhost:21000
01:50:18 ] SET COMPRESSION_CODEC=SNAPPY;
01:50:18 ] 
01:50:18 ] -- executing against localhost:21000
01:50:18 ] 
01:50:18 ] SET ALLOW_UNSUPPORTED_FORMATS=1;
01:50:18 ] 
01:50:18 ] -- executing against localhost:21000
01:50:18 ] 
01:50:18 ] insert into __avro_write select 1, "b", 2.2;
01:50:18 ] 
01:50:18 ] -- executing against localhost:21000
01:50:18 ] SET COMPRESSION_CODEC="";
01:50:18 ] 
01:50:18 ] -- executing against localhost:21000
01:50:18 ] SET ALLOW_UNSUPPORTED_FORMATS="0";
01:50:18 ] 
01:50:18 ] -- executing against localhost:21000
01:50:18 ] select * from __avro_write;
01:50:18 ] 
01:50:18 ] -- executing against localhost:21000
01:50:18 ] SET ALLOW_UNSUPPORTED_FORMATS=0;
01:50:18 ] 
01:50:18 ] -- executing against localhost:21000
01:50:18 ] 
01:50:18 ] insert into __avro_write sel

[jira] [Resolved] (IMPALA-4464) Remove remote_data_load.py

2018-05-14 Thread David Knupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-4464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-4464.
-
Resolution: Fixed

> Remove remote_data_load.py
> --
>
> Key: IMPALA-4464
> URL: https://issues.apache.org/jira/browse/IMPALA-4464
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 2.8.0
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Major
>  Labels: remote_cluster_test
>
> A patch was recently submitted that allows data load and end-to-end tests to run 
> against a remote cluster. At its core was this file:
> https://github.com/apache/incubator-impala/blob/master/bin/remote_data_load.py
> However, while this script relies on several changes to existing build and 
> test scripts, nothing else in turn relies on it. In retrospect, it does not 
> make sense to have this script in the Impala repo if nothing can use it 
> externally.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6938) Build failing because failed to assign HBase regions during data load

2018-04-26 Thread David Knupp (JIRA)
David Knupp created IMPALA-6938:
---

 Summary: Build failing because failed to assign HBase regions 
during data load 
 Key: IMPALA-6938
 URL: https://issues.apache.org/jira/browse/IMPALA-6938
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.1.0
Reporter: David Knupp


00:53:41 Splitting HBase (logging to 
/data/jenkins/workspace/impala-cdh6.x-core/repos/Impala/logs/data_loading/create-hbase.log)...
 
00:55:29 FAILED (Took: 1 min 48 sec)
00:55:29 
'/data/jenkins/workspace/impala-cdh6.x-core/repos/Impala/testdata/bin/split-hbase.sh'
 failed. Tail of log:

00:55:29 18/04/25 00:55:28 INFO datagenerator.HBaseTestDataRegionAssigment: 
functional_hbase.alltypesagg,3,1524642858707.b0a6c361d408d230442311281afbefc8. 
3 -> localhost:16202, expecting localhost,16202,1524639822962
[...]
00:55:29 18/04/25 00:55:28 INFO datagenerator.HBaseTestDataRegionAssigment: 
functional_hbase.alltypesagg,7,1524642862231.a7e1c97240f425f98cddaa1e9070651d. 
7 -> localhost:16203, expecting localhost,16203,1524639824558
00:55:29 18/04/25 00:55:28 INFO datagenerator.HBaseTestDataRegionAssigment: 
functional_hbase.alltypesagg,9,1524642862231.4f6b1fc8c0104c6b7ef782dfa3d3d616. 
9 -> localhost:16203, expecting localhost,16203,1524639824558
00:55:29 Exception in thread "main" java.lang.IllegalStateException: Failed to 
assign regions to servers after 6 millis.
00:55:29at 
org.apache.impala.datagenerator.HBaseTestDataRegionAssigment.performAssigment(HBaseTestDataRegionAssigment.java:198)
00:55:29at 
org.apache.impala.datagenerator.HBaseTestDataRegionAssigment.main(HBaseTestDataRegionAssigment.java:330)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6935) test_analytic_fns failed during exhaustive build on RHEL7: AnalysisException: Couldn't evaluate LEAD/LAG offset

2018-04-25 Thread David Knupp (JIRA)
David Knupp created IMPALA-6935:
---

 Summary: test_analytic_fns failed during exhaustive build on 
RHEL7: AnalysisException: Couldn't evaluate LEAD/LAG offset
 Key: IMPALA-6935
 URL: https://issues.apache.org/jira/browse/IMPALA-6935
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 2.13.0
Reporter: David Knupp


Stacktrace
{noformat}
query_test/test_queries.py:53: in test_analytic_fns
self.run_test_case('QueryTest/analytic-fns', vector)
common/impala_test_suite.py:398: in run_test_case
result = self.__execute_query(target_impalad_client, query, user=user)
common/impala_test_suite.py:613: in __execute_query
return impalad_client.execute(query, user=user)
common/impala_connection.py:160: in execute
return self.__beeswax_client.execute(sql_stmt, user=user)
beeswax/impala_beeswax.py:173: in execute
handle = self.__execute_query(query_string.strip(), user=user)
beeswax/impala_beeswax.py:339: in __execute_query
handle = self.execute_query_async(query_string, user=user)
beeswax/impala_beeswax.py:335: in execute_query_async
return self.__do_rpc(lambda: self.imp_service.query(query,))
beeswax/impala_beeswax.py:460: in __do_rpc
raise ImpalaBeeswaxException(self.__build_error_message(b), b)
E   ImpalaBeeswaxException: ImpalaBeeswaxException:
EINNER EXCEPTION: 
EMESSAGE: IllegalStateException: Failed analysis after expr substitution.
E   CAUSED BY: AnalysisException: Couldn't evaluate LEAD/LAG offset: couldn't 
execute expr 87
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6933) test_kudu.TestCreateExternalTable on S3 failing with "AlreadyExistsException: Database already exists"

2018-04-25 Thread David Knupp (JIRA)
David Knupp created IMPALA-6933:
---

 Summary: test_kudu.TestCreateExternalTable on S3 failing with 
"AlreadyExistsException: Database already exists"
 Key: IMPALA-6933
 URL: https://issues.apache.org/jira/browse/IMPALA-6933
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 2.13.0
Reporter: David Knupp


Error Message
{noformat}
test setup failure
{noformat}

Stacktrace
{noformat}
conftest.py:347: in conn
with __unique_conn(db_name=db_name, timeout=timeout) as conn:
/usr/lib64/python2.6/contextlib.py:16: in __enter__
return self.gen.next()
conftest.py:380: in __unique_conn
cur.execute("CREATE DATABASE %s" % db_name)
../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:302: in 
execute
configuration=configuration)
../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:343: in 
execute_async
self._execute_async(op)
../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:362: in 
_execute_async
operation_fn()
../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:340: in op
async=True)
../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:1027: in 
execute
return self._operation('ExecuteStatement', req)
../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:957: in 
_operation
resp = self._rpc(kind, request)
../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:925: in 
_rpc
err_if_rpc_not_ok(response)
../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:704: in 
err_if_rpc_not_ok
raise HiveServer2Error(resp.status.errorMessage)
E   HiveServer2Error: ImpalaRuntimeException: Error making 'createDatabase' RPC 
to Hive Metastore: 
E   CAUSED BY: AlreadyExistsException: Database f0mraw already exists
{noformat}

Tests affected:
* query_test.test_kudu.TestCreateExternalTable.test_unsupported_binary_col
* query_test.test_kudu.TestCreateExternalTable.test_drop_external_table
* query_test.test_kudu.TestCreateExternalTable.test_explicit_name
* query_test.test_kudu.TestCreateExternalTable.test_explicit_name_preference
* query_test.test_kudu.TestCreateExternalTable.test_explicit_name_doesnt_exist
* 
query_test.test_kudu.TestCreateExternalTable.test_explicit_name_doesnt_exist_but_implicit_does
* query_test.test_kudu.TestCreateExternalTable.test_table_without_partitioning
* query_test.test_kudu.TestCreateExternalTable.test_column_name_case
* query_test.test_kudu.TestCreateExternalTable.test_conflicting_column_name



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6931) TestQueryExpiration.test_query_expiration fails on ASAN with unexpected number of expired queries

2018-04-25 Thread David Knupp (JIRA)
David Knupp created IMPALA-6931:
---

 Summary: TestQueryExpiration.test_query_expiration fails on ASAN 
with unexpected number of expired queries
 Key: IMPALA-6931
 URL: https://issues.apache.org/jira/browse/IMPALA-6931
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 2.12.0
Reporter: David Knupp
Assignee: Vuk Ercegovac


Stacktrace
{noformat}
custom_cluster/test_query_expiration.py:108: in test_query_expiration
client.QUERY_STATES['EXCEPTION'])
custom_cluster/test_query_expiration.py:184: in __expect_client_state
assert expected_state == actual_state
E   assert 5 == 4
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6928) test_bloom_filters failing on ASAN build: did not find "Runtime Filter Published" in profile

2018-04-25 Thread David Knupp (JIRA)
David Knupp created IMPALA-6928:
---

 Summary: test_bloom_filters failing on ASAN build: did not find 
"Runtime Filter Published" in profile
 Key: IMPALA-6928
 URL: https://issues.apache.org/jira/browse/IMPALA-6928
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 2.12.0
Reporter: David Knupp
Assignee: Thomas Tauber-Marshall



Stacktrace
{noformat}
query_test/test_runtime_filters.py:81: in test_bloom_filters
self.run_test_case('QueryTest/bloom_filters', vector)
common/impala_test_suite.py:444: in run_test_case
verify_runtime_profile(test_section['RUNTIME_PROFILE'], 
result.runtime_profile)
common/test_result_verifier.py:560: in verify_runtime_profile
actual))
E   AssertionError: Did not find matches for lines in runtime profile:
E   EXPECTED LINES:
E   row_regex: .*1 of 1 Runtime Filter Published.*
E   
E   ACTUAL PROFILE:
E   Query (id=a64a18654d28e0c3:e6220f6c):
E DEBUG MODE WARNING: Query profile created while running a DEBUG build of 
Impala. Use RELEASE builds to measure query performance.
E Summary:
E   Session ID: 244e6109f4226b2b:39160855c64ad4a1
E   Session Type: BEESWAX
E   Start Time: 2018-04-23 23:31:59.326883000
E   End Time: 
E   Query Type: QUERY
E   Query State: FINISHED
E   Query Status: OK
E   Impala Version: impalad version 2.12.0-cdh5.15.0 DEBUG (build 
3d60947b813429cd1db59f9a342498982d341de9)
E   User: jenkins
E   Connected User: jenkins
E   Delegated User: 
E   Network Address: 127.0.0.1:55776
E   Default Db: functional
E   Sql Statement: with l as (select * from tpch.lineitem UNION ALL select 
* from tpch.lineitem)
E   select STRAIGHT_JOIN count(*) from (select * from tpch.lineitem a LIMIT 1) a
E   join (select * from l LIMIT 200) b on a.l_orderkey = -b.l_orderkey
E   Coordinator: ec2-m2-4xlarge-centos-6-4-0f06.vpc.cloudera.com:22000
E   Query Options (set by configuration): 
ABORT_ON_ERROR=1,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,RUNTIME_FILTER_WAIT_TIME_MS=3,RUNTIME_FILTER_MIN_SIZE=65536,DISABLE_CODEGEN_ROWS_THRESHOLD=0
E   Query Options (set by configuration and planner): 
ABORT_ON_ERROR=1,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,RUNTIME_FILTER_WAIT_TIME_MS=3,MT_DOP=0,RUNTIME_FILTER_MIN_SIZE=65536,DISABLE_CODEGEN_ROWS_THRESHOLD=0
E   Plan: 
E   
E   Max Per-Host Resource Reservation: Memory=19.00MB
E   Per-Host Resource Estimates: Memory=557.00MB
E   
E   F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
E   |  Per-Host Resources: mem-estimate=28.00MB mem-reservation=18.00MB 
runtime-filters-memory=1.00MB
E   PLAN-ROOT SINK
E   |  mem-estimate=0B mem-reservation=0B
E   |
E   05:AGGREGATE [FINALIZE]
E   |  output: count(*)
E   |  mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB
E   |  tuple-ids=7 row-size=8B cardinality=1
E   |
E   04:HASH JOIN [INNER JOIN, BROADCAST]
E   |  hash predicates: a.l_orderkey = -1 * l_orderkey
E   |  fk/pk conjuncts: assumed fk/pk
E   |  runtime filters: RF000[bloom] <- -1 * l_orderkey
E   |  mem-estimate=17.00MB mem-reservation=17.00MB spill-buffer=1.00MB
E   |  tuple-ids=0,4 row-size=16B cardinality=1
E   |
E   |--08:EXCHANGE [UNPARTITIONED]
E   |  |  mem-estimate=0B mem-reservation=0B
E   |  |  tuple-ids=4 row-size=8B cardinality=200
E   |  |
E   |  F05:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
E   |  Per-Host Resources: mem-estimate=0B mem-reservation=0B
E   |  07:EXCHANGE [UNPARTITIONED]
E   |  |  limit: 200
E   |  |  mem-estimate=0B mem-reservation=0B
E   |  |  tuple-ids=4 row-size=8B cardinality=200
E   |  |
E   |  F04:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
E   |  Per-Host Resources: mem-estimate=264.00MB mem-reservation=0B
E   |  01:UNION
E   |  |  pass-through-operands: all
E   |  |  limit: 200
E   |  |  mem-estimate=0B mem-reservation=0B
E   |  |  tuple-ids=4 row-size=8B cardinality=200
E   |  |
E   |  |--03:SCAN HDFS [tpch.lineitem, RANDOM]
E   |  | partitions=1/1 files=1 size=718.94MB
E   |  | stored statistics:
E   |  |   table: rows=6001215 size=718.94MB
E   |  |   columns: all
E   |  | extrapolated-rows=disabled
E   |  | mem-estimate=264.00MB mem-reservation=0B
E   |  | tuple-ids=3 row-size=8B cardinality=6001215
E   |  |
E   |  02:SCAN HDFS [tpch.lineitem, RANDOM]
E   | partitions=1/1 files=1 size=718.94MB
E   | stored statistics:
E   |   table: rows=6001215 size=718.94MB
E   |   columns: all
E   | extrapolated-rows=disabled
E   | mem-estimate=264.00MB mem-reservation=0B
E   | tuple-ids=2 row-size=8B cardinality=6001215
E   |
E   06:EXCHANGE [UNPARTITIONED]
E   |  limit: 1
E   |  mem-estimate=0B mem-reservation=0B
E   |  tuple-ids=0 row-size=8B cardinality=1
E   |
E   F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
E   Per-Host Resources: mem-estimate=265.

[jira] [Created] (IMPALA-6925) 'load-data functional-query exhaustive' failed: exception in load-data worker thread

2018-04-24 Thread David Knupp (JIRA)
David Knupp created IMPALA-6925:
---

 Summary: 'load-data functional-query exhaustive' failed: exception 
in load-data worker thread
 Key: IMPALA-6925
 URL: https://issues.apache.org/jira/browse/IMPALA-6925
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 2.13.0
Reporter: David Knupp


{noformat}
17:10:41 Error executing hive SQL: 
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-integration/repos/Impala/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-seq-snap-record.sql
 See: 
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-integration/repos/Impala/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-seq-snap-record.sql.log
Exception in thread Thread-6 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
  File "/usr/lib64/python2.6/threading.py", line 484, in run
  File "/usr/lib64/python2.6/multiprocessing/pool.py", line 68, in worker
  File 
"/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-integration/repos/Impala/bin/load-data.py",
 line 146, in exec_hive_query_from_file_beeline
: 'NoneType' object has no attribute 'info'
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6922) test_kudu_insert on exhaustive build

2018-04-24 Thread David Knupp (JIRA)
David Knupp created IMPALA-6922:
---

 Summary: test_kudu_insert on exhaustive build
 Key: IMPALA-6922
 URL: https://issues.apache.org/jira/browse/IMPALA-6922
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.1.0
Reporter: David Knupp
Assignee: Thomas Tauber-Marshall


Error Message
{noformat}
query_test/test_kudu.py:84: in test_kudu_insert
self.run_test_case('QueryTest/kudu_insert', vector, use_db=unique_database)
common/impala_test_suite.py:455: in run_test_case
pytest.config.option.update_results, result_section='DML_RESULTS')
common/test_result_verifier.py:404: in verify_raw_results
VERIFIER_MAP[verifier](expected, actual)
common/test_result_verifier.py:231: in verify_query_result_is_equal
assert expected_results == actual_results
E   assert Comparing QueryTestResults (expected vs actual):
E 1,1,1,'one',true,1,1,1,1987-05-19 00:00:00,0.1,1.00,1 != None
E Number of rows returned (expected vs actual): 1 != 0
{noformat}

Stacktrace
{noformat}
query_test/test_kudu.py:84: in test_kudu_insert
self.run_test_case('QueryTest/kudu_insert', vector, use_db=unique_database)
common/impala_test_suite.py:455: in run_test_case
pytest.config.option.update_results, result_section='DML_RESULTS')
common/test_result_verifier.py:404: in verify_raw_results
VERIFIER_MAP[verifier](expected, actual)
common/test_result_verifier.py:231: in verify_query_result_is_equal
assert expected_results == actual_results
E   assert Comparing QueryTestResults (expected vs actual):
E 1,1,1,'one',true,1,1,1,1987-05-19 00:00:00,0.1,1.00,1 != None
E Number of rows returned (expected vs actual): 1 != 0
{noformat}

Standard Error
{noformat}
SET sync_ddl=False;
-- executing against localhost:21000
DROP DATABASE IF EXISTS `test_kudu_insert_70eff904` CASCADE;

SET sync_ddl=False;
-- executing against localhost:21000
CREATE DATABASE `test_kudu_insert_70eff904`;

MainThread: Created database "test_kudu_insert_70eff904" for test ID 
"query_test/test_kudu.py::TestKuduOperations::()::test_kudu_insert[exec_option: 
{'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
'disable_codegen': False, 'abort_on_error': 1, 
'exec_single_node_rows_threshold': 0} | table_format: text/none]"
-- executing against localhost:21000
use test_kudu_insert_70eff904;

SET batch_size=0;
SET num_nodes=0;
SET disable_codegen_rows_threshold=0;
SET disable_codegen=False;
SET abort_on_error=1;
SET exec_single_node_rows_threshold=0;
-- executing against localhost:21000
create table tdata
  (id int primary key, valf float null, vali bigint null, valv string null,
   valb boolean null, valt tinyint null, vals smallint null, vald double null,
   ts timestamp, decimal4 decimal(9,9) null, decimal8 decimal(18,2) null,
   decimal16 decimal(38, 0) null)
  PARTITION BY RANGE (PARTITION VALUES < 10, PARTITION 10 <= VALUES < 30,
  PARTITION 30 <= VALUES) STORED AS KUDU;

-- executing against localhost:21000
insert into tdata values (1, 1, 1, 'one', true, 1, 1, 1,
  cast('1987-05-19 00:00:00' as timestamp), 0.1, 1.00, 1);

-- executing against localhost:21000
select * from tdata limit 1000;

MainThread: Comparing QueryTestResults (expected vs actual):
1,1,1,'one',true,1,1,1,1987-05-19 00:00:00,0.1,1.00,1 != None
Number of rows returned (expected vs actual): 1 != 0
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6921) AnalysisException: Failed to load metadata for table: 'tpch_kudu.ctas_cancel' during data load

2018-04-24 Thread David Knupp (JIRA)
David Knupp created IMPALA-6921:
---

 Summary: AnalysisException: Failed to load metadata for table: 
'tpch_kudu.ctas_cancel' during data load
 Key: IMPALA-6921
 URL: https://issues.apache.org/jira/browse/IMPALA-6921
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 2.13.0
Reporter: David Knupp


This exception seems to be consistently thrown during the data load phase. It 
appears in compute-table-stats.log.

{noformat}
2018-04-22 06:50:48,764 Thread-8:  Failed on table tpch_kudu.ctas_cancel
Traceback (most recent call last):
  File 
"/data/jenkins/workspace/impala-cdh6.0.0_beta1-core/repos/Impala/tests/util/compute_table_stats.py",
 line 40, in compute_stats_table
result = impala_client.execute(statement)
  File 
"/data/jenkins/workspace/impala-cdh6.0.0_beta1-core/repos/Impala/tests/beeswax/impala_beeswax.py",
 line 173, in execute
handle = self.__execute_query(query_string.strip(), user=user)
  File 
"/data/jenkins/workspace/impala-cdh6.0.0_beta1-core/repos/Impala/tests/beeswax/impala_beeswax.py",
 line 339, in __execute_query
handle = self.execute_query_async(query_string, user=user)
  File 
"/data/jenkins/workspace/impala-cdh6.0.0_beta1-core/repos/Impala/tests/beeswax/impala_beeswax.py",
 line 335, in execute_query_async
return self.__do_rpc(lambda: self.imp_service.query(query,))
  File 
"/data/jenkins/workspace/impala-cdh6.0.0_beta1-core/repos/Impala/tests/beeswax/impala_beeswax.py",
 line 460, in __do_rpc
raise ImpalaBeeswaxException(self.__build_error_message(b), b)
ImpalaBeeswaxException: ImpalaBeeswaxException:
 INNER EXCEPTION: 
 MESSAGE: AnalysisException: Failed to load metadata for table: 
'tpch_kudu.ctas_cancel'
CAUSED BY: TableLoadingException: Error loading metadata for Kudu table 
impala::tpch_kudu.ctas_cancel
CAUSED BY: ImpalaRuntimeException: Error opening Kudu table 
'impala::tpch_kudu.ctas_cancel', Kudu error: The table does not exist: 
table_name: "impala::tpch_kudu.ctas_cancel"
{noformat}

ctas_cancel is a table that gets used by query_test/test_cancellation.py

This doesn't seem to break anything (data loading completes and tests pass), 
but it's vexing that part of our standard data load process produces exceptions 
in any log file.
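
One possible mitigation, sketched here only as an illustration (this is not the 
actual compute_table_stats.py code): treat "table does not exist" failures for 
known-transient tables as warnings so they don't land in the log as tracebacks.
{noformat}
import logging

LOG = logging.getLogger("compute_table_stats")

def compute_stats_table(impala_client, db_name, table_name):
    statement = "COMPUTE STATS %s.%s" % (db_name, table_name)
    try:
        return impala_client.execute(statement)
    except Exception as e:
        # ctas_cancel-style tables can already be dropped by the time stats
        # run; log a warning instead of dumping a full traceback.
        if "does not exist" in str(e):
            LOG.warning("Skipping %s.%s: table no longer exists",
                        db_name, table_name)
            return None
        raise
{noformat}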

Please feel free to mark this as invalid if this is not really an issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6914) test_mem_limit in test_admission_controller timed out waiting for query to end

2018-04-23 Thread David Knupp (JIRA)
David Knupp created IMPALA-6914:
---

 Summary: test_mem_limit in test_admission_controller timed out 
waiting for query to end
 Key: IMPALA-6914
 URL: https://issues.apache.org/jira/browse/IMPALA-6914
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.0
Reporter: David Knupp


Error Message
{noformat}
AssertionError: Timed out waiting 60 seconds for query end assert 
(1524166375.299542 - 1524166315.228723) < 60  +  where 1524166375.299542 = 
time()
{noformat}

Stacktrace
{noformat}
custom_cluster/test_admission_controller.py:943: in test_mem_limit
{'request_pool': self.pool_name, 'mem_limit': query_mem_limit})
custom_cluster/test_admission_controller.py:837: in run_admission_test
self.end_admitted_queries(num_to_end)
custom_cluster/test_admission_controller.py:622: in end_admitted_queries
assert (time() - start_time < STRESS_TIMEOUT),\
E   AssertionError: Timed out waiting 60 seconds for query end
E   assert (1524166375.299542 - 1524166315.228723) < 60
E+  where 1524166375.299542 = time()
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6911) test_duplicate_partitions failing on recent S3 build

2018-04-23 Thread David Knupp (JIRA)
David Knupp created IMPALA-6911:
---

 Summary: test_duplicate_partitions failing on recent S3 build
 Key: IMPALA-6911
 URL: https://issues.apache.org/jira/browse/IMPALA-6911
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.0
Reporter: David Knupp


{noformat}
Stacktrace
metadata/test_recover_partitions.py:255: in test_duplicate_partitions
assert old_length + 1 == len(result.data),\
E   AssertionError: ALTER TABLE 
test_duplicate_partitions_4b6ed438.test_recover_partitions RECOVER PARTITIONS 
failed to handle duplicate partition key values.
E   assert (3 + 1) == 3
E+  where 3 = len(['1\tp1\t-1\t1\t2B\tNOT CACHED\tNOT 
CACHED\tTEXT\tfalse\ts3a://impala-cdh5-s3-test/test-warehouse/test_duplicate_parti...st-warehouse/test_duplicate_partitions_4b6ed438.db/test_recover_partitions/i=1/p=p4',
 'Total\t\t-1\t1\t2B\t0B\t\t\t\t'])
E+where ['1\tp1\t-1\t1\t2B\tNOT CACHED\tNOT 
CACHED\tTEXT\tfalse\ts3a://impala-cdh5-s3-test/test-warehouse/test_duplicate_parti...st-warehouse/test_duplicate_partitions_4b6ed438.db/test_recover_partitions/i=1/p=p4',
 'Total\t\t-1\t1\t2B\t0B\t\t\t\t'] = 
.data
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6910) test_seq_writer (in test_compressed_formats) failed on S3 build: "SdkClientException: Data read has a different length than the expected"

2018-04-23 Thread David Knupp (JIRA)
David Knupp created IMPALA-6910:
---

 Summary: test_seq_writer (in test_compressed_formats) failed on S3 
build: "SdkClientException: Data read has a different length than the expected"
 Key: IMPALA-6910
 URL: https://issues.apache.org/jira/browse/IMPALA-6910
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.0
Reporter: David Knupp
Assignee: Sailesh Mukil


{noformat}
Stacktrace
query_test/test_compressed_formats.py:149: in test_seq_writer
self.run_test_case('QueryTest/seq-writer', vector, unique_database)
common/impala_test_suite.py:397: in run_test_case
result = self.__execute_query(target_impalad_client, query, user=user)
common/impala_test_suite.py:612: in __execute_query
return impalad_client.execute(query, user=user)
common/impala_connection.py:160: in execute
return self.__beeswax_client.execute(sql_stmt, user=user)
beeswax/impala_beeswax.py:173: in execute
handle = self.__execute_query(query_string.strip(), user=user)
beeswax/impala_beeswax.py:341: in __execute_query
self.wait_for_completion(handle)
beeswax/impala_beeswax.py:361: in wait_for_completion
raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
E   ImpalaBeeswaxException: ImpalaBeeswaxException:
EQuery aborted:Disk I/O error: Error reading from HDFS file: 
s3a://impala-cdh5-s3-test/test-warehouse/tpcds.store_sales_parquet/ss_sold_date_sk=2452585/a5482dcb946b6c98-7543e0dd0004_95929617_data.0.parq
E   Error(255): Unknown error 255
E   Root cause: SdkClientException: Data read has a different length than the 
expected: dataLength=8576; expectedLength=17785; includeSkipped=true; 
in.getClass()=class com.amazonaws.services.s3.AmazonS3Client$2; 
markedSupported=false; marked=0; resetSinceLastMarked=false; markCount=0; 
resetCount=0
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6906) test_admission_controller.TestAdmissionController.test_memory_rejection on S3

2018-04-23 Thread David Knupp (JIRA)
David Knupp created IMPALA-6906:
---

 Summary: 
test_admission_controller.TestAdmissionController.test_memory_rejection on S3
 Key: IMPALA-6906
 URL: https://issues.apache.org/jira/browse/IMPALA-6906
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 2.13.0, Impala 3.1.0
Reporter: David Knupp
Assignee: Tim Armstrong


{noformat}
Stacktrace
custom_cluster/test_admission_controller.py:402: in test_memory_rejection
self.run_test_case('QueryTest/admission-reject-mem-estimate', vector)
common/impala_test_suite.py:444: in run_test_case
verify_runtime_profile(test_section['RUNTIME_PROFILE'], 
result.runtime_profile)
common/test_result_verifier.py:560: in verify_runtime_profile
actual))
E   AssertionError: Did not find matches for lines in runtime profile:
E   EXPECTED LINES:
E   row_regex: .*Per-Host Resource Estimates: Memory=90.00MB.*
E
E   ACTUAL PROFILE:
E   Query (id=9f4cdd224745b688:b89cfdc3):
E DEBUG MODE WARNING: Query profile created while running a DEBUG build of 
Impala. Use RELEASE builds to measure query performance.
E Summary:
E   Session ID: 2e4d7150362474cb:8dcbaecea87bb80
E   Session Type: BEESWAX
E   Start Time: 2018-04-21 02:01:35.027023000
E   End Time: 
E   Query Type: QUERY
E   Query State: FINISHED
E   Query Status: OK
E   Impala Version: impalad version 3.0.0-SNAPSHOT DEBUG (build 
b68e06997c1f49f6b723d78e217efddec4f56f3a)
E   User: jenkins
E   Connected User: jenkins
E   Delegated User: 
E   Network Address: 127.0.0.1:33892
E   Default Db: functional
E   Sql Statement: select min(l_comment) from tpch_parquet.lineitem
E   Coordinator: 
E   Query Options (set by configuration): 
ABORT_ON_ERROR=1,NUM_NODES=1,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,DISABLE_CODEGEN_ROWS_THRESHOLD=5000,MAX_MEM_ESTIMATE_FOR_ADMISSION=10485760
E   Query Options (set by configuration and planner): 
ABORT_ON_ERROR=1,NUM_NODES=1,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,MT_DOP=0,DISABLE_CODEGEN_ROWS_THRESHOLD=5000,MAX_MEM_ESTIMATE_FOR_ADMISSION=10485760
E   Plan: 
E   
E   Max Per-Host Resource Reservation: Memory=0B
E   Per-Host Resource Estimates: Memory=50.00MB
{noformat}





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6902) query_test.test_udfs.TestUdfExecution.test_native_functions_race failed during core/thrift build

2018-04-21 Thread David Knupp (JIRA)
David Knupp created IMPALA-6902:
---

 Summary: 
query_test.test_udfs.TestUdfExecution.test_native_functions_race failed during 
core/thrift build
 Key: IMPALA-6902
 URL: https://issues.apache.org/jira/browse/IMPALA-6902
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 2.12.0
Reporter: David Knupp
Assignee: Vuk Ercegovac


Assigning to Vuk, who authored this test as part of the patch for IMPALA-6488:

https://gerrit.cloudera.org/c/9626/

I'm not sure that this is really the same failure though, so I'm not reopening 
that earlier JIRA. If I'm mistaken, please feel free to reopen/reassign as 
necessary.

 

Stacktrace
{noformat}
query_test/test_udfs.py:377: in test_native_functions_race
assert len(errors) == 0
E   assert 1 == 0
E   + where 1 = len([ImpalaBeeswaxException()])
{noformat}

Standard Output
{noformat}
ImpalaBeeswaxException:
 INNER EXCEPTION: 
 MESSAGE: ImpalaRuntimeException: Error making 'alterDatabase' RPC to Hive 
Metastore: 
CAUSED BY: NoSuchObjectException: test_native_functions_race_fc9680e5: 
Transaction rolled back due to failure during commit
{noformat}
 

 

Standard Error
{noformat}
SET sync_ddl=False;
-- executing against localhost:21000
DROP DATABASE IF EXISTS `test_native_functions_race_fc9680e5` CASCADE;
SET sync_ddl=False;
-- executing against localhost:21000
CREATE DATABASE `test_native_functions_race_fc9680e5`;
MainThread: Created database "test_native_functions_race_fc9680e5" for test ID 
"query_test/test_udfs.py::TestUdfExecution::()::test_native_functions_race[exec_option:
 {'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'exec_single_node_rows_threshold': 0, 'enable_expr_rewrites': True} | 
table_format: text/none]"
-- connecting to: localhost:21000
-- executing against localhost:21000
create function test_native_functions_race_fc9680e5.use_it(string) returns 
string
 LOCATION '/test-warehouse/libTestUdfs.so'
 SYMBOL='_Z8IdentityPN10impala_udf15FunctionContextERKNS_9StringValE';
{noformat}
 

 

From catalogd log:

 
{noformat}
I0420 14:54:00.014191 19585 jni-util.cc:230] 
org.apache.impala.common.ImpalaRuntimeException: Error making 'alterDatabase' 
RPC to Hive Metastore: at 
org.apache.impala.service.CatalogOpExecutor.applyAlterDatabase(CatalogOpExecutor.java:2770)
 at 
org.apache.impala.service.CatalogOpExecutor.dropFunction(CatalogOpExecutor.java:1521)
 at 
org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:307)
 at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:146) Caused 
by: NoSuchObjectException(message:test_native_functions_race_fc9680e5: 
Transaction rolled back due to failure during commit) at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_database_result$alter_database_resultStandardScheme.read(ThriftHiveMetastore.java:20111)
 at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_database_result$alter_database_resultStandardScheme.read(ThriftHiveMetastore.java:20088)
 at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_database_result.read(ThriftHiveMetastore.java:20030)
 at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86) at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_alter_database(ThriftHiveMetastore.java:814)
 at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.alter_database(ThriftHiveMetastore.java:800)
 at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alterDatabase(HiveMetaStoreClient.java:1420)
 at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606) at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:101)
 at com.sun.proxy.$Proxy5.alterDatabase(Unknown Source) at 
org.apache.impala.service.CatalogOpExecutor.applyAlterDatabase(CatalogOpExecutor.java:2768)
 ... 3 more I0420 14:54:01.581200 3230 catalog-server.cc:245] A catalog update 
with 5 entries is assembled. Catalog version: 14385 Last sent catalog version: 
14377 I0420 14:54:01.582358 3225 catalog-server.cc:480] Collected deletion: 
FUNCTION:TFunctionName(db_name:test_native_functions_race_fc9680e5, 
function_name:other)(other(FLOAT)), v ersion=14387, original size=310, 
compressed size=265 I0420 14:54:01.582433 3225 catalog-server.cc:480] Collected 
deletion: FUNCTION:TFunctionName(db_name:test_native_functions_race_fc9680e5, 
function_name:other)(other(FLOAT)), v ersion=14389, original size=310, 
compressed size=265 I0420 14:54:01.582499 3225 catalog-server.cc:480] Collected 
deletion: FUNCTION:TFunctionName(db_name:test_native_functions_race_fc9680e5, 
function_name:other)(other(FLOAT)), v ersion=14391, original size=310, 
compressed size=265 I0420 14:54:01.582567 3225 catalog-server.cc:480] Colle

[jira] [Resolved] (IMPALA-6761) delimited-text-parser-test fails in ASAN build

2018-04-07 Thread David Knupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-6761.
-
   Resolution: Fixed
Fix Version/s: Impala 2.13.0

> delimited-text-parser-test fails in ASAN build
> --
>
> Key: IMPALA-6761
> URL: https://issues.apache.org/jira/browse/IMPALA-6761
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.12.0
>Reporter: Michael Ho
>Assignee: Zach Amsden
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 2.13.0
>
>
> Hi [~zamsden], could this be related to your recent change to fix IMPALA-6389 
> ?
> {noformat}
> 03:26:07 [ RUN  ] DelimitedTextParser.SpecialDelimiters
> 03:26:07 =
> 03:26:07 ==14342==ERROR: AddressSanitizer: stack-buffer-overflow on address 
> 0x7fff33da29c1 at pc 0x0141f344 bp 0x7fff33da1d20 sp 0x7fff33da1d18
> 03:26:07 READ of size 1 at 0x7fff33da29c1 thread T0
> 03:26:07 #0 0x141f343 in 
> impala::DelimitedTextParser::ReturnCurrentColumn() const 
> /data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/src/exec/delimited-text-parser.h:114:39
> 03:26:07 #1 0x141bf49 in impala::Status 
> impala::DelimitedTextParser::AddColumn(long, char**, int*, 
> impala::FieldLocation*) 
> /data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/src/exec/delimited-text-parser.inline.h:62:7
> 03:26:07 #2 0x1419517 in 
> impala::DelimitedTextParser::ParseFieldLocations(int, long, char**, 
> char**, impala::FieldLocation*, int*, int*, char**) 
> /data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/src/exec/delimited-text-parser.cc:194:43
> 03:26:07 #3 0x13f8ed7 in 
> impala::Validate(impala::DelimitedTextParser*, std::string const&, int, 
> char, int, int) 
> /data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/src/exec/delimited-text-parser-test.cc:57:15
> 03:26:07 #4 0x13fb274 in 
> impala::DelimitedTextParser_SpecialDelimiters_Test::TestBody() 
> /data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/src/exec/delimited-text-parser-test.cc:211:3
> 03:26:07 #5 0x3f3fc52 in void 
> testing::internal::HandleExceptionsInMethodIfSupported void>(testing::Test*, void (testing::Test::*)(), char const*) 
> (/data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/build/debug/exec/delimited-text-parser-test+0x3f3fc52)
> 03:26:07 #6 0x3f375a9 in testing::Test::Run() 
> (/data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/build/debug/exec/delimited-text-parser-test+0x3f375a9)
> 03:26:07 #7 0x3f376f7 in testing::TestInfo::Run() 
> (/data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/build/debug/exec/delimited-text-parser-test+0x3f376f7)
> 03:26:07 #8 0x3f377d4 in testing::TestCase::Run() 
> (/data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/build/debug/exec/delimited-text-parser-test+0x3f377d4)
> 03:26:07 #9 0x3f38a57 in testing::internal::UnitTestImpl::RunAllTests() 
> (/data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/build/debug/exec/delimited-text-parser-test+0x3f38a57)
> 03:26:07 #10 0x3f38d32 in testing::UnitTest::Run() 
> (/data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/build/debug/exec/delimited-text-parser-test+0x3f38d32)
> 03:26:07 #11 0x13fb927 in main 
> /data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/src/exec/delimited-text-parser-test.cc:221:192
> 03:26:07 #12 0x7fdc3ec02cdc in __libc_start_main 
> (/lib64/libc.so.6+0x1ecdc)
> 03:26:07 #13 0x13064a0 in _start 
> (/data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/build/debug/exec/delimited-text-parser-test+0x13064a0)
> 03:26:07 
> 03:26:07 Address 0x7fff33da29c1 is located in stack of thread T0 at offset 33 
> in frame
> 03:26:07 #0 0x13fa74f in 
> impala::DelimitedTextParser_SpecialDelimiters_Test::TestBody() 
> /data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/src/exec/delimited-text-parser-test.cc:149
> 03:26:07 
> 03:26:07   This frame has 56 object(s):
> 03:26:07 [32, 33) 'is_materialized_col' <== Memory access at offset 33 
> overflows this variable
> 03:26:07 [48, 208) 'tuple_delim_parser'
> 03:26:07 [272, 432) 'nul_delim_parser'
> 03:26:07 [496, 656) 'nul_field_parser'
> 03:26:07 [720, 728) 'ref.tmp'
> 03:26:07 [752, 753) 'ref.tmp4'
> 03:26:07 [768, 776) 'ref.tmp5'
> 03:26:07 [800, 801) 'ref.tmp6'
> 03:26:07 [816, 824) 'ref.tmp7'
> 03:26:07 [848, 849) 'ref.tmp8'
> 03:26:07 [864, 872) 'ref.tmp9'
> 03:26:07 [896, 897) 'ref.tmp10'
> 03:26:07 [912, 920) 'ref.tmp11'
> 03:26:07 [944, 945) 'ref.tmp12'
> 03:26:07 [960, 968) 'ref.tmp13'
> 03:26:07   

[jira] [Created] (IMPALA-6814) query_test.test_queries.TestQueriesTextTables.test_strict_mode failing on remote clusters

2018-04-05 Thread David Knupp (JIRA)
David Knupp created IMPALA-6814:
---

 Summary: 
query_test.test_queries.TestQueriesTextTables.test_strict_mode failing on 
remote clusters
 Key: IMPALA-6814
 URL: https://issues.apache.org/jira/browse/IMPALA-6814
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 2.12.0
Reporter: David Knupp


It looks like {{localhost}} is hardcoded in the test verification.
h3. Stacktrace

query_test/test_queries.py:161: in test_strict_mode
 self.run_test_case('QueryTest/strict-mode', vector)
common/impala_test_suite.py:427: in run_test_case
 self.__verify_results_and_errors(vector, test_section, result, use_db)
common/impala_test_suite.py:300: in __verify_results_and_errors
 replace_filenames_with_placeholder)
common/test_result_verifier.py:317: in verify_raw_results
 verify_errors(expected_errors, actual_errors)
common/test_result_verifier.py:274: in verify_errors
 VERIFIER_MAP['VERIFY_IS_EQUAL'](expected, actual)
common/test_result_verifier.py:231: in verify_query_result_is_equal
 assert expected_results == actual_results
E assert Comparing QueryTestResults (expected vs actual):
[...]
E row_regex: .*Error parsing row: file: 
hdfs://localhost:20500/.* before offset: \d+ != 'Error 
parsing row: file: 
hdfs://impala-ubuntu1404-cluster-1.vpc.cloudera.com:8020/test-warehouse/overflow/overflow.txt,
 before offset: 454'
E row_regex: .*Error parsing row: file: 
hdfs://localhost:20500/.* before offset: \d+ != 'Error 
parsing row: file: 
hdfs://impala-ubuntu1404-cluster-1.vpc.cloudera.com:8020/test-warehouse/overflow/overflow.txt,
 before offset: 454'
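
A sketch of the general direction for a fix, with illustrative names (the test 
framework may already have its own substitution hook for this): derive the 
expected row_regex from the cluster's configured namenode rather than hardcoding 
localhost:20500.
{noformat}
import os
import re

# Mini-cluster default; a remote run would export its real namenode address.
NAMENODE = os.environ.get("NAMENODE_ADDRESS", "localhost:20500")

EXPECTED = (r".*Error parsing row: file: hdfs://%s/.* before offset: \d+"
            % re.escape(NAMENODE))

actual = ("Error parsing row: file: hdfs://%s/test-warehouse/overflow/"
          "overflow.txt, before offset: 454" % NAMENODE)
print(bool(re.search(EXPECTED, actual)))  # True on any cluster
{noformat}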



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6810) query_test::test_runtime_filters.py::test_row_filters fails when run against an external cluster

2018-04-05 Thread David Knupp (JIRA)
David Knupp created IMPALA-6810:
---

 Summary: query_test::test_runtime_filters.py::test_row_filters 
fails when run against an external cluster
 Key: IMPALA-6810
 URL: https://issues.apache.org/jira/browse/IMPALA-6810
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 2.12.0
Reporter: David Knupp
Assignee: Alexander Behm


Presumably this test has been passing when run against the local mini-cluster. 
When run against an external cluster, however, the test fails with an 
AssertionError because the exception string is different than expected.

The expected string is:

_ImpalaBeeswaxException: INNER EXCEPTION:  MESSAGE: Rejected query from pool 
{color:red}default-pool{color}: minimum memory reservation is greater than 
memory available to the query for buffer reservations. Increase the 
buffer_pool_limit to 290.00 MB. See the query profile for more information 
about the per-node memory requirements._

The actual string is:

ImpalaBeeswaxException: INNER EXCEPTION:  MESSAGE: Rejected query from pool 
{color:red}root.jenkins{color}: minimum memory reservation is greater than 
memory available to the query for buffer reservations. Increase the 
buffer_pool_limit to 290.00 MB. See the query profile for more information 
about the per-node memory requirements.

{noformat}
Stacktrace
query_test/test_runtime_filters.py:168: in test_row_filters
test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS' : str(WAIT_TIME_MS)})
common/impala_test_suite.py:401: in run_test_case
self.__verify_exceptions(test_section['CATCH'], str(e), use_db)
common/impala_test_suite.py:279: in __verify_exceptions
(expected_str, actual_str)
E   AssertionError: Unexpected exception string. Expected: 
ImpalaBeeswaxException: INNER EXCEPTION:  MESSAGE: Rejected query from pool 
default-pool: minimum memory reservation is greater than memory available to 
the query for buffer reservations. Increase the buffer_pool_limit to 290.00 MB. 
See the query profile for more information about the per-node memory 
requirements.
E   Not found in actual: ImpalaBeeswaxException: INNER EXCEPTION:  MESSAGE: Rejected query from pool 
root.jenkins: minimum memory reservation is greater than memory available to 
the query for buffer reservations. Increase the buffer_pool_limit to 290.00 MB. 
See the query profile for more information about the per-node memory 
requirements.
{noformat}
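
One way to make the verification cluster-agnostic, sketched only as an 
illustration (not the actual test code): match the pool name with a pattern 
rather than the literal default-pool.
{noformat}
import re

EXPECTED = (r"Rejected query from pool \S+: minimum memory reservation is "
            r"greater than memory available to the query for buffer "
            r"reservations\. Increase the buffer_pool_limit to 290\.00 MB\.")

messages = [
    "Rejected query from pool default-pool: minimum memory reservation is "
    "greater than memory available to the query for buffer reservations. "
    "Increase the buffer_pool_limit to 290.00 MB. See the query profile for "
    "more information about the per-node memory requirements.",
    "Rejected query from pool root.jenkins: minimum memory reservation is "
    "greater than memory available to the query for buffer reservations. "
    "Increase the buffer_pool_limit to 290.00 MB. See the query profile for "
    "more information about the per-node memory requirements.",
]
for msg in messages:
    print(bool(re.search(EXPECTED, msg)))  # True for both pool names
{noformat}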




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IMPALA-6753) Update external Hadoop ecosystem versions

2018-04-05 Thread David Knupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-6753.
-
Resolution: Fixed

> Update external Hadoop ecosystem versions
> -
>
> Key: IMPALA-6753
> URL: https://issues.apache.org/jira/browse/IMPALA-6753
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Reporter: David Knupp
>Priority: Major
>
> Analogous to IMPALA-6272
> Updating the external Hadoop components on the mini-cluster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6808) Impala python code should be installed into infra/python/env as packages

2018-04-04 Thread David Knupp (JIRA)
David Knupp created IMPALA-6808:
---

 Summary: Impala python code should be installed into 
infra/python/env as packages
 Key: IMPALA-6808
 URL: https://issues.apache.org/jira/browse/IMPALA-6808
 Project: IMPALA
  Issue Type: Improvement
  Components: Clients, Infrastructure
Affects Versions: Impala 3.0, Impala 2.12.0
Reporter: David Knupp
Assignee: David Knupp


Impala/infra/python/env is the environment where necessary upstream python 
libraries and packages get installed -- e.g., the packages listed in 
https://github.com/apache/impala/blob/master/infra/python/deps/requirements.txt 
and other similar files.

Impala's own internal python code (like the impala-shell, or the common test 
libraries that we rely upon) should be made available the same way -- as actual 
packages installed into the environment -- rather than resorting to 
PYTHONPATH/sys.path.append(foo) sleight-of-hand, performed by scripts such as 
bin/set-pythonpath.sh.
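
A minimal sketch of the packaging direction proposed here; the project name and 
package list are hypothetical, not the actual Impala layout.
{noformat}
# setup.py -- hedged sketch; names are illustrative.
from setuptools import setup, find_packages

setup(
    name="impala-test-common",
    version="0.1.0",
    description="Shared Impala test libraries, installed as a package "
                "instead of being reached via PYTHONPATH tricks",
    packages=find_packages(include=["common", "common.*",
                                    "beeswax", "beeswax.*"]),
)
{noformat}
With something along these lines in place, a "pip install -e ." into 
infra/python/env would replace the bin/set-pythonpath.sh step, and the 
impala-shell could expose its entry point the same way.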





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6753) Update external Hadoop ecosystem versions

2018-03-27 Thread David Knupp (JIRA)
David Knupp created IMPALA-6753:
---

 Summary: Update external Hadoop ecosystem versions
 Key: IMPALA-6753
 URL: https://issues.apache.org/jira/browse/IMPALA-6753
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Reporter: David Knupp


Analogous to IMPALA-6272

Updating the external Hadoop components on the mini-cluster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (IMPALA-6716) ImpalaShell should not rely on global access to parsed command line options

2018-03-22 Thread David Knupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp closed IMPALA-6716.
---
   Resolution: Fixed
Fix Version/s: Impala 3.0

> ImpalaShell should not rely on global access to parsed command line options
> ---
>
> Key: IMPALA-6716
> URL: https://issues.apache.org/jira/browse/IMPALA-6716
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Major
> Fix For: Impala 3.0
>
>
> A recent patch to address a problem with line breaks in LDAP passwords 
> (IMPALA-6610) can, in rare instances (e.g., when running the shell as an 
> installed python package), result in an exception being thrown if the call to 
> {{_connect()}} fails.
> {noformat}
> $ impala-shell -i foo
> Starting Impala Shell without Kerberos authentication
> Traceback (most recent call last):
>  File "/home/systest/shellenv/bin/impala-shell", line 11, in 
>  load_entry_point('impala-shell', 'console_scripts', 'impala-shell')()
>  File 
> "/home/systest/Impala/shell/packaging/staging/impala_shell/impala_shell.py", 
> line 1588, in main
>  shell = ImpalaShell(options, query_options)
>  File 
> "/home/systest/Impala/shell/packaging/staging/impala_shell/impala_shell.py", 
> line 209, in __init__
>  self.do_connect(options.impalad)
>  File 
> "/home/systest/Impala/shell/packaging/staging/impala_shell/impala_shell.py", 
> line 755, in do_connect
>  self._connect()
>  File 
> "/home/systest/Impala/shell/packaging/staging/impala_shell/impala_shell.py", 
> line 821, in _connect
>  if options.ldap_password_cmd and \
> NameError: global name 'options' is not defined
> {noformat}
> The error is actually in the connection failure handling code:
> https://github.com/apache/impala/blob/master/shell/impala_shell.py#L821
> The problem is that the shell instance should not assume continued access to 
> the options returned from {{parser.parse_args()}}. In most cases, we store 
> those values directly as member variables of the shell. We should do the same 
> with all LDAP-related values, and then access those member variables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6716) ImpalaShell should not rely on global access to parsed command line options

2018-03-21 Thread David Knupp (JIRA)
David Knupp created IMPALA-6716:
---

 Summary: ImpalaShell should not rely on global access to parsed 
command line options
 Key: IMPALA-6716
 URL: https://issues.apache.org/jira/browse/IMPALA-6716
 Project: IMPALA
  Issue Type: Bug
  Components: Clients
Affects Versions: Impala 3.0, Impala 2.12.0
Reporter: David Knupp


A recent patch to address a problem with line breaks in LDAP passwords 
(IMPALA-6610) can, in rare instances, result in an exception being thrown if 
the call to {{_connect()}} fails.

{noformat}
$ impala-shell -i foo
Starting Impala Shell without Kerberos authentication
Traceback (most recent call last):
 File "/home/systest/shellenv/bin/impala-shell", line 11, in 
 load_entry_point('impala-shell', 'console_scripts', 'impala-shell')()
 File 
"/home/systest/Impala/shell/packaging/staging/impala_shell/impala_shell.py", 
line 1588, in main
 shell = ImpalaShell(options, query_options)
 File 
"/home/systest/Impala/shell/packaging/staging/impala_shell/impala_shell.py", 
line 209, in __init__
 self.do_connect(options.impalad)
 File 
"/home/systest/Impala/shell/packaging/staging/impala_shell/impala_shell.py", 
line 755, in do_connect
 self._connect()
 File 
"/home/systest/Impala/shell/packaging/staging/impala_shell/impala_shell.py", 
line 821, in _connect
 if options.ldap_password_cmd and \
NameError: global name 'options' is not defined
{noformat}

The error is actually in the connection failure handling code.

The problem is that the shell instance should not assume continued access to 
the options returned from {{parser.parse_args()}}. In most cases, we store 
those values directly as member variables of the shell. We should do the same 
with all LDAP-related values, and then access those member variables.
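
A minimal sketch of that approach (the attribute and helper names here are 
illustrative, not the exact ones in impala_shell.py):
{noformat}
# Sketch only: copy the parsed option values onto the shell instance in
# __init__, then reference the member variables instead of a module-level
# 'options' object later on (e.g. in _connect()).
import subprocess

class ImpalaShell(object):
  def __init__(self, options, query_options):
    self.query_options = query_options
    # Keep our own copies of the LDAP-related settings.
    self.ldap_password = getattr(options, 'ldap_password', None)
    self.ldap_password_cmd = getattr(options, 'ldap_password_cmd', None)

  def _connect(self):
    # Previously this read the global 'options', which is not defined when the
    # shell runs as an installed package.
    if self.ldap_password_cmd and not self.ldap_password:
      self.ldap_password = self._read_password_from_cmd(self.ldap_password_cmd)

  def _read_password_from_cmd(self, cmd):
    # Hypothetical helper; the real shell runs the configured command and
    # strips the trailing newline from its output.
    return subprocess.check_output(cmd, shell=True).decode('utf-8').rstrip('\n')
{noformat}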



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6702) Consider using standard pip client to download python dependencies

2018-03-19 Thread David Knupp (JIRA)
David Knupp created IMPALA-6702:
---

 Summary: Consider using standard pip client to download python 
dependencies
 Key: IMPALA-6702
 URL: https://issues.apache.org/jira/browse/IMPALA-6702
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Affects Versions: Impala 3.0, Impala 2.12.0
Reporter: David Knupp


Impala currently uses a hand-rolled client to download python dependencies:
 
[https://github.com/apache/impala/blob/master/infra/python/deps/pip_download.py]

This client skips the install step, adds automatic retries, and avoids trying 
to use wheel packages. However, the standard pip client does all of these 
things as well. Sometimes, upstream changes to PyPI can cause this custom 
client to break, most recently in IMPALA-6682 and IMPALA-6695. Perhaps Impala 
should consider dropping pip_download.py in favor of using the public pip 
client. (Kudu-python presents a problem though – see below.)

A quick test showed some minor differences between pip_download.py and pip 
v9.0.2 when processing the various requirements.txt files at 
[https://github.com/apache/impala/tree/master/infra/python/deps]. The pip 
command used was:
{noformat}
$ pip download --dest=$IMPALA_HOME/infra/python/deps --no-binary=:all: 
--no-deps -r $IMPALA_HOME/infra/python/deps/*requirements.txt
{noformat}
For requirements.txt:
 * pip v9.0.2
 ** Ignores readline: markers 'sys_platform == "darwin"' don't match your 
environment
 ** Downloads prettytable-0.7.2.zip
 ** Downloads pyparsing-2.0.3.zip
 * pip_download.py
 ** Downloads readline-6.2.4.1.tar.gz
 ** Downloads prettytable-0.7.2.tar.bz2
 ** Downloads pyparsing-2.0.3.tar.gz

For compiled-requirements.txt
 * pip v9.0.2
 ** Downloads Cython-0.23.4.zip
 ** Downloads numpy-1.10.4.zip
 * pip_download.py
 ** Downloads Cython-0.23.4.tar.gz
 ** Downloads numpy-1.10.4.tar.gz

For adls-requirements.txt
 * no difference

 

Unfortunately, kudu-requirements.txt, which contains only one dependency 
({{kudu-python==1.2.0}}), is problematic. Even with pip's {{download}} command, 
the package's setup.py still gets run (for egg_info) and fails:

 
{noformat}
Using cached kudu-python-1.2.0.tar.gz
 Saved ./kudu-python-1.2.0.tar.gz
 Complete output from command python setup.py egg_info:
 Cannot find installed kudu client.

Command "python setup.py egg_info" failed with error code 1 in 
/tmp/pip-build-N1Mu7y/kudu-python/
{noformat}
We would need to either figure out why this is happening (maybe it's a bug in 
the Kudu setup.py file), or else find a workaround for this one package.
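
One possible workaround, sketched below purely as an assumption (I have not 
verified that kudu-python's setup.py actually honors KUDU_HOME), would be to 
special-case this one package and point it at a local Kudu client build before 
invoking the standard pip client:
{noformat}
# Hypothetical special case for kudu-python. Assumption: its setup.py needs a
# Kudu client install (e.g. located via a KUDU_HOME environment variable) just
# to run egg_info, so we set that up before calling pip.
import os
import subprocess

def download_kudu_python(dest_dir, kudu_home):
  env = dict(os.environ)
  env['KUDU_HOME'] = kudu_home  # assumed location of a Kudu client build
  subprocess.check_call(
      ['pip', 'download', '--no-deps', '--no-binary=:all:',
       '--dest', dest_dir, 'kudu-python==1.2.0'],
      env=env)
{noformat}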

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6570) Remote cluster test fails on SLES12/SP3 -- "Kudu not supported on this operating system"

2018-02-23 Thread David Knupp (JIRA)
David Knupp created IMPALA-6570:
---

 Summary: Remote cluster test fails on SLES12/SP3 -- "Kudu not 
supported on this operating system"
 Key: IMPALA-6570
 URL: https://issues.apache.org/jira/browse/IMPALA-6570
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 2.12.0
Reporter: David Knupp
Assignee: Thomas Tauber-Marshall


The cluster has Kudu installed. When trying to load the standard data for the 
Impala functional test suite onto the cluster, I see:

{noformat}
INSERT INTO TABLE tpch_kudu.lineitem SELECT * FROM tpch.lineitem

Data Loading from Impala failed with error: ImpalaBeeswaxException:
 INNER EXCEPTION: 
 MESSAGE: [Errno 104] Connection reset by peer
Traceback (most recent call last):
  File 
"/data/jenkins/workspace/impala_private_remote_cluster_tests/impala_remote_cluster_tests/Impala-local/bin/load-data.py",
 line 179, in exec_impala_query_from_file
result = impala_client.execute(query)
  File 
"/data/jenkins/workspace/impala_private_remote_cluster_tests/impala_remote_cluster_tests/Impala-local/tests/beeswax/impala_beeswax.py",
 line 173, in execute
handle = self.__execute_query(query_string.strip(), user=user)
  File 
"/data/jenkins/workspace/impala_private_remote_cluster_tests/impala_remote_cluster_tests/Impala-local/tests/beeswax/impala_beeswax.py",
 line 341, in __execute_query
self.wait_for_completion(handle)
  File 
"/data/jenkins/workspace/impala_private_remote_cluster_tests/impala_remote_cluster_tests/Impala-local/tests/beeswax/impala_beeswax.py",
 line 353, in wait_for_completion
query_state = self.get_state(query_handle)
  File 
"/data/jenkins/workspace/impala_private_remote_cluster_tests/impala_remote_cluster_tests/Impala-local/tests/beeswax/impala_beeswax.py",
 line 370, in get_state
return self.__do_rpc(lambda: self.imp_service.get_state(query_handle))
  File 
"/data/jenkins/workspace/impala_private_remote_cluster_tests/impala_remote_cluster_tests/Impala-local/tests/beeswax/impala_beeswax.py",
 line 467, in __do_rpc
raise ImpalaBeeswaxException(self.__build_error_message(u), u)
ImpalaBeeswaxException: ImpalaBeeswaxException:
 INNER EXCEPTION: 
 MESSAGE: [Errno 104] Connection reset by peer
{noformat}

In /var/log/impalad/impalad.INFO, I see:
{noformat}
I0222 21:47:03.208997  9522 init.cc:241] Process ID: 9522
I0222 21:47:04.432262  9522 status.cc:53] Kudu is not supported on this 
operating system.
@   0x961162  impala::Status::Status()
@   0x9645c5  impala::CheckKuduAvailability()
@   0x964602  impala::KuduIsAvailable()
@   0x95c761  impala::InitCommonRuntime()
@   0xbce0b3  ImpaladMain()
@   0x8e8523  main
@ 0x7f70fdce96e5  __libc_start_main
@   0x92f529  _start
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IMPALA-6486) INVALIDATE METADATA may hang after statestore restart

2018-02-09 Thread David Knupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-6486.
-
   Resolution: Fixed
Fix Version/s: Impala 2.12.0

Confirmed via manual test that the patch was successful.

> INVALIDATE METADATA may hang after statestore restart
> -
>
> Key: IMPALA-6486
> URL: https://issues.apache.org/jira/browse/IMPALA-6486
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.11.0
>Reporter: Dimitris Tsirogiannis
>Assignee: Dimitris Tsirogiannis
>Priority: Blocker
>  Labels: hang
> Fix For: Impala 2.12.0
>
>
> In some cases, INVALIDATE METADATA may hang if it is run after a statestore 
> restart.  This was caused by the fix for IMPALA-5058. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6350) "split-hbase.sh failed during data load for CDH5.14.0 exhaustive"

2017-12-21 Thread David Knupp (JIRA)
David Knupp created IMPALA-6350:
---

 Summary: "split-hbase.sh failed during data load for CDH5.14.0 
exhaustive"
 Key: IMPALA-6350
 URL: https://issues.apache.org/jira/browse/IMPALA-6350
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 2.11.0
Reporter: David Knupp


This affects test setup, but not the product.
{noformat}
21:24:39 Splitting HBase (logging to 
/data/jenkins/workspace/impala-cdh5-2.11.0_5.14.0-exhaustive/repos/Impala/logs/data_loading/create-hbase.log)...
 
21:27:32 FAILED (Took: 2 min 53 sec)
21:27:32 
'/data/jenkins/workspace/impala-cdh5-2.11.0_5.14.0-exhaustive/repos/Impala/testdata/bin/split-hbase.sh'
 failed. Tail of log:
21:27:32at 
org.apache.hadoop.hbase.client.HBaseAdmin.split(HBaseAdmin.java:2733)
21:27:32at 
org.apache.hadoop.hbase.client.HBaseAdmin.splitRegion(HBaseAdmin.java:2693)
21:27:32at 
org.apache.hadoop.hbase.client.HBaseAdmin.split(HBaseAdmin.java:2714)
21:27:32at 
org.apache.hadoop.hbase.client.HBaseAdmin.split(HBaseAdmin.java:2703)
21:27:32at 
org.apache.impala.datagenerator.HBaseTestDataRegionAssigment.performAssigment(HBaseTestDataRegionAssigment.java:112)
21:27:32at 
org.apache.impala.datagenerator.HBaseTestDataRegionAssigment.main(HBaseTestDataRegionAssigment.java:312)
21:27:32 Caused by: 
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.NotServingRegionException):
 org.apache.hadoop.hbase.NotServingRegionException: Region 
functional_hbase.alltypessmall,1,1513574812814.1d8e718f14ccea2c2f5c6724f023558d.
 is not online on localhost,16201,1513565166378
21:27:32at 
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2997)
21:27:32at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1055)
21:27:32at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.splitRegion(RSRpcServices.java:1853)
21:27:32at 
org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:22247)
21:27:32at 
org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2191)
21:27:32at 
org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
21:27:32at 
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:183)
21:27:32at 
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:163)
21:27:32 
21:27:32at 
org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1272)
21:27:32at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
21:27:32at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
21:27:32at 
org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.splitRegion(AdminProtos.java:23173)
21:27:32at 
org.apache.hadoop.hbase.protobuf.ProtobufUtil.split(ProtobufUtil.java:1908)
21:27:32... 6 more
{noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (IMPALA-4641) Loading tpch nested test data to a remote cluster silently fails

2017-12-20 Thread David Knupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp closed IMPALA-4641.
---
Resolution: Cannot Reproduce

> Loading tpch nested test data to a remote cluster silently fails
> 
>
> Key: IMPALA-4641
> URL: https://issues.apache.org/jira/browse/IMPALA-4641
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 2.7.0
>Reporter: David Knupp
>Priority: Critical
>  Labels: test-infra
>
> Running the Impala data load scripts doesn't always produce the same results 
> on a remote cluster as on the local mini-cluster. In this case, 
> {{tpch_nested_parquet}} data is never loaded.
> {noformat}
> [impala-debian78-test-cluster-4.vpc.cloudera.com:21000] > show table stats 
> tpch_nested_parquet.supplier;
> Query: show table stats tpch_nested_parquet.supplier
> +---++--+--+---+-+---+-+
> | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format  | 
> Incremental stats | Location  
>   |
> +---++--+--+---+-+---+-+
> | 0 | 1  | 356B | NOT CACHED   | NOT CACHED| PARQUET | false  
>| 
> hdfs://impala-debian78-test-cluster-1.vpc.cloudera.com:8020/user/hive/warehouse/tpch_nested_parquet.db/supplier
>  |
> +---++--+--+---+-+---+-+
> Fetched 1 row(s) in 0.01s
> {noformat}
> Compare this to the local minicluster, after running data load.
> {noformat}
> [localhost.localdomain:21000] > show table stats tpch_nested_parquet.supplier;
> Query: show table stats tpch_nested_parquet.supplier
> +---++-+--+---+-+---+---+
> | #Rows | #Files | Size| Bytes Cached | Cache Replication | Format  | 
> Incremental stats | Location  
> |
> +---++-+--+---+-+---+---+
> | 1 | 1  | 43.00MB | NOT CACHED   | NOT CACHED| PARQUET | 
> false | 
> hdfs://localhost:20500/test-warehouse/tpch_nested_parquet.db/supplier |
> +---++-+--+---+-+---+---+
> Fetched 1 row(s) in 4.90s
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IMPALA-6341) Cascade of test failures during ASAN build, probably related to Rpc exception: N6apache6thrift9transport19TTransportExceptionE

2017-12-18 Thread David Knupp (JIRA)
David Knupp created IMPALA-6341:
---

 Summary: Cascade of test failures during ASAN build, probably 
related to Rpc exception: N6apache6thrift9transport19TTransportExceptionE 
 Key: IMPALA-6341
 URL: https://issues.apache.org/jira/browse/IMPALA-6341
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 2.12.0
Reporter: David Knupp
Priority: Critical


Several build failures were seen during a recent ASAN test run, probably 
stemming from initial error:
{noformat}
query_test/test_queries.py:62: in test_analytic_fns
self.run_test_case('QueryTest/analytic-fns', vector)
common/impala_test_suite.py:395: in run_test_case
result = self.__execute_query(target_impalad_client, query, user=user)
common/impala_test_suite.py:610: in __execute_query
return impalad_client.execute(query, user=user)
common/impala_connection.py:160: in execute
return self.__beeswax_client.execute(sql_stmt, user=user)
beeswax/impala_beeswax.py:173: in execute
handle = self.__execute_query(query_string.strip(), user=user)
beeswax/impala_beeswax.py:339: in __execute_query
handle = self.execute_query_async(query_string, user=user)
beeswax/impala_beeswax.py:335: in execute_query_async
return self.__do_rpc(lambda: self.imp_service.query(query,))
beeswax/impala_beeswax.py:460: in __do_rpc
raise ImpalaBeeswaxException(self.__build_error_message(b), b)
E   ImpalaBeeswaxException: ImpalaBeeswaxException:
EINNER EXCEPTION: 
EMESSAGE: ExecQueryFInstances rpc 
query_id=834130b14e576a27:6b6389e0 failed: RPC Error: Client for 
ec2-m2-4xlarge-centos-6-4-1d24.vpc.cloudera.com:22002 hit an unexpected 
exception: ECONNRESET, type: N6apache6thrift9transport19TTransportExceptionE, 
rpc: N6impala26TExecQueryFInstancesResultE, send: done
{noformat}

Possibly the same as IMPALA-5692 or IMPALA-5999? Not sure because there was no 
minidump this time.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IMPALA-6338) test_profile_fragment_instances failing on Isilon build

2017-12-18 Thread David Knupp (JIRA)
David Knupp created IMPALA-6338:
---

 Summary: test_profile_fragment_instances failing on Isilon build
 Key: IMPALA-6338
 URL: https://issues.apache.org/jira/browse/IMPALA-6338
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 2.12.0
Reporter: David Knupp
Priority: Critical


Stack trace:
{noformat}
query_test/test_observability.py:123: in test_profile_fragment_instances
assert results.runtime_profile.count("HDFS_SCAN_NODE") == 12
E   assert 11 == 12
E+  where 11 = ('HDFS_SCAN_NODE')
E+where  = 'Query 
(id=ae4cee91aafc5c6c:11b545c6):\n  DEBUG MODE WARNING: Query profile 
created while running a DEBUG buil...ontextSwitches: 0 (0)\n   - 
TotalRawHdfsReadTime(*): 5s784ms\n   - TotalReadThroughput: 17.33 
MB/sec\n'.count
E+  where 'Query (id=ae4cee91aafc5c6c:11b545c6):\n  DEBUG MODE 
WARNING: Query profile created while running a DEBUG buil...ontextSwitches: 0 
(0)\n   - TotalRawHdfsReadTime(*): 5s784ms\n   - 
TotalReadThroughput: 17.33 MB/sec\n' = 
.runtime_profile
{noformat}

Query:
{noformat}
with l as (select * from tpch.lineitem UNION ALL select * from tpch.lineitem)
select STRAIGHT_JOIN count(*) from (select * from tpch.lineitem a LIMIT 
1) a
join (select * from l LIMIT 200) b on a.l_orderkey = -b.l_orderkey;
{noformat}

Summary:
{noformat}
Operator #Hosts  Avg Time  Max Time  #Rows  Est. #Rows   Peak Mem  
Est. Peak Mem  Detail

05:AGGREGATE  1   0.000ns   0.000ns  1   1   28.00 KB   
10.00 MB  FINALIZE  
04:HASH JOIN  1  15.000ms  15.000ms  0   1  141.06 MB   
17.00 MB  INNER JOIN, BROADCAST 
|--08:EXCHANGE1   4s153ms   4s153ms  2.00M   2.00M  0   
   0  UNPARTITIONED 
|  07:EXCHANGE1   3s783ms   3s783ms  2.00M   2.00M  0   
   0  UNPARTITIONED 
|  01:UNION   3  17.000ms  28.001ms  3.03M   2.00M  0   
   0
|  |--03:SCAN HDFS3   0.000ns   0.000ns  0   6.00M  0   
   176.00 MB  tpch.lineitem 
|  02:SCAN HDFS   3   6s133ms   6s948ms  3.03M   6.00M   24.02 MB   
   176.00 MB  tpch.lineitem 
06:EXCHANGE   1   5s655ms   5s655ms  1   1  0   
   0  UNPARTITIONED 
00:SCAN HDFS  3   4s077ms   6s207ms  2   1   16.05 MB   
   176.00 MB  tpch.lineitem a   
{noformat}

Plan:
{noformat}

Max Per-Host Resource Reservation: Memory=17.00MB
Per-Host Resource Estimates: Memory=379.00MB

F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=27.00MB mem-reservation=17.00MB
PLAN-ROOT SINK
|  mem-estimate=0B mem-reservation=0B
|
05:AGGREGATE [FINALIZE]
|  output: count(*)
|  mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB
|  tuple-ids=7 row-size=8B cardinality=1
|
04:HASH JOIN [INNER JOIN, BROADCAST]
|  hash predicates: a.l_orderkey = -1 * l_orderkey
|  fk/pk conjuncts: assumed fk/pk
|  runtime filters: RF000[bloom] <- -1 * l_orderkey
|  mem-estimate=17.00MB mem-reservation=17.00MB spill-buffer=1.00MB
|  tuple-ids=0,4 row-size=16B cardinality=1
|
|--08:EXCHANGE [UNPARTITIONED]
|  |  mem-estimate=0B mem-reservation=0B
|  |  tuple-ids=4 row-size=8B cardinality=200
|  |
|  F05:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=0B mem-reservation=0B
|  07:EXCHANGE [UNPARTITIONED]
|  |  limit: 200
|  |  mem-estimate=0B mem-reservation=0B
|  |  tuple-ids=4 row-size=8B cardinality=200
|  |
|  F04:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
|  Per-Host Resources: mem-estimate=176.00MB mem-reservation=0B
|  01:UNION
|  |  pass-through-operands: all
|  |  limit: 200
|  |  mem-estimate=0B mem-reservation=0B
|  |  tuple-ids=4 row-size=8B cardinality=200
|  |
|  |--03:SCAN HDFS [tpch.lineitem, RANDOM]
|  | partitions=1/1 files=1 size=718.94MB
|  | stored statistics:
|  |   table: rows=6001215 size=718.94MB
|  |   columns: all
|  | extrapolated-rows=disabled
|  | mem-estimate=176.00MB mem-reservation=0B
|  | tuple-ids=3 row-size=8B cardinality=6001215
|  |
|  02:SCAN HDFS [tpch.lineitem, RANDOM]
| partitions=1/1 files=1 size=718.94MB
| stored statistics:
|   table: rows=6001215 size=718.94MB
|   columns: all
| extrapolated-rows=disabled
| mem-estimate=176.00MB mem-reservation=0B
| tuple-ids=2 row-size=8B cardinality=6001215
|
06:EXCHANGE [UNPARTITIONED]
|  limit: 1
|  mem-estimate=0B mem-reservation=0B
|  tuple-ids=0 row-size=8B cardinality=1
|
F00:P

[jira] [Created] (IMPALA-6334) test_compute_stats_tablesample failing on Isilon builds

2017-12-15 Thread David Knupp (JIRA)
David Knupp created IMPALA-6334:
---

 Summary: test_compute_stats_tablesample failing on Isilon builds
 Key: IMPALA-6334
 URL: https://issues.apache.org/jira/browse/IMPALA-6334
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 2.12.0
Reporter: David Knupp
Priority: Critical


MainThread: Comparing QueryTestResults (expected vs actual):
Expected:
{noformat}
3660,3660,12,regex:.*B,'NOT CACHED','NOT 
CACHED','TEXT','false','hdfs://10.17.95.12:8020/test-warehouse/test_compute_stats_tablesample_16dd5daf.db/alltypesnopart'
{noformat}
Actual:
{noformat}
3661,3661,12,'238.68KB','NOT CACHED','NOT 
CACHED','TEXT','false','hdfs://10.17.95.12:8020/test-warehouse/test_compute_stats_tablesample_16dd5daf.db/alltypesnopart'
{noformat}

Stacktrace
{noformat}
self = 
vector = 
unique_database = 'test_compute_stats_tablesample_16dd5daf'


@CustomClusterTestSuite.with_args(impalad_args=('--enable_stats_extrapolation=true'))
def test_compute_stats_tablesample(self, vector, unique_database):
  # Create a partitioned and unpartitioned text table. Use the existing 
files from
  # functional.alltypes as data because those have a known, stable file 
size. This
  # test is sensitive to changes in file sizes across test runs because the 
sampling
  # is file based. Creating test tables with INSERT does not guarantee that 
the same
  # file sample is selected across test runs, even with REPEATABLE.

  # Create partitioned test table. External to avoid dropping files from 
alltypes.
  part_test_tbl = unique_database + ".alltypes"
  self.client.execute(
"create external table %s like functional.alltypes" % part_test_tbl)
  alltypes_loc = self._get_table_location("functional.alltypes", vector)
  for m in xrange(1, 13):
part_loc = path.join(alltypes_loc, "year=2009/month=%s" % m)
self.client.execute(
  "alter table %s add partition (year=2009,month=%s) location '%s'"
  % (part_test_tbl, m, part_loc))

  # Create unpartitioned test table.
  nopart_test_tbl = unique_database + ".alltypesnopart"
  self.client.execute("drop table if exists %s" % nopart_test_tbl)
  self.client.execute(
"create table %s like functional.alltypesnopart" % nopart_test_tbl)
  nopart_test_tbl_loc = self._get_table_location(nopart_test_tbl, vector)
  # Remove NameNode prefix and first '/' because PyWebHdfs expects that
  if nopart_test_tbl_loc.startswith(NAMENODE):
nopart_test_tbl_loc = nopart_test_tbl_loc[len(NAMENODE)+1:]
  for m in xrange(1, 13):
src_part_loc = alltypes_loc + "/year=2009/month=%s" % m
# Remove NameNode prefix and first '/' because PyWebHdfs expects that
if src_part_loc.startswith(NAMENODE): src_part_loc = 
src_part_loc[len(NAMENODE)+1:]
file_names = self.filesystem_client.ls(src_part_loc)
for f in file_names:
  self.filesystem_client.copy(path.join(src_part_loc, f),
  path.join(nopart_test_tbl_loc, f))
  self.client.execute("refresh %s" % nopart_test_tbl)

> self.run_test_case('QueryTest/compute-stats-tablesample', vector, 
> unique_database)

custom_cluster/test_stats_extrapolation.py:84: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
common/impala_test_suite.py:424: in run_test_case
self.__verify_results_and_errors(vector, test_section, result, use_db)
common/impala_test_suite.py:297: in __verify_results_and_errors
replace_filenames_with_placeholder)
common/test_result_verifier.py:404: in verify_raw_results
VERIFIER_MAP[verifier](expected, actual)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

expected_results = 
actual_results = 

def verify_query_result_is_equal(expected_results, actual_results):
  assert_args_not_none(expected_results, actual_results)
> assert expected_results == actual_results
E assert Comparing QueryTestResults (expected vs actual):
E   3660,3660,12,regex:.*B,'NOT CACHED','NOT 
CACHED','TEXT','false','hdfs://10.17.95.12:8020/test-warehouse/test_compute_stats_tablesample_16dd5daf.db/alltypesnopart'
 != 3661,3661,12,'238.68KB','NOT CACHED','NOT 
CACHED','TEXT','false','hdfs://10.17.95.12:8020/test-warehouse/test_compute_stats_tablesample_16dd5daf.db/alltypesnopart'

common/test_result_verifier.py:231: AssertionError
{noformat}




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IMPALA-6317) Expose -cmake_only flag to buildall.sh

2017-12-12 Thread David Knupp (JIRA)
David Knupp created IMPALA-6317:
---

 Summary: Expose -cmake_only flag to buildall.sh
 Key: IMPALA-6317
 URL: https://issues.apache.org/jira/browse/IMPALA-6317
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Affects Versions: Impala 2.11.0
Reporter: David Knupp
Priority: Minor


Impala/bin/make_impala.sh has a {{-cmake_only}} command line option:
{noformat}
-cmake_only)
  CMAKE_ONLY=1
{noformat}

Passing this flag means that only makefiles will be generated during the build. 
However, this flag is not exposed by buildall.sh (the caller of 
make_impala.sh), which effectively renders it useless.

It turns out that if one has no intention of running the Impala cluster locally 
(e.g., when trying to build just enough of the toolchain and dev environment to 
run the data load scripts for loading data onto a remote cluster), then being 
able to generate only the makefiles is useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (IMPALA-6306) Impalad becomes unreachable trying to load tpch_nested_parquet data

2017-12-11 Thread David Knupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-6306.
-
Resolution: Not A Bug

> Impalad becomes unreachable trying to load tpch_nested_parquet data
> ---
>
> Key: IMPALA-6306
> URL: https://issues.apache.org/jira/browse/IMPALA-6306
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.11.0
>Reporter: David Knupp
>
> I've been trying (unsuccessfully) to create the tpch_nested_parquet database 
> on a remote cluster using Impala's standard data load scripts. I finally 
> confirmed that I can complete the data load process if I simply comment out 
> the lines from {{testdata/bin/create-load-data.sh}} that call load_nested.py:
> {noformat}
> # run-step "Loading nested data" load-nested.log \
> #   ${IMPALA_HOME}/testdata/bin/load_nested.py ${LOAD_NESTED_ARGS:-}
> {noformat}
> With all of the other data completely loaded, I tried to run 
> load_nested.py by hand, and saw this error:
> {noformat}
> systest@remote-joe:~$ Impala/testdata/bin/load_nested.py --cm-host 
> impala-dataload-testing-1.vpc.cloudera.com
> 2017-12-10 13:45:12,663 INFO:db_connection[234]:Creating database 
> tpch_nested_parquet
> 2017-12-10 13:45:12,965 INFO:load_nested[98]:Creating temp orders (chunk 1 of 
> 1)
> 2017-12-10 13:45:33,724 INFO:load_nested[128]:Creating temp customers (chunk 
> 1 of 1)
> Traceback (most recent call last):
>   File "Impala/testdata/bin/load_nested.py", line 320, in 
> load()
>   File "Impala/testdata/bin/load_nested.py", line 130, in load
> impala.execute("CREATE TABLE tmp_customer_string AS " + tmp_customer_sql)
>   File "/data1/systest/Impala/tests/comparison/db_connection.py", line 206, 
> in execute
> return self._cursor.execute(sql, *args, **kwargs)
>   File 
> "/data1/systest/Impala/infra/python/env/local/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 304, in execute
> self._wait_to_finish()  # make execute synchronous
>   File 
> "/data1/systest/Impala/infra/python/env/local/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 380, in _wait_to_finish
> raise OperationalError(resp.errorMessage)
> impala.error.OperationalError: Cancelled due to unreachable impalad(s): 
> impala-dataload-testing-2.vpc.cloudera.com:22000
> {noformat}
> From the impalad log:
> {noformat}
> I1210 13:45:12.356262 17040 Frontend.java:909] Compiling query: DESCRIBE 
> tpch_nested_parquet.part
> I1210 13:45:12.358700 17040 Frontend.java:948] Compiled query.
> I1210 13:45:12.358832 17040 jni-util.cc:211] 
> org.apache.impala.common.AnalysisException: Could not resolve path: 
> 'tpch_nested_parquet.part'
> at org.apache.impala.analysis.Analyzer.resolvePath(Analyzer.java:800)
> at org.apache.impala.analysis.Analyzer.resolvePath(Analyzer.java:753)
> at 
> org.apache.impala.analysis.DescribeTableStmt.analyze(DescribeTableStmt.java:106)
> at 
> org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:388)
> at 
> org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:369)
> at org.apache.impala.service.Frontend.analyzeStmt(Frontend.java:920)
> at 
> org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1069)
> at 
> org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:156)
> I1210 13:45:12.371624 17040 status.cc:125] AnalysisException: Could not 
> resolve path: 'tpch_nested_parquet.part'
> @   0x9597f9  impala::Status::Status()
> @   0xc9df62  impala::JniUtil::GetJniExceptionMsg()
> @   0xba2a7b  impala::Frontend::GetExecRequest()
> @   0xbc0558  impala::ImpalaServer::ExecuteInternal()
> @   0xbc6858  impala::ImpalaServer::Execute()
> @   0xc2244e  impala::ImpalaServer::ExecuteStatement()
> @  0x10a8326  
> apache::hive::service::cli::thrift::TCLIServiceProcessor::process_ExecuteStatement()
> @  0x10a1f44  
> apache::hive::service::cli::thrift::TCLIServiceProcessor::dispatchCall()
> @   0x929ecc  apache::thrift::TDispatchProcessor::process()
> @   0xafa43f  
> apache::thrift::server::TAcceptQueueServer::Task::run()
> @   0xaf4d35  impala::ThriftThread::RunRunnable()
> @   0xaf5b12  
> boost::detail::function::void_function_obj_invoker0<>::invoke()
> @   0xd10b63  impala::Thread::SuperviseThread()
> @   0xd112a4  boost::detail::thread_data<>::run()
> @  0x128afda  (unknown)
> @ 0x7f2a6a7a8e25  start_thread
> @ 0x7f2a6a4d634d  __clone
> I1210 13:45:12.371713 17040 impala-server.cc:992] UnregisterQuery(): 
> query_id=2748b77529da2004:7602cd6d
> I1210 13
