[jira] [Created] (IMPALA-9726) Update boilerplate in the PyPI sidebar for impala-shell supported versions
David Knupp created IMPALA-9726: --- Summary: Update boilerplate in the PyPI sidebar for impala-shell supported versions Key: IMPALA-9726 URL: https://issues.apache.org/jira/browse/IMPALA-9726 Project: IMPALA Issue Type: Sub-task Components: Clients Affects Versions: Impala 4.0 Reporter: David Knupp The following lines need to be updated to reflect that the shell now supports python 2.7+ and 3+. https://github.com/apache/impala/blob/master/shell/packaging/setup.py#L164-167 {noformat} 'Programming Language :: Python :: 2 :: Only', 'Programming Language :: Python :: 2.6', 'Programming Language :: Python :: 2.7', {noformat} Note that this has no effect on the actual installation. This line is what manages that, and its value is correct for both Impala 3.4.0 and Impala 4.0: https://github.com/apache/impala/blob/master/shell/packaging/setup.py#L138 -- This message was sent by Atlassian Jira (v8.3.4#803005)
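For reference, the fixed-up classifier stanza in setup.py would presumably read something like the following (a sketch only; the exact trove classifier strings chosen are an assumption, not the committed change):

```python
# Hypothetical replacement for shell/packaging/setup.py#L164-167 --
# reflects that the shell now supports python 2.7+ and 3+, rather than
# "2 :: Only" and 2.6.
classifiers = [
    'Programming Language :: Python :: 2.7',
    'Programming Language :: Python :: 3',
]
```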
[jira] [Resolved] (IMPALA-9719) Upgrade sasl-0.1.1 to 0.2.1 in Impala/shell/ext-py
[ https://issues.apache.org/jira/browse/IMPALA-9719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-9719. - Fix Version/s: Impala 4.0 Resolution: Fixed > Upgrade sasl-0.1.1 to 0.2.1 in Impala/shell/ext-py > -- > > Key: IMPALA-9719 > URL: https://issues.apache.org/jira/browse/IMPALA-9719 > Project: IMPALA > Issue Type: Sub-task > Components: Clients >Affects Versions: Impala 4.0 >Reporter: David Knupp >Priority: Major > Fix For: Impala 4.0 > > > Needed for python 3 compatibility. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IMPALA-9720) Upgrade bitarray from 0.9.0 to 1.2.1 in Impala/shell/ext-py
[ https://issues.apache.org/jira/browse/IMPALA-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-9720. - Fix Version/s: Impala 4.0 Resolution: Fixed > Upgrade bitarray from 0.9.0 to 1.2.1 in Impala/shell/ext-py > --- > > Key: IMPALA-9720 > URL: https://issues.apache.org/jira/browse/IMPALA-9720 > Project: IMPALA > Issue Type: Sub-task > Components: Clients >Affects Versions: Impala 4.0 >Reporter: David Knupp >Priority: Major > Fix For: Impala 4.0 > > > This is needed for python 3 compatibility. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IMPALA-9718) Remove pkg_resources.py from Impala/shell
[ https://issues.apache.org/jira/browse/IMPALA-9718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-9718. - Fix Version/s: Impala 4.0 Resolution: Fixed > Remove pkg_resources.py from Impala/shell > - > > Key: IMPALA-9718 > URL: https://issues.apache.org/jira/browse/IMPALA-9718 > Project: IMPALA > Issue Type: Sub-task > Components: Clients >Affects Versions: Impala 4.0 >Reporter: David Knupp >Priority: Major > Fix For: Impala 4.0 > > > pkg_resources is available in the stdlib. There should be no need to bundle > it with the shell. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IMPALA-9721) Fix python 3 compatibility regression in impala-shell
[ https://issues.apache.org/jira/browse/IMPALA-9721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-9721. - Resolution: Fixed > Fix python 3 compatibility regression in impala-shell > - > > Key: IMPALA-9721 > URL: https://issues.apache.org/jira/browse/IMPALA-9721 > Project: IMPALA > Issue Type: Bug > Components: Clients >Reporter: David Knupp >Assignee: David Knupp >Priority: Major > > The fix for IMPALA-9398 introduced a small regression with regard to python 3 > compatibility. We don't yet have python 3 tests to catch regressions of this > type, and it was missed in code review. > The regression happens in two places. An example is: > https://github.com/apache/impala/blob/master/shell/impala_shell.py#L248 > The syntax for catching exceptions has changed in python 3 to require the > "as" keyword. > {noformat} > try: > do_stuff() > except Exception as e: > panic() > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
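A minimal demonstration of the portable form described above (a sketch; `do_stuff` is the placeholder name used in the ticket, not a real shell function):

```python
# The "except ... as e" form is required by python 3 and is also valid on
# python 2.6+, so it is the single-source-compatible way to bind the
# exception object. "except ValueError, e:" is a SyntaxError on python 3.
def do_stuff():
    raise ValueError("boom")

try:
    do_stuff()
except ValueError as e:
    message = str(e)
```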
[jira] [Created] (IMPALA-9724) Setup python3 compile or else 'pylint --py3k' checking on jenkins.impala.io
David Knupp created IMPALA-9724: --- Summary: Setup python3 compile or else 'pylint --py3k' checking on jenkins.impala.io Key: IMPALA-9724 URL: https://issues.apache.org/jira/browse/IMPALA-9724 Project: IMPALA Issue Type: Sub-task Affects Versions: Impala 4.0 Reporter: David Knupp Until we get python3 testing integrated into the actual mini-cluster stack, we should be able to add a {{pylint --py3k}} check to the upstream build pipeline, similar to how we used to check for python 2.6 compatibility. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IMPALA-9648) Exclude or update netty jar
[ https://issues.apache.org/jira/browse/IMPALA-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-9648. - Resolution: Fixed > Exclude or update netty jar > --- > > Key: IMPALA-9648 > URL: https://issues.apache.org/jira/browse/IMPALA-9648 > Project: IMPALA > Issue Type: Task >Reporter: Abhishek Rawat >Assignee: David Knupp >Priority: Major > > Add exclusion for netty if not being used. Or update it to version 4.1.44 or > later. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IMPALA-9721) Fix python 3 compatibility regression
David Knupp created IMPALA-9721: --- Summary: Fix python 3 compatibility regression Key: IMPALA-9721 URL: https://issues.apache.org/jira/browse/IMPALA-9721 Project: IMPALA Issue Type: Bug Components: Clients Reporter: David Knupp Assignee: David Knupp The fix for IMPALA-9398 introduced a small regression with regard to python 3 compatibility. We don't yet have python 3 tests to catch regressions of this type, and it was missed in code review. The regression happens in two places. An example is: https://github.com/apache/impala/blob/master/shell/impala_shell.py#L248 The syntax for catching exceptions has changed in python 3 to require the "as" keyword. {noformat} try: do_stuff() except Exception as e: panic() {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IMPALA-9649) Exclude shiro-crypto-core and shiro-core jars from maven download
[ https://issues.apache.org/jira/browse/IMPALA-9649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-9649. - Resolution: Fixed > Exclude shiro-crypto-core and shiro-core jars from maven download > - > > Key: IMPALA-9649 > URL: https://issues.apache.org/jira/browse/IMPALA-9649 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 4.0 >Reporter: David Knupp >Assignee: David Knupp >Priority: Major > Fix For: Impala 4.0 > > > These jars have known security vulnerabilities. They are included as part of > Sentry, and are not used by Impala directly. > There's currently a plan to remove Sentry altogether, but since that will > require non-trivial effort, let's exclude these items from the maven > download until then. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IMPALA-9720) Upgrade bitarray from 0.9.0 to 1.2.1 in Impala/shell/ext-py
David Knupp created IMPALA-9720: --- Summary: Upgrade bitarray from 0.9.0 to 1.2.1 in Impala/shell/ext-py Key: IMPALA-9720 URL: https://issues.apache.org/jira/browse/IMPALA-9720 Project: IMPALA Issue Type: Sub-task Components: Clients Affects Versions: Impala 4.0 Reporter: David Knupp This is needed for python 3 compatibility. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IMPALA-9719) Upgrade sasl-0.1.1 to 0.2.1 in Impala/shell/ext-py
David Knupp created IMPALA-9719: --- Summary: Upgrade sasl-0.1.1 to 0.2.1 in Impala/shell/ext-py Key: IMPALA-9719 URL: https://issues.apache.org/jira/browse/IMPALA-9719 Project: IMPALA Issue Type: Sub-task Components: Clients Affects Versions: Impala 4.0 Reporter: David Knupp Needed for python 3 compatibility. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IMPALA-9718) Remove pkg_resources.py from Impala/shell
David Knupp created IMPALA-9718: --- Summary: Remove pkg_resources.py from Impala/shell Key: IMPALA-9718 URL: https://issues.apache.org/jira/browse/IMPALA-9718 Project: IMPALA Issue Type: Sub-task Components: Clients Affects Versions: Impala 4.0 Reporter: David Knupp pkg_resources is available in the stdlib. There should be no need to bundle it with the shell. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IMPALA-8608) Impala needs to support Python 3
[ https://issues.apache.org/jira/browse/IMPALA-8608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-8608. - Fix Version/s: Not Applicable Resolution: Duplicate > Impala needs to support Python 3 > > > Key: IMPALA-8608 > URL: https://issues.apache.org/jira/browse/IMPALA-8608 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.3.0 >Reporter: Lars Volker >Priority: Critical > Fix For: Not Applicable > > > [The End of Python 2.7|https://pythonclock.org/] support is getting closer > and as such we need to be able to move to Python 3. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IMPALA-9648) Exclude or update netty jar
[ https://issues.apache.org/jira/browse/IMPALA-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-9648. - Target Version: Impala 4.0 Resolution: Fixed > Exclude or update netty jar > --- > > Key: IMPALA-9648 > URL: https://issues.apache.org/jira/browse/IMPALA-9648 > Project: IMPALA > Issue Type: Task >Reporter: Abhishek Rawat >Assignee: David Knupp >Priority: Major > > Add exclusion for netty if not being used. Or update it to version 4.1.44 or > later. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IMPALA-9647) Exclude or update fluent-hc jar
[ https://issues.apache.org/jira/browse/IMPALA-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-9647. - Fix Version/s: Impala 4.0 Resolution: Fixed > Exclude or update fluent-hc jar > --- > > Key: IMPALA-9647 > URL: https://issues.apache.org/jira/browse/IMPALA-9647 > Project: IMPALA > Issue Type: Task >Reporter: Abhishek Rawat >Assignee: David Knupp >Priority: Blocker > Fix For: Impala 4.0 > > > Add exclusion for fluent-hc-4.3.2.jar or upgrade it to 4.3.6 or later version. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IMPALA-9489) Setup impala-shell.sh env separately, and use thrift-0.11.0 by default
[ https://issues.apache.org/jira/browse/IMPALA-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-9489. - Fix Version/s: Impala 4.0 Resolution: Fixed > Setup impala-shell.sh env separately, and use thrift-0.11.0 by default > -- > > Key: IMPALA-9489 > URL: https://issues.apache.org/jira/browse/IMPALA-9489 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Affects Versions: Impala 3.4.0 >Reporter: David Knupp >Assignee: David Knupp >Priority: Major > Fix For: Impala 4.0 > > > [Note: this JIRA was filed in relation to the ongoing effort to make the > impala-shell compatible with python 3] > The impala python development environment is a fairly convoluted affair -- a > number of packages are installed in the infra/python/env, some of it comes > from the toolchain, some of it is generated and lives in the shell directory. > Generally speaking, if you launch impala-python and import a module, it's not > necessarily easy to predict where the module might live. > {noformat} > $ python > Python 2.7.10 (default, Aug 17 2018, 19:45:58) > [GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.0.42)] on darwin > Type "help", "copyright", "credits" or "license" for more information. 
> >>> import sasl > >>> sasl > <module 'sasl' from '/home/systest/Impala/shell/ext-py/sasl-0.1.1/dist/sasl-0.1.1-py2.7-linux-x86_64.egg/sasl/__init__.pyc'> > >>> import requests > >>> requests > <module 'requests' from '/home/systest/Impala/infra/python/env/local/lib/python2.7/site-packages/requests/__init__.pyc'> > >>> import Logging > >>> Logging > <module 'Logging' from '/home/systest/Impala/shell/gen-py/Logging/__init__.pyc'> > >>> import thrift > >>> thrift > <module 'thrift' from '/home/systest/Impala/toolchain/thrift-0.9.3-p7/python/lib/python2.7/site-packages/thrift/__init__.pyc'> > {noformat} > Really, there is no one coherent environment -- there's just whatever > collection of modules happens to be available at a given time for a given > type of invocation, all of which is accomplished behind the scenes by calling > scripts like {{bin/set-pythonpath.sh}} and {{bin/impala-python-common.sh}} > that are responsible for cobbling together a PYTHONPATH based on known > locations and current env variables. > As far as I can tell, there are three important contexts where python comes > into play... > * during the build process (used during data load, e.g., > testdata/bin/load_nested.py) > * when running the py.test based e2e tests > * whenever the impala-shell is invoked > As noted by IMPALA-7825 (and also in a conversation I had with > [~stakiar_impala_496e]), we're dependent on thrift 0.9.3 during the build > process. This seems to come into play during the loading of test data > (specifically, when calling testdata/bin/load_nested.py) mainly because at > one point there was some well-intentioned but probably misguided attempt at > code reuse from the test framework. The test code that gets re-used involves > impyla and/or thrift-sasl, which currently still relies on thrift 0.9.3. So > our test framework, and by extension the build, both inherit the same > limitation. > The impala-shell, on the other hand, luckily doesn't directly reuse any of > the same test modules, and there really is no need to keep it pinned to > 0.9.3. 
However, since calling impala-shell.sh winds up invoking > {{set-pythonpath.sh}}, the same script that sets up the environment > during building and testing, thrift 0.9.3 just kind of leaks over by default. > As it turns out, thrift 0.9.3 is also one of the many limitations restricting > the impala-shell to python 2. Luckily, with IMPALA-7924 resolved, > thrift-0.11.0 is available -- we just have to use it. And the way to > accomplish that is by decoupling the impala-shell from relying on either > {{set-pythonpath.sh}} or {{impala-python-common.sh}}. > As a first pass, we can address the dev environment by just having > {{impala-shell.sh}} itself do whatever is required to find python > dependencies, and we can specify thrift-0.11.0 there. Also, thrift 0.11.0 > should be used by both of the scripts used to create the tarballs that > package the impala-shell for customer environments. Neither of these changes > should adversely affect building Impala or running the py.test test > framework. -- This message was sent by Atlassian Jira (v8.3.4#803005)
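The transcript's "where did this module actually come from" check can be reproduced in a couple of lines (a stdlib-only sketch; the module probed here is illustrative, not one of the Impala-specific ones):

```python
# Show where an imported module was actually loaded from -- the quickest
# way to untangle a cobbled-together PYTHONPATH like the one described above.
import importlib

def module_origin(name):
    """Return the file path a module was loaded from, if it has one."""
    mod = importlib.import_module(name)
    return getattr(mod, "__file__", "<built-in>")

print(module_origin("json"))  # e.g. .../lib/python3.x/json/__init__.py
```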
[jira] [Resolved] (IMPALA-9501) Upgrade sqlparse to a version that supports python 3.0
[ https://issues.apache.org/jira/browse/IMPALA-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-9501. - Fix Version/s: Impala 4.0 Resolution: Fixed > Upgrade sqlparse to a version that supports python 3.0 > -- > > Key: IMPALA-9501 > URL: https://issues.apache.org/jira/browse/IMPALA-9501 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Reporter: David Knupp >Assignee: David Knupp >Priority: Major > Fix For: Impala 4.0 > > > The current version (0.1.19) was selected, per IMPALA-6999, because it's the > last version to be compatible with python 2.6. However, it's not compatible > with python 3.x. > {noformat} > Traceback (most recent call last): > File > "/home/dknupp/Impala/shell/build/impala-shell-3.4.0-SNAPSHOT/impala_shell.py", > line 37, in <module> > import sqlparse > File "<frozen importlib._bootstrap>", line 983, in _find_and_load > File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked > File "<frozen importlib._bootstrap>", line 668, in _load_unlocked > File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible > File > "/home/dknupp/Impala/shell/build/impala-shell-3.4.0-SNAPSHOT/ext-py/sqlparse-0.1.19-py2.7.egg/sqlparse/__init__.py", > line 13, in <module> > File "<frozen importlib._bootstrap>", line 983, in _find_and_load > File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked > File "<frozen importlib._bootstrap>", line 668, in _load_unlocked > File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible > File > "/home/dknupp/Impala/shell/build/impala-shell-3.4.0-SNAPSHOT/ext-py/sqlparse-0.1.19-py2.7.egg/sqlparse/engine/__init__.py", > line 8, in <module> > File "<frozen importlib._bootstrap>", line 983, in _find_and_load > File "<frozen importlib._bootstrap>", line 963, in _find_and_load_unlocked > File "<frozen importlib._bootstrap>", line 906, in _find_spec > File "<frozen importlib._bootstrap_external>", line 1280, in find_spec > File "<frozen importlib._bootstrap_external>", line 1254, in _get_spec > File "<frozen importlib._bootstrap_external>", line 1235, in > _legacy_get_spec > File "<frozen importlib._bootstrap>", line 441, in spec_from_loader > File "<frozen importlib._bootstrap_external>", line 594, in > spec_from_file_location > File > "/home/dknupp/Impala/shell/build/impala-shell-3.4.0-SNAPSHOT/ext-py/sqlparse-0.1.19-py2.7.egg/sqlparse/lexer.py", > line 84 > except Exception, err: > ^ > SyntaxError: invalid syntax > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IMPALA-9362) Update sqlparse used by impala-shell from version 0.1.19 to latest
[ https://issues.apache.org/jira/browse/IMPALA-9362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-9362. - Fix Version/s: Impala 4.0 Resolution: Fixed > Update sqlparse used by impala-shell from version 0.1.19 to latest > -- > > Key: IMPALA-9362 > URL: https://issues.apache.org/jira/browse/IMPALA-9362 > Project: IMPALA > Issue Type: Improvement > Components: Clients >Affects Versions: Impala 3.4.0 >Reporter: David Knupp >Assignee: David Knupp >Priority: Major > Fix For: Impala 4.0 > > > The fix for IMPALA-6337 involved correcting the way that sqlparse, an > upstream 3rd party python library used by the impala-shell, parses queries > that contain line breaks embedded inside of double quotes. Initially, > Impala's internally bundled version of sqlparse (based on 0.1.19) was > patched; meanwhile, a pull request to get the fix into an official release > was submitted. > That pull request was finally included in the 0.3.0 release of sqlparse. > However, there were other changes to the library in the interim, to its > APIs and also to some of the parsing logic, that break the impala-shell in > other ways, so simply migrating to the newer release is not straightforward. > We need to find and fix all the places where the newer sqlparse breaks the > impala-shell, so that we can stop relying on sqlparse 0.1.19 (which, in some > places, is not python 3 compatible). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IMPALA-9649) Exclude shiro-crypto-core and shiro-core jars from maven download
David Knupp created IMPALA-9649: --- Summary: Exclude shiro-crypto-core and shiro-core jars from maven download Key: IMPALA-9649 URL: https://issues.apache.org/jira/browse/IMPALA-9649 Project: IMPALA Issue Type: Bug Components: Frontend Affects Versions: Impala 4.0 Reporter: David Knupp Fix For: Impala 4.0 These jars have known security vulnerabilities. They are included as part of Sentry, and are not used by Impala directly. There's currently a plan to remove Sentry altogether, but since that will require non-trivial effort, let's exclude these items from the maven download until then. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IMPALA-9582) Update thrift_sasl 0.4.1 --> 0.4.2 for impala-shell
[ https://issues.apache.org/jira/browse/IMPALA-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-9582. - Resolution: Fixed > Update thrift_sasl 0.4.1 --> 0.4.2 for impala-shell > -- > > Key: IMPALA-9582 > URL: https://issues.apache.org/jira/browse/IMPALA-9582 > Project: IMPALA > Issue Type: Bug > Components: Clients >Affects Versions: Impala 3.4.0 >Reporter: David Knupp >Assignee: David Knupp >Priority: Blocker > Fix For: Impala 3.4.0 > > > thrift_sasl 0.4.1 introduced a regression whereby the Thrift transport was > not reading all data, causing clients to hang. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IMPALA-9582) Update thrift_sasl 0.4.1 --> 0.4.2 for impala-shell
David Knupp created IMPALA-9582: --- Summary: Update thrift_sasl 0.4.1 --> 0.4.2 for impala-shell Key: IMPALA-9582 URL: https://issues.apache.org/jira/browse/IMPALA-9582 Project: IMPALA Issue Type: Improvement Components: Clients Affects Versions: Impala 3.4.0 Reporter: David Knupp Fix For: Impala 3.4.0 thrift_sasl 0.4.1 introduced a regression whereby the Thrift transport was not reading all data, causing clients to hang. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IMPALA-9501) Upgrade sqlparse to a version that supports python 3.0
David Knupp created IMPALA-9501: --- Summary: Upgrade sqlparse to a version that supports python 3.0 Key: IMPALA-9501 URL: https://issues.apache.org/jira/browse/IMPALA-9501 Project: IMPALA Issue Type: Improvement Components: Infrastructure Reporter: David Knupp The current version (0.1.19) was selected, per IMPALA-6999, because it's the last version to be compatible with python 2.6. However, it's not compatible with python 3.x. {noformat} Traceback (most recent call last): File "/home/dknupp/Impala/shell/build/impala-shell-3.4.0-SNAPSHOT/impala_shell.py", line 37, in <module> import sqlparse File "<frozen importlib._bootstrap>", line 983, in _find_and_load File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked File "<frozen importlib._bootstrap>", line 668, in _load_unlocked File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible File "/home/dknupp/Impala/shell/build/impala-shell-3.4.0-SNAPSHOT/ext-py/sqlparse-0.1.19-py2.7.egg/sqlparse/__init__.py", line 13, in <module> File "<frozen importlib._bootstrap>", line 983, in _find_and_load File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked File "<frozen importlib._bootstrap>", line 668, in _load_unlocked File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible File "/home/dknupp/Impala/shell/build/impala-shell-3.4.0-SNAPSHOT/ext-py/sqlparse-0.1.19-py2.7.egg/sqlparse/engine/__init__.py", line 8, in <module> File "<frozen importlib._bootstrap>", line 983, in _find_and_load File "<frozen importlib._bootstrap>", line 963, in _find_and_load_unlocked File "<frozen importlib._bootstrap>", line 906, in _find_spec File "<frozen importlib._bootstrap_external>", line 1280, in find_spec File "<frozen importlib._bootstrap_external>", line 1254, in _get_spec File "<frozen importlib._bootstrap_external>", line 1235, in _legacy_get_spec File "<frozen importlib._bootstrap>", line 441, in spec_from_loader File "<frozen importlib._bootstrap_external>", line 594, in spec_from_file_location File "/home/dknupp/Impala/shell/build/impala-shell-3.4.0-SNAPSHOT/ext-py/sqlparse-0.1.19-py2.7.egg/sqlparse/lexer.py", line 84 except Exception, err: ^ SyntaxError: invalid syntax {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IMPALA-9489) Have impala-python env and impala-shell use available thrift-0.11.0 files
David Knupp created IMPALA-9489: --- Summary: Have impala-python env and impala-shell use available thrift-0.11.0 files Key: IMPALA-9489 URL: https://issues.apache.org/jira/browse/IMPALA-9489 Project: IMPALA Issue Type: Improvement Components: Infrastructure Affects Versions: Impala 3.4.0 Reporter: David Knupp Assignee: David Knupp Apparently, we can't simply kick thrift-0.9.3 to the curb. Our build process needs some attention before that can happen, per IMPALA-7825 and also a conversation I had with [~stakiar_impala_496e]. However, with IMPALA-7924 resolved, we do have access to thrift-0.11.0 python files, and we should use those by default. It turns out that being stuck with thrift-0.9.3 is a major impediment to achieving python 3 compatibility for our python stack. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IMPALA-9424) Add six python library to shell/ext-py
[ https://issues.apache.org/jira/browse/IMPALA-9424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-9424. - Resolution: Fixed > Add six python library to shell/ext-py > -- > > Key: IMPALA-9424 > URL: https://issues.apache.org/jira/browse/IMPALA-9424 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Affects Versions: Impala 3.4.0 >Reporter: David Knupp >Priority: Major > > A couple of impala-shell changes that are coming in the near future > (thrift_sasl update, possible changes to THttpClient, python 3 support) will > require the six python library. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IMPALA-9424) Add six python library to shell/ext-py
David Knupp created IMPALA-9424: --- Summary: Add six python library to shell/ext-py Key: IMPALA-9424 URL: https://issues.apache.org/jira/browse/IMPALA-9424 Project: IMPALA Issue Type: Improvement Components: Infrastructure Affects Versions: Impala 3.4.0 Reporter: David Knupp A couple of impala-shell changes that are coming in the near future (thrift_sasl update, possible changes to THttpClient, python 3 support) will require the six python library. -- This message was sent by Atlassian Jira (v8.3.4#803005)
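For context, this is the sort of interpreter-version branching that six centralizes (a stdlib-only sketch, since the actual shell call sites aren't quoted here; with six one would simply write {{isinstance(x, six.string_types)}}):

```python
# Without six, py2/py3-compatible code has to branch on the interpreter
# version just to name "all string types"; six.string_types hides exactly
# this check behind one constant.
import sys

if sys.version_info[0] >= 3:
    string_types = (str,)         # py3: bytes is deliberately excluded
else:
    string_types = (basestring,)  # py2 only; this branch never runs on py3

assert isinstance(u"impala", string_types)
```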
[jira] [Created] (IMPALA-9362) Update sqlparse used by impala-shell from version 0.1.19 to latest
David Knupp created IMPALA-9362: --- Summary: Update sqlparse used by impala-shell from version 0.1.19 to latest Key: IMPALA-9362 URL: https://issues.apache.org/jira/browse/IMPALA-9362 Project: IMPALA Issue Type: Improvement Components: Clients Affects Versions: Impala 3.4.0 Reporter: David Knupp The fix for IMPALA-6337 involved correcting the way that sqlparse, an upstream 3rd party python library used by the impala-shell, parses queries that contain line breaks embedded inside of double quotes. Initially, Impala's internally bundled version of sqlparse (based on 0.1.19) was patched; meanwhile, a pull request to get the fix into an official release was submitted. That pull-request was finally included in the 0.3.0 version of sqlparse. However, there were other changes to the library in the interim, in terms of API's and also in some of the parsing logic, that breaks the impala-shell in other ways, so simply migrating to the newer release is not straightforward. We need to find and fix all the places that the newer sqlparse breaks the impala-shell, so that we can stop relying on sqlparse 0.1.19 (which, in some places, is not python 3 compatible). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IMPALA-9157) TestAuthorizationProvider.test_invalid_provider_flag fails due to Python 2.6 incompatible code
[ https://issues.apache.org/jira/browse/IMPALA-9157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-9157. - Fix Version/s: Impala 3.4.0 Resolution: Fixed > TestAuthorizationProvider.test_invalid_provider_flag fails due to Python 2.6 > incompatible code > -- > > Key: IMPALA-9157 > URL: https://issues.apache.org/jira/browse/IMPALA-9157 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.4.0 >Reporter: Joe McDonnell >Assignee: David Knupp >Priority: Blocker > Labels: broken-build > Fix For: Impala 3.4.0 > > > Our Centos 6 builds use Python 2.6, which means that it doesn't have > check_output (added in Python 2.7). This causes test failures in > test_provider.py: > > {noformat} > authorization/test_provider.py:70: in setup_method > self.pre_test_cores = set([f for f in possible_cores if is_core_dump(f)]) > ../lib/python/impala_py_lib/helpers.py:64: in is_core_dump > file_std_out = exec_local_command("file %s" % file_path) > ../lib/python/impala_py_lib/helpers.py:34: in exec_local_command > return subprocess.check_output(cmd.split()) > E AttributeError: 'module' object has no attribute 'check_output'{noformat} > This comes from the new code to handle intentional core dumps: > > [https://github.com/apache/impala/blob/master/lib/python/impala_py_lib/helpers.py#L34] > {noformat} > def exec_local_command(cmd): > """ Executes a command for the local bash shell and return stdout as a > string. > Args: > cmd: command as a string > Return: > STDOUT > """ > return subprocess.check_output(cmd.split()){noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
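On Python 2.6 the usual workaround is a small Popen-based stand-in for check_output (a sketch of the common backport pattern; the function name is hypothetical and this is not necessarily the fix that was committed):

```python
# Python 2.6-compatible equivalent of subprocess.check_output(): run the
# command, capture its stdout, and raise on a non-zero exit code.
import subprocess

def check_output_compat(cmd):
    """Execute cmd (a list of args) and return its stdout as bytes."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)
    return out

print(check_output_compat(["echo", "hello"]))
```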
[jira] [Resolved] (IMPALA-9129) Provide a way for negative tests to remove intentionally generated core dumps
[ https://issues.apache.org/jira/browse/IMPALA-9129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-9129. - Resolution: Fixed > Provide a way for negative tests to remove intentionally generated core dumps > - > > Key: IMPALA-9129 > URL: https://issues.apache.org/jira/browse/IMPALA-9129 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Reporter: David Knupp >Assignee: David Knupp >Priority: Major > > Occasionally, tests (esp. custom cluster tests) will inject an error or set > some invalid config, expecting Impala to generate a core dump. > We should have a general way for such tests to delete these bogus core dumps; > otherwise, they can complicate or confuse later triaging of legitimate > test failures. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IMPALA-9129) Provide a way for negative tests to remove intentionally generated core dumps
David Knupp created IMPALA-9129: --- Summary: Provide a way for negative tests to remove intentionally generated core dumps Key: IMPALA-9129 URL: https://issues.apache.org/jira/browse/IMPALA-9129 Project: IMPALA Issue Type: Improvement Components: Infrastructure Reporter: David Knupp Occasionally, tests (esp. custom cluster tests) will perform some action, expecting Impala to generate a core dump. We should have a general way for such tests to delete these bogus core dumps; otherwise, they can complicate or confuse later test triaging efforts. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IMPALA-1071) Install impala-shell from PyPI
[ https://issues.apache.org/jira/browse/IMPALA-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-1071. - Resolution: Fixed > Install impala-shell from PyPI > -- > > Key: IMPALA-1071 > URL: https://issues.apache.org/jira/browse/IMPALA-1071 > Project: IMPALA > Issue Type: Improvement > Components: Clients >Affects Versions: Impala 1.3.1 >Reporter: Shinya Okano >Assignee: David Knupp >Priority: Minor > Labels: shell > > I want to install impala-shell from PyPI (Python Package Index). > impala-shell appears to be implemented mostly as a Python module, with just > a little shell script around it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IMPALA-8877) CatalogException: Table modified while operation was in progress, aborting execution.
David Knupp created IMPALA-8877: --- Summary: CatalogException: Table modified while operation was in progress, aborting execution. Key: IMPALA-8877 URL: https://issues.apache.org/jira/browse/IMPALA-8877 Project: IMPALA Issue Type: Bug Components: Catalog Affects Versions: Impala 3.3.0 Reporter: David Knupp Attachments: catalogd.INFO.tar.gz, impalad.INFO.tar.gz This was hit while running the stress tests to get a baseline on a deployed cluster. /* Mem: 12850 MB. Coordinator: quasar-mzmnbe-6.vpc.cloudera.com. */ COMPUTE STATS catalog_sales {noformat} Query (id=924a50178a5a6146:29d58a73) Summary Session ID: 5543fb9029e2b71f:f446381b1f59ed81 Session Type: HIVESERVER2 HiveServer2 Protocol Version: V6 Start Time: 2019-08-19 01:26:07.292866000 End Time: 2019-08-19 01:26:27.248053000 Query Type: DDL Query State: EXCEPTION Query Status: CatalogException: Table 'tpcds_300_decimal_parquet.catalog_sales' was modified while operation was in progress, aborting execution. Impala Version: impalad version 3.3.0-SNAPSHOT RELEASE (build df3e7c051e2641524fc53a0cd07c2a14decd55f7) User: syst...@vpc.cloudera.com Connected User: syst...@vpc.cloudera.com Delegated User: Network Address: :::10.65.6.19:39174 Default Db: tpcds_300_decimal_parquet Sql Statement: /* Mem: 12850 MB. Coordinator: quasar-mzmnbe-6.vpc.cloudera.com. 
*/ COMPUTE STATS catalog_sales Coordinator: quasar-mzmnbe-6.vpc.cloudera.com:22000 Query Options (set by configuration): ABORT_ON_ERROR=1,MEM_LIMIT=13474201600,MT_DOP=4,EXEC_TIME_LIMIT_S=2147483647,TIMEZONE=America/Los_Angeles,DEFAULT_FILE_FORMAT=4,DEFAULT_TRANSACTIONAL_TYPE=1 Query Options (set by configuration and planner): ABORT_ON_ERROR=1,MEM_LIMIT=13474201600,MT_DOP=4,EXEC_TIME_LIMIT_S=2147483647,TIMEZONE=America/Los_Angeles,DEFAULT_FILE_FORMAT=4,DEFAULT_TRANSACTIONAL_TYPE=1 DDL Type: COMPUTE_STATS Query Compilation Metadata of all 1 tables cached: 5.62s (5622372318) Analysis finished: 5.62s (5622560027) Authorization finished (noop): 5.62s (5622568284) Retried query planning due to inconsistent metadata 7 of 40 times: Catalog object TCatalogObject(type:TABLE, catalog_version:94204, table:TTable(db_name:tpcds_300_decimal_parquet, tbl_name:catalog_sales)) changed version between accesses.: 5.95s (5949859598) Planning finished: 5.95s (5949861145) Query Timeline Query submitted: 0ns (0) Planning finished: 5.95s (5950024020) Child queries finished: 17.85s (17849072057) Rows available: 19.82s (19825080035) Unregister query: 19.95s (19955080560) Frontend - CatalogFetch.ColumnStats.Misses: 34 (34) - CatalogFetch.ColumnStats.Requests: 34 (34) - CatalogFetch.ColumnStats.Time: 0 (0) - CatalogFetch.Config.Hits: 1 (1) - CatalogFetch.Config.Requests: 1 (1) - CatalogFetch.Config.Time: 0 (0) - CatalogFetch.DatabaseList.Hits: 8 (8) - CatalogFetch.DatabaseList.Requests: 8 (8) - CatalogFetch.DatabaseList.Time: 0 (0) - CatalogFetch.PartitionLists.Misses: 1 (1) - CatalogFetch.PartitionLists.Requests: 1 (1) - CatalogFetch.PartitionLists.Time: 7 (7) - CatalogFetch.Partitions.Hits: 1837 (1837) - CatalogFetch.Partitions.Misses: 1837 (1837) - CatalogFetch.Partitions.Requests: 3674 (3674) - CatalogFetch.Partitions.Time: 325 (325) - CatalogFetch.RPCs.Bytes: 4.7 MiB (4936030) - CatalogFetch.RPCs.Requests: 22 (22) - CatalogFetch.RPCs.Time: 343 (343) - CatalogFetch.TableNames.Hits: 4 (4) - 
CatalogFetch.TableNames.Misses: 4 (4) - CatalogFetch.TableNames.Requests: 8 (8) - CatalogFetch.TableNames.Time: 0 (0) - CatalogFetch.Tables.Misses: 8 (8) - CatalogFetch.Tables.Requests: 8 (8) - CatalogFetch.Tables.Time: 74 (74) - InactiveTotalTime: 0ns (0) - TotalTime: 0ns (0) ImpalaServer - CatalogOpExecTimer: 1.97s (1972007962) - ClientFetchWaitTimer: 0ns (0) - InactiveTotalTime: 0ns (0) - RowMaterializationTimer: 0ns (0) - TotalTime: 0ns (0) Child Queries Table Stats Query (id=db4821e4aa5bb04d:d4a5ae45) Column Stats Query (id=0444367557e3496d:f9435111) {noformat} -- This message was sent by Atlassian Jira (v8.3.2#803003)
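The profile above shows the planner transparently retrying when catalog objects change version between accesses ("Retried query planning due to inconsistent metadata 7 of 40 times"). A minimal sketch of that bounded-retry pattern, with illustrative names and a stand-in exception type (this is not Impala's internal planner code):

```python
import time

def run_with_retries(fn, max_attempts=40, backoff_s=0.0):
    """Retry fn() up to max_attempts times, re-raising the last error.

    Illustrates the bounded retry loop the query profile reports
    ("Retried query planning ... 7 of 40 times"); the function name and
    the exception type caught here are illustrative only.
    """
    last_exc = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RuntimeError as exc:  # stand-in for the inconsistent-metadata error
            last_exc = exc
            time.sleep(backoff_s)
    # All attempts exhausted: surface the most recent failure.
    raise last_exc
```

In the failing query above the retry budget was exhausted-adjacent behavior was avoided (7 of 40 retries), but the COMPUTE STATS itself still aborted because the table kept changing mid-operation.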
[jira] [Created] (IMPALA-8558) test_char_format failing on deployed clusters because chars_formats table does not exist
David Knupp created IMPALA-8558: --- Summary: test_char_format failing on deployed clusters because chars_formats table does not exist Key: IMPALA-8558 URL: https://issues.apache.org/jira/browse/IMPALA-8558 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 3.3.0 Reporter: David Knupp This issue is showing up for databases functional, functional_parquet, and functional_orc_def. It's not immediately clear why this table wasn't created during data load. But for example: *Stacktrace* {noformat} query_test/test_chars.py:75: in test_char_format self.run_test_case('QueryTest/chars-formats', vector) common/impala_test_suite.py:512: in run_test_case result = self.__execute_query(target_impalad_client, query, user=user) common/impala_test_suite.py:746: in __execute_query return impalad_client.execute(query, user=user) common/impala_connection.py:180: in execute return self.__beeswax_client.execute(sql_stmt, user=user) beeswax/impala_beeswax.py:187: in execute handle = self.__execute_query(query_string.strip(), user=user) beeswax/impala_beeswax.py:362: in __execute_query handle = self.execute_query_async(query_string, user=user) beeswax/impala_beeswax.py:356: in execute_query_async handle = self.__do_rpc(lambda: self.imp_service.query(query,)) beeswax/impala_beeswax.py:516: in __do_rpc raise ImpalaBeeswaxException(self.__build_error_message(b), b) E ImpalaBeeswaxException: ImpalaBeeswaxException: EINNER EXCEPTION: EMESSAGE: AnalysisException: Could not resolve table reference: 'chars_formats' {noformat} *Standard Error* {noformat} SET client_identifier=query_test/test_chars.py::TestCharFormats::()::test_char_format[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table_format:text; -- connecting to: quasar-nsgjqi-4.vpc.cloudera.com:21000 -- connecting to quasar-nsgjqi-4.vpc.cloudera.com:21050 with impyla -- 2019-05-15 
00:05:55,759 INFO MainThread: Closing active operation SET client_identifier=query_test/test_chars.py::TestCharFormats::()::test_char_format[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table_format:text; -- executing against quasar-nsgjqi-4.vpc.cloudera.com:21000 use functional; -- 2019-05-15 00:05:55,788 INFO MainThread: Started query f845123ce97ab647:0560abac SET client_identifier=query_test/test_chars.py::TestCharFormats::()::test_char_format[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table_format:text; SET batch_size=0; SET num_nodes=0; SET disable_codegen_rows_threshold=0; SET disable_codegen=False; SET abort_on_error=1; SET exec_single_node_rows_threshold=0; -- executing against quasar-nsgjqi-4.vpc.cloudera.com:21000 select * from chars_formats order by vc; {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-8556) test_large_num_partitions failing on deployed clusters because scale_db was not created
David Knupp created IMPALA-8556: --- Summary: test_large_num_partitions failing on deployed clusters because scale_db was not created Key: IMPALA-8556 URL: https://issues.apache.org/jira/browse/IMPALA-8556 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 3.3.0 Reporter: David Knupp It's not immediately clear why scale_db is not being created during data load. catalog_service/test_large_num_partitions.py:41: in test_list_partitions *Stacktrace* {noformat} catalog_service/test_large_num_partitions.py:41: in test_list_partitions result = self.client.execute("show table stats %s" % full_tbl_name) common/impala_connection.py:180: in execute return self.__beeswax_client.execute(sql_stmt, user=user) beeswax/impala_beeswax.py:187: in execute handle = self.__execute_query(query_string.strip(), user=user) beeswax/impala_beeswax.py:362: in __execute_query handle = self.execute_query_async(query_string, user=user) beeswax/impala_beeswax.py:356: in execute_query_async handle = self.__do_rpc(lambda: self.imp_service.query(query,)) beeswax/impala_beeswax.py:516: in __do_rpc raise ImpalaBeeswaxException(self.__build_error_message(b), b) E ImpalaBeeswaxException: ImpalaBeeswaxException: EINNER EXCEPTION: EMESSAGE: AnalysisException: Database does not exist: scale_db {noformat} catalog_service/test_large_num_partitions.py:62: in test_predicates_on_partition_attributes *Stacktrace* {noformat} catalog_service/test_large_num_partitions.py:62: in test_predicates_on_partition_attributes result = self.client.execute("select * from %s where j = 1" % full_tbl_name) common/impala_connection.py:180: in execute return self.__beeswax_client.execute(sql_stmt, user=user) beeswax/impala_beeswax.py:187: in execute handle = self.__execute_query(query_string.strip(), user=user) beeswax/impala_beeswax.py:362: in __execute_query handle = self.execute_query_async(query_string, user=user) beeswax/impala_beeswax.py:356: in execute_query_async handle = 
self.__do_rpc(lambda: self.imp_service.query(query,)) beeswax/impala_beeswax.py:516: in __do_rpc raise ImpalaBeeswaxException(self.__build_error_message(b), b) E ImpalaBeeswaxException: ImpalaBeeswaxException: EINNER EXCEPTION: EMESSAGE: AnalysisException: Could not resolve table reference: 'scale_db.num_partitions_1234_blocks_per_partition_1' {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
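Both failures reduce to a dependency database missing after data load. A test could probe for the database up front and surface a clearer message than the raw AnalysisException; a minimal sketch, assuming the Beeswax client returns `show databases` rows as tab-separated strings with the name in the first field (the helper name is hypothetical, not part of the test suite):

```python
def database_exists(show_databases_rows, db_name):
    """Return True if db_name appears in raw 'show databases' output.

    Assumes each row is a tab-separated string whose first field is the
    database name; that row format is an assumption about the Beeswax
    client's output.
    """
    return any(row.split("\t")[0] == db_name for row in show_databases_rows)
```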
[jira] [Created] (IMPALA-8553) Several tests failing with connection errors on deployed clusters
David Knupp created IMPALA-8553: --- Summary: Several tests failing with connection errors on deployed clusters Key: IMPALA-8553 URL: https://issues.apache.org/jira/browse/IMPALA-8553 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 3.3.0 Reporter: David Knupp Assignee: Tim Armstrong The errors look fairly similar. I suspect this commit introduced a regression: https://github.com/apache/impala/commit/79c5f875 *Stacktrace* {noformat} metadata/test_hms_integration.py:66: in test_sanity if IMPALA_TEST_CLUSTER_PROPERTIES.is_catalog_v2_cluster(): common/environ.py:307: in is_catalog_v2_cluster flags = self._get_flags_from_web_ui(web_ui_url) common/environ.py:295: in _get_flags_from_web_ui response = requests.get(impala_url + "/varz?json") ../infra/python/env/lib/python2.7/site-packages/requests/api.py:69: in get return request('get', url, params=params, **kwargs) ../infra/python/env/lib/python2.7/site-packages/requests/api.py:50: in request response = session.request(method=method, url=url, **kwargs) ../infra/python/env/lib/python2.7/site-packages/requests/sessions.py:465: in request resp = self.send(prep, **send_kwargs) ../infra/python/env/lib/python2.7/site-packages/requests/sessions.py:573: in send r = adapter.send(request, **kwargs) ../infra/python/env/lib/python2.7/site-packages/requests/adapters.py:415: in send raise ConnectionError(err, request=request) E ConnectionError: ('Connection aborted.', error(111, 'Connection refused')) {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
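The traceback bottoms out in an unguarded `requests.get` against the daemon debug web UI. A sketch of a tolerant fetch that returns None when the web server is unreachable instead of propagating ConnectionError; the function name and the `/varz?json` response shape are assumptions, and stdlib urllib stands in for requests:

```python
import json
import urllib.error
import urllib.request

def get_flags_from_web_ui(impala_url, timeout_s=5):
    """Fetch daemon flags from /varz?json, or return None if unreachable.

    The response shape ({"flags": [{"name": ..., "current_value": ...}]})
    is an assumption about the impalad debug endpoint, not verified here.
    """
    try:
        with urllib.request.urlopen(impala_url + "/varz?json",
                                    timeout=timeout_s) as resp:
            data = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Web UI disabled, port closed, or non-JSON reply: report "unknown".
        return None
    return {f["name"]: f.get("current_value") for f in data.get("flags", [])}
```

A caller such as `is_catalog_v2_cluster()` could then treat None as "unknown" and fall back to a default rather than erroring out of test collection.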
[jira] [Created] (IMPALA-8552) impala-shell tests break on remote clusters if IMPALA_LOCAL_BUILD_VERSION is None
David Knupp created IMPALA-8552: --- Summary: impala-shell tests break on remote clusters if IMPALA_LOCAL_BUILD_VERSION is None Key: IMPALA-8552 URL: https://issues.apache.org/jira/browse/IMPALA-8552 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 3.3.0 Reporter: David Knupp Assignee: Tim Armstrong This is a regression introduced by the commit: https://github.com/apache/impala/commit/b55d905 *Stacktrace* {noformat} shell/test_shell_commandline.py:33: in from util import (get_impalad_host_port, assert_var_substitution, run_impala_shell_cmd, shell/util.py:42: in IMPALA_HOME, "shell/build", "impala-shell-" + IMPALA_LOCAL_BUILD_VERSION, E TypeError: cannot concatenate 'str' and 'NoneType' objects {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
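The unguarded string concatenation fails whenever IMPALA_LOCAL_BUILD_VERSION resolves to None, which is exactly the remote-cluster case. A hedged sketch of the defensive form (the function is illustrative, not the actual fix in shell/util.py):

```python
import os

def local_shell_build_dir(impala_home, build_version):
    """Return the local impala-shell build path, or None when there is
    no local build (e.g. tests running against a remote cluster where
    IMPALA_LOCAL_BUILD_VERSION is unset).
    """
    if build_version is None:
        return None
    return os.path.join(impala_home, "shell", "build",
                        "impala-shell-" + build_version)
```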
[jira] [Resolved] (IMPALA-8481) test_hbase_col_filter failing on deployed clusters due to permissions error
[ https://issues.apache.org/jira/browse/IMPALA-8481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-8481. - Resolution: Fixed > test_hbase_col_filter failing on deployed clusters due to permissions error > --- > > Key: IMPALA-8481 > URL: https://issues.apache.org/jira/browse/IMPALA-8481 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Reporter: David Knupp >Priority: Critical > Fix For: Impala 3.3.0 > > > When running test_hbase_queries against a deployed cluster, the default user > on the machine running the tests may not have the correct access permission > on the cluster, which causes this test to fail. > {noformat} > query_test/test_hbase_queries.py:89: in test_hbase_col_filter > self.run_stmt_in_hive(add_data) > common/impala_test_suite.py:800: in run_stmt_in_hive > raise RuntimeError(stderr) > [...] > E INFO : Query ID = > hive_20190501001622_fa3a9f39-7d32-49da-ba1d-084911730a2f > E INFO : Total jobs = 1 > E INFO : Starting task [Stage-0:DDL] in serial mode > E INFO : Launching Job 1 out of 1 > E INFO : Starting task [Stage-1:MAPRED] in serial mode > E INFO : Number of reduce tasks is set to 0 since there's no reduce > operator > E ERROR : Job Submission failed with exception > 'org.apache.hadoop.security.AccessControlException(Permission denied: > user=jenkins, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x...) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-8481) test_hbase_col_filter failing on deployed clusters due to permissions error
David Knupp created IMPALA-8481: --- Summary: test_hbase_col_filter failing on deployed clusters due to permissions error Key: IMPALA-8481 URL: https://issues.apache.org/jira/browse/IMPALA-8481 Project: IMPALA Issue Type: Bug Components: Infrastructure Reporter: David Knupp Fix For: Impala 3.3.0 When running test_hbase_queries against a deployed cluster, the default user on the machine running the tests may not have the correct access permission on the cluster, which causes this test to fail. {noformat} query_test/test_hbase_queries.py:89: in test_hbase_col_filter self.run_stmt_in_hive(add_data) common/impala_test_suite.py:800: in run_stmt_in_hive raise RuntimeError(stderr) [...] E INFO : Query ID = hive_20190501001622_fa3a9f39-7d32-49da-ba1d-084911730a2f E INFO : Total jobs = 1 E INFO : Starting task [Stage-0:DDL] in serial mode E INFO : Launching Job 1 out of 1 E INFO : Starting task [Stage-1:MAPRED] in serial mode E INFO : Number of reduce tasks is set to 0 since there's no reduce operator E ERROR : Job Submission failed with exception 'org.apache.hadoop.security.AccessControlException(Permission denied: user=jenkins, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x...) {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-8465) hs2.test_json_endpoints.TestJsonEndpoints fails on deployed clusters
David Knupp created IMPALA-8465: --- Summary: hs2.test_json_endpoints.TestJsonEndpoints fails on deployed clusters Key: IMPALA-8465 URL: https://issues.apache.org/jira/browse/IMPALA-8465 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 3.3.0 Reporter: David Knupp Changes to tests.common.ImpalaCluster in [commit 2ca7f8e7|https://github.com/apache/impala/commit/2ca7f8e7c0781a1914275b3506cf8a7748c44c85#diff-6fea89ad0e6c440b0373bb136d7510b5] introduced a regression in this test. {noformat} hs2/test_json_endpoints.py:51: in test_waiting_in_flight_queries queries_json = self._get_json_queries(http_addr) hs2/test_json_endpoints.py:33: in _get_json_queries return cluster.impalads[0].service.get_debug_webpage_json("/queries") E IndexError: list index out of range {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
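The regression is an unconditional `impalads[0]` on a cluster object that comes back empty on deployed clusters. A minimal guard, with stand-in types (a sketch, not the actual patch):

```python
def get_queries_json(impalads):
    """Return the /queries debug page for the first impalad, or None when
    the cluster object holds no daemons (as happens on deployed clusters
    where local process discovery finds nothing).
    """
    if not impalads:
        return None
    return impalads[0].service.get_debug_webpage_json("/queries")
```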
[jira] [Created] (IMPALA-8393) setup-ranger step in create-load-data.sh breaks data load to real clusters
David Knupp created IMPALA-8393: --- Summary: setup-ranger step in create-load-data.sh breaks data load to real clusters Key: IMPALA-8393 URL: https://issues.apache.org/jira/browse/IMPALA-8393 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 3.2.0, Impala 3.3.0 Reporter: David Knupp {{localhost}} is hard-coded into the setup-ranger function that was recently added to create-load-data.sh, e.g.: https://github.com/apache/impala/blame/master/testdata/bin/create-load-data.sh#L325 This works when testing on a mini-cluster, but breaks data load if setting up to run the functional test suite against an actual cluster. In that scenario, the host that runs the script is simply a test runner, with no locally running services. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-8346) Testcase builder for query planner introduced a data load regression on CDH clusters
David Knupp created IMPALA-8346: --- Summary: Testcase builder for query planner introduced a data load regression on CDH clusters Key: IMPALA-8346 URL: https://issues.apache.org/jira/browse/IMPALA-8346 Project: IMPALA Issue Type: Bug Affects Versions: Impala 3.2.0, Impala 3.3.0 Reporter: David Knupp Assignee: bharath v The patch to address IMPALA-5872 introduced a new script into our data load process. This script has been tested against the single-node mini-cluster, but doesn't appear to run against actual (remote) clusters. {noformat} Starting Impala Shell without Kerberos authentication Opened TCP connection to remote-coordinator-node.mycompany.com:21000 Connected to remote-coordinator-node.mycompany.com:21000 Server version: impalad version 3.2.0-cdh6.x-SNAPSHOT RELEASE (build 2e55383eb86de20499e2f9327cd5bcbda6788e50) Query: use `tpcds` Query: use `tpcds` Query: COPY TESTCASE TO '/test-warehouse/tpcds-testcase-data' -- start query 1 in stream 0 using template query11.tpl with year_total as ( select c_customer_id customer_id ,c_first_name customer_first_name ,c_last_name customer_last_name ,c_preferred_cust_flag customer_preferred_cust_flag ,c_birth_country customer_birth_country ,c_login customer_login ,c_email_address customer_email_address ,d_year dyear ,sum(ss_ext_list_price-ss_ext_discount_amt) year_total ,'s' sale_type from customer ,store_sales ,date_dim where c_customer_sk = ss_customer_sk and ss_sold_date_sk = d_date_sk group by c_customer_id ,c_first_name ,c_last_name ,c_preferred_cust_flag ,c_birth_country ,c_login ,c_email_address ,d_year union all select c_customer_id customer_id ,c_first_name customer_first_name ,c_last_name customer_last_name ,c_preferred_cust_flag customer_preferred_cust_flag ,c_birth_country customer_birth_country ,c_login customer_login ,c_email_address customer_email_address ,d_year dyear ,sum(ws_ext_list_price-ws_ext_discount_amt) year_total ,'w' sale_type from customer ,web_sales ,date_dim where c_customer_sk = 
ws_bill_customer_sk and ws_sold_date_sk = d_date_sk group by c_customer_id ,c_first_name ,c_last_name ,c_preferred_cust_flag ,c_birth_country ,c_login ,c_email_address ,d_year ) select t_s_secyear.customer_id ,t_s_secyear.customer_first_name ,t_s_secyear.customer_last_name ,t_s_secyear.customer_email_address from year_total t_s_firstyear ,year_total t_s_secyear ,year_total t_w_firstyear ,year_total t_w_secyear where t_s_secyear.customer_id = t_s_firstyear.customer_id and t_s_firstyear.customer_id = t_w_secyear.customer_id and t_s_firstyear.customer_id = t_w_firstyear.customer_id and t_s_firstyear.sale_type = 's' and t_w_firstyear.sale_type = 'w' and t_s_secyear.sale_type = 's' and t_w_secyear.sale_type = 'w' and t_s_firstyear.dyear = 2001 and t_s_secyear.dyear = 2001+1 and t_w_firstyear.dyear = 2001 and t_w_secyear.dyear = 2001+1 and t_s_firstyear.year_total > 0 and t_w_firstyear.year_total > 0 and case when t_w_firstyear.year_total > 0 then t_w_secyear.year_total / t_w_firstyear.year_total else 0.0 end > case when t_s_firstyear.year_total > 0 then t_s_secyear.year_total / t_s_firstyear.year_total else 0.0 end order by t_s_secyear.customer_id ,t_s_secyear.customer_first_name ,t_s_secyear.customer_last_name ,t_s_secyear.customer_email_address limit 100 Query submitted at: 2019-03-23 23:40:12 (Coordinator: http://remote-coordinator-node.mycompany.com:25000) ERROR: ImpalaRuntimeException: Error writing test case output to file: hdfs://namenode.mycompany.com:8020/test-warehouse/tpcds-testcase-data/impala-testcase-data-6430bc87-5337-4e65-b6aa-d059088f3a4b CAUSED BY: AccessControlException: Permission denied: user=impala, access=WRITE, inode="/test-warehouse/tpcds-testcase-data":hdfs:hdfs:drwxr-xr-x [...] Could not execute command: COPY TESTCASE TO '/test-warehouse/tpcds-testcase-data' -- start query 1 in stream 0 using template query11.tpl {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-8091) Kudu SIGABRT'ed during dataload on Impala release build
David Knupp created IMPALA-8091: --- Summary: Kudu SIGABRT'ed during dataload on Impala release build Key: IMPALA-8091 URL: https://issues.apache.org/jira/browse/IMPALA-8091 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.2.0 Reporter: David Knupp Assignee: Thomas Tauber-Marshall *Console log*: {noformat} 19:10:05 2019-01-16 19:10:05,461 - archive_core_dumps - INFO - Found core files: ['./core.1547693849.30799.kudu-master', './core.1547693849.30783.kudu-tserver', './core.1547693849.30767.kudu-tserver', './core.1547693849.30808.kudu-tserver'] 19:10:05 2019-01-16 19:10:05,649 - archive_core_dumps - INFO - [New LWP 30799] 19:10:05 [New LWP 30834] 19:10:05 [New LWP 30835] 19:10:05 [New LWP 30838] 19:10:05 [New LWP 30837] 19:10:05 [New LWP 30836] 19:10:05 Core was generated by `/data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/c'. 19:10:05 Program terminated with signal SIGABRT, Aborted. 19:10:05 #0 0x7f9a2cc611f7 in ?? () 19:10:05 19:10:05 2019-01-16 19:10:05,650 - archive_core_dumps - INFO - Found binary path through GDB: /data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/c 19:10:05 2019-01-16 19:10:05,893 - archive_core_dumps - WARNING - Failed to determine binary because multiple candidate binaries were found and none of their paths contained 'latest' to disambiguate: 19:10:05 Core:./core.1547693849.30799.kudu-master 19:10:05 Binaries:['./testdata/cluster/node_templates/common/etc/init.d/kudu-master', './testdata/cluster/cdh6/node-1/etc/init.d/kudu-master'] 19:10:05 19:10:05 2019-01-16 19:10:05,917 - archive_core_dumps - INFO - [New LWP 30783] 19:10:05 [New LWP 30810] 19:10:05 [New LWP 30812] 19:10:05 [New LWP 30811] 19:10:05 [New LWP 30824] 19:10:05 [New LWP 30820] 19:10:05 Core was generated by `/data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/c'. 19:10:05 Program terminated with signal SIGABRT, Aborted. 19:10:05 #0 0x7f0b81fb11f7 in ?? 
() {noformat} *Backtraces*: {noformat} CORE: ./core.1547693849.30799.kudu-master BINARY: ./be/build/latest/service/impalad Core was generated by `/data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/c'. Program terminated with signal SIGABRT, Aborted. #0 0x7f9a2cc611f7 in ?? () #0 0x7f9a2cc611f7 in ?? () #1 0x7f9a2cc628e8 in ?? () #2 0x0020 in ?? () #3 0x in ?? () CORE: ./core.1547693849.30783.kudu-tserver BINARY: ./be/build/latest/service/impalad Core was generated by `/data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/c'. Program terminated with signal SIGABRT, Aborted. #0 0x7f0b81fb11f7 in ?? () #0 0x7f0b81fb11f7 in ?? () #1 0x7f0b81fb28e8 in ?? () #2 0x0020 in ?? () #3 0x in ?? () CORE: ./core.1547693849.30767.kudu-tserver BINARY: ./be/build/latest/service/impalad Core was generated by `/data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/c'. Program terminated with signal SIGABRT, Aborted. #0 0x7f60b1e2f1f7 in ?? () #0 0x7f60b1e2f1f7 in ?? () #1 0x7f60b1e308e8 in ?? () #2 0x0020 in ?? () #3 0x in ?? () CORE: ./core.1547693849.30808.kudu-tserver BINARY: ./be/build/latest/service/impalad Core was generated by `/data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/c'. Program terminated with signal SIGABRT, Aborted. #0 0x7fa3cb5591f7 in ?? () #0 0x7fa3cb5591f7 in ?? () #1 0x7fa3cb55a8e8 in ?? () #2 0x0020 in ?? () #3 0x in ?? () {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-8090) disk-io-mgr-test SIGABRT'ed on centos6 exhaustive test run
David Knupp created IMPALA-8090: --- Summary: disk-io-mgr-test SIGABRT'ed on centos6 exhaustive test run Key: IMPALA-8090 URL: https://issues.apache.org/jira/browse/IMPALA-8090 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.2.0 Reporter: David Knupp Assignee: Tim Armstrong *Test output*: {noformat} 45/99 Test #45: disk-io-mgr-test .***Exception: Other 43.29 sec Turning perftools heap leak checking off [==] Running 25 tests from 1 test case. [--] Global test environment set-up. [--] 25 tests from DiskIoMgrTest [ RUN ] DiskIoMgrTest.SingleWriter 19/01/16 15:57:09 INFO util.JvmPauseMonitor: Starting JVM pause monitor [ OK ] DiskIoMgrTest.SingleWriter (3407 ms) [ RUN ] DiskIoMgrTest.InvalidWrite [ OK ] DiskIoMgrTest.InvalidWrite (281 ms) [ RUN ] DiskIoMgrTest.WriteErrors [ OK ] DiskIoMgrTest.WriteErrors (235 ms) [ RUN ] DiskIoMgrTest.SingleWriterCancel [ OK ] DiskIoMgrTest.SingleWriterCancel (1165 ms) [ RUN ] DiskIoMgrTest.SingleReader [ OK ] DiskIoMgrTest.SingleReader (5835 ms) [ RUN ] DiskIoMgrTest.SingleReaderSubRanges [ OK ] DiskIoMgrTest.SingleReaderSubRanges (16404 ms) [ RUN ] DiskIoMgrTest.AddScanRangeTest [ OK ] DiskIoMgrTest.AddScanRangeTest (1210 ms) [ RUN ] DiskIoMgrTest.SyncReadTest *** Check failure stack trace: *** @ 0x4825dcc @ 0x4827671 @ 0x48257a6 @ 0x4828d6d @ 0x1af39ec @ 0x1ae90a4 @ 0x1ac30ea @ 0x1accad3 @ 0x1acc660 @ 0x1acbf3e @ 0x1acb62d @ 0x1b03671 @ 0x1f79988 @ 0x1f82b60 @ 0x1f82a84 @ 0x1f82a47 @ 0x3751579 @ 0x3ea4807850 @ 0x3ea44e894c Wrote minidump to /data/jenkins/workspace/<...>/repos/Impala/logs/be_tests/minidumps/disk-io-mgr-test/5bbf76f7-e5d6-4ac9-bdae9d9b-065c32ec.dmp {noformat} *Error*: {noformat} Operating system: Linux 0.0.0 Linux 2.6.32-358.14.1.el6.centos.plus.x86_64 #1 SMP Tue Jul 16 21:33:24 UTC 2013 x86_64 CPU: amd64 family 6 model 45 stepping 7 8 CPUs GPU: UNKNOWN Crash reason: SIGABRT Crash address: 0x4522fa1 Process uptime: not available Thread 205 (crashed) 0 libc-2.12.so + 0x328e5 rax = 0x rdx 
= 0x0006 rcx = 0x rbx = 0x06adf9c0 rsi = 0x0563 rdi = 0x2fa1 rbp = 0x7f8009b8ffe0 rsp = 0x7f8009b8fc78 r8 = 0x7f8009b8fd00r9 = 0x0563 r10 = 0x0008 r11 = 0x0202 r12 = 0x06adfa40 r13 = 0x001f r14 = 0x06ae7384 r15 = 0x06adf9c0 rip = 0x003ea44328e5 Found by: given as instruction pointer in context 1 libc-2.12.so + 0x340c5 rbp = 0x7f8009b8ffe0 rsp = 0x7f8009b8fc80 rip = 0x003ea44340c5 Found by: stack scanning 2 disk-io-mgr-test!boost::_bi::bind_t, boost::_bi::list2, boost::_bi::value > >::operator()() [bind_template.hpp : 20 + 0x21] rbp = 0x7f8009b8ffe0 rsp = 0x7f8009b8fc88 rip = 0x01acbf3e Found by: stack scanning 3 disk-io-mgr-test!google::LogMessage::Flush() + 0x157 rbx = 0x0007 rbp = 0x06adf980 rsp = 0x7f8009b8fff0 rip = 0x048257a7 Found by: call frame info 4 disk-io-mgr-test!google::LogMessageFatal::~LogMessageFatal() + 0xe rbx = 0x7f8009b90110 rbp = 0x7f8009b903f0 rsp = 0x7f8009b90070 r12 = 0x0001 r13 = 0x06aee8b8 r14 = 0x0c213538 r15 = 0x0007 rip = 0x04828d6e Found by: call frame info 5 disk-io-mgr-test!impala::io::LocalFileReader::ReadFromPos(long, unsigned char*, long, long*, bool*) [local-file-reader.cc : 67 + 0x10] rbx = 0x0001 rbp = 0x7f8009b903f0 rsp = 0x7f8009b90090 r12 = 0x0001 r13 = 0x06aee8b8 r14 = 0x0c213538 r15 = 0x0007 rip = 0x01af39ed Found by: call frame info 6 disk-io-mgr-test!impala::io::ScanRange::DoRead(int) [scan-range.cc : 219 + 0x5b] rbx = 0x0c4f71e0 rbp = 0x7f8009b90620 rsp = 0x7f8009b90400 r12 = 0x01af36e4 r13 = 0x000d r14 = 0x0c213538 r15 = 0x0007 rip = 0x01ae90a5 Found by: call frame info 7 disk-io-mgr-test!impala::io::DiskQueue::DiskThreadLoop(impala::io::DiskIoMgr*) [disk-io-mgr.cc : 425 + 0x17] rbx = 0x0c0e0f00 rbp = 0x7f8009b906c0 rsp = 0x7f8009b90630 r12 = 0x7fff99de21c0 r13 = 0x7fff99de1a90 r14 = 0x0
[jira] [Created] (IMPALA-8089) Sporadic upstream jenkins failures with "ERROR in bin/run-all-tests.sh at line 237: pkill -P $TIMEOUT_PID"
David Knupp created IMPALA-8089: --- Summary: Sporadic upstream jenkins failures with "ERROR in bin/run-all-tests.sh at line 237: pkill -P $TIMEOUT_PID" Key: IMPALA-8089 URL: https://issues.apache.org/jira/browse/IMPALA-8089 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 3.2.0 Reporter: David Knupp Assignee: Bikramjeet Vig Example failure at: https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/4113/consoleFull {noformat} 04:32:09 ERROR in bin/run-all-tests.sh at line 237: pkill -P $TIMEOUT_PID 04:32:09 Generated: /home/ubuntu/Impala/logs/extra_junit_xml_logs/generate_junitxml.buildall.run-all-tests.20190115_04_32_09.xml 04:32:09 + RET_CODE=1 {noformat} Still looking, but I don't see any other obvious issues right now. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-8078) test_corrupt_stats failing on exhaustive builds
David Knupp created IMPALA-8078: --- Summary: test_corrupt_stats failing on exhaustive builds Key: IMPALA-8078 URL: https://issues.apache.org/jira/browse/IMPALA-8078 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.2.0 Reporter: David Knupp Stacktrace: {noformat} metadata/test_compute_stats.py:222: in test_corrupt_stats self.run_test_case('QueryTest/corrupt-stats', vector, unique_database) common/impala_test_suite.py:497: in run_test_case self.__verify_results_and_errors(vector, test_section, result, use_db) common/impala_test_suite.py:359: in __verify_results_and_errors replace_filenames_with_placeholder) common/test_result_verifier.py:449: in verify_raw_results VERIFIER_MAP[verifier](expected, actual) common/test_result_verifier.py:239: in verify_query_result_is_subset assert expected_literal_strings <= actual_literal_strings E assert Items in expected results not found in actual results: E ' partitions=1/2 files=1 size=24B row-size=0B cardinality=0' {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
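The assertion comes from the subset verifier: every expected literal row must occur somewhere in the actual output, and any misses are reported as shown. A simplified sketch of that check (the real implementation in common/test_result_verifier.py does additional normalization):

```python
def verify_result_is_subset(expected_rows, actual_rows):
    """Assert every expected row appears in the actual rows, reporting
    missing rows in the same style as the test failure above.
    """
    missing = set(expected_rows) - set(actual_rows)
    assert not missing, (
        "Items in expected results not found in actual results: %s"
        % sorted(missing))
```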
[jira] [Created] (IMPALA-7804) Various scanner tests intermittently failing on S3 on different runs
David Knupp created IMPALA-7804: --- Summary: Various scanner tests intermittently failing on S3 on different runs Key: IMPALA-7804 URL: https://issues.apache.org/jira/browse/IMPALA-7804 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.1.0 Reporter: David Knupp The failures have to do with getting AWS client credentials. *query_test/test_scanners.py:696: in test_decimal_encodings* _Stacktrace_ {noformat} query_test/test_scanners.py:696: in test_decimal_encodings self.run_test_case('QueryTest/parquet-decimal-formats', vector, unique_database) common/impala_test_suite.py:496: in run_test_case self.__verify_results_and_errors(vector, test_section, result, use_db) common/impala_test_suite.py:358: in __verify_results_and_errors replace_filenames_with_placeholder) common/test_result_verifier.py:438: in verify_raw_results VERIFIER_MAP[verifier](expected, actual) common/test_result_verifier.py:260: in verify_query_result_is_equal assert expected_results == actual_results E assert Comparing QueryTestResults (expected vs actual): E -255.00,-255.00,-255.00 == -255.00,-255.00,-255.00 E -255.00,-255.00,-255.00 != -65535.00,-65535.00,-65535.00 E -65535.00,-65535.00,-65535.00 != -999.99,-999.99,-999.99 E -65535.00,-65535.00,-65535.00 != 0.00,-.99,-.99 E -999.99,-999.99,-999.99 != 0.00,0.00,0.00 E -999.99,-999.99,-999.99 != 0.00,.99,.99 E 0.00,-.99,-.99 != 255.00,255.00,255.00 E 0.00,-.99,-.99 != 65535.00,65535.00,65535.00 E 0.00,0.00,0.00 != 999.99,999.99,999.99 E 0.00,0.00,0.00 != None E 0.00,.99,.99 != None E 0.00,.99,.99 != None E 255.00,255.00,255.00 != None E 255.00,255.00,255.00 != None E 65535.00,65535.00,65535.00 != None E 65535.00,65535.00,65535.00 != None E 999.99,999.99,999.99 != None E 999.99,999.99,999.99 != None E Number of rows returned (expected vs actual): 18 != 9 {noformat} _Standard Error_ {noformat} SET sync_ddl=False; -- executing against localhost:21000 DROP DATABASE IF EXISTS `test_huge_num_rows_76a09ef1` CASCADE; -- 2018-11-01
09:42:41,140 INFO MainThread: Started query 4c4bc0e7b69d7641:130ffe73 SET sync_ddl=False; -- executing against localhost:21000 CREATE DATABASE `test_huge_num_rows_76a09ef1`; -- 2018-11-01 09:42:42,402 INFO MainThread: Started query e34d714d6a62cba1:2a8544d0 -- 2018-11-01 09:42:42,405 INFO MainThread: Created database "test_huge_num_rows_76a09ef1" for test ID "query_test/test_scanners.py::TestParquet::()::test_huge_num_rows[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 'abort_on_error': 1, 'debug_action': '-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@1.0', 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]" 18/11/01 09:42:43 DEBUG s3a.S3AFileSystem: Initializing S3AFileSystem for impala-test-uswest2-1 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Propagating entries under fs.s3a.bucket.impala-test-uswest2-1. 18/11/01 09:42:43 WARN impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties 18/11/01 09:42:43 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s). 
18/11/01 09:42:43 INFO impl.MetricsSystemImpl: s3a-file-system metrics system started 18/11/01 09:42:43 DEBUG s3a.S3AUtils: For URI s3a://impala-test-uswest2-1/, using credentials AWSCredentialProviderList: BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider com.amazonaws.auth.InstanceProfileCredentialsProvider@15bbf42f 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.connection.maximum is 1500 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.attempts.maximum is 20 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.connection.establish.timeout is 5000 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.connection.timeout is 20 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.socket.send.buffer is 8192 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.socket.recv.buffer is 8192 18/11/01 09:42:43 DEBUG s3a.S3AFileSystem: Using User-Agent: Hadoop 3.0.0-cdh6.x-SNAPSHOT 18/11/01 09:42:44 DEBUG s3a.S3AUtils: Value of fs.s3a.paging.maximum is 5000 18/11/01 09:42:44 DEBUG s3a.S3AUtils: Value of fs.s3a.block.size is 33554432 18/11/01 09:42:44 DEBUG s3a.S3AUtils: Value of fs.s3a.readahead.range is 65536 18/11/01 09:42:44 DEBUG
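The assertion above is a plain ordered comparison of two sorted row lists, with `None` padding out the shorter side in the printed diff. A minimal sketch of that style of verifier (hypothetical names, not the actual common/test_result_verifier.py code):

```python
def verify_query_result_is_equal(expected_rows, actual_rows):
    """Compare two result sets after sorting, the way the diff above
    pairs expected and actual rows line by line. A row-count
    difference (18 != 9 above) is itself a failure."""
    expected = sorted(expected_rows)
    actual = sorted(actual_rows)
    if len(expected) != len(actual):
        return False
    # Any pairwise mismatch after sorting fails the comparison.
    return all(e == a for e, a in zip(expected, actual))
```

Sorting first makes the check independent of result ordering, which matters for scans that return rows in nondeterministic order.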
[jira] [Created] (IMPALA-7803) PlannerTest.testHbase failing on centos6 exhaustive test run
David Knupp created IMPALA-7803: --- Summary: PlannerTest.testHbase failing on centos6 exhaustive test run Key: IMPALA-7803 URL: https://issues.apache.org/jira/browse/IMPALA-7803 Project: IMPALA Issue Type: Bug Components: Frontend Affects Versions: Impala 3.1.0 Reporter: David Knupp *Error Message* {noformat} section SCANRANGELOCATIONS of query: select * from functional_hbase.stringids where id > '5' and tinyint_col = 5 Actual does not match expected result: HBASE KEYRANGE 5\0: NODE 0: Expected: HBASE KEYRANGE 5\0:7 HBASE KEYRANGE 7: NODE 0: section SCANRANGELOCATIONS of query: select * from functional_hbase.stringids where id >= '5' and tinyint_col = 5 Actual does not match expected result: HBASE KEYRANGE 5: ^^ NODE 0: Expected: HBASE KEYRANGE 5:7 HBASE KEYRANGE 7: NODE 0: section SCANRANGELOCATIONS of query: select * from functional_hbase.stringids where id > '4' and id <= '5' and tinyint_col = 5 Actual does not match expected result: HBASE KEYRANGE 4\0:5 ^^ HBASE KEYRANGE 5:5\0 NODE 0: Expected: HBASE KEYRANGE 4\0:5\0 NODE 0: section SCANRANGELOCATIONS of query: select * from functional_hbase.stringids where id >= '4' and id <= '5' and tinyint_col = 5 Actual does not match expected result: HBASE KEYRANGE 4:5 HBASE KEYRANGE 5:5\0 NODE 0: Expected: HBASE KEYRANGE 4:5\0 NODE 0: section SCANRANGELOCATIONS of query: select * from functional_hbase.stringids where string_col = '4' and tinyint_col = 5 and id >= '4' and id <= '5' Actual does not match expected result: HBASE KEYRANGE 4:5 HBASE KEYRANGE 5:5\0 NODE 0: Expected: HBASE KEYRANGE 4:5\0 NODE 0: section SCANRANGELOCATIONS of query: select * from functional_hbase.stringids where string_col = '4' and tinyint_col = 5 and id >= concat('', '4') and id <= concat('5', '') Actual does not match expected result: HBASE KEYRANGE 4:5 HBASE KEYRANGE 5:5\0 NODE 0: Expected: HBASE KEYRANGE 4:5\0 NODE 0: section SCANRANGELOCATIONS of query: select * from functional_hbase.alltypesagg where bigint_col is not null and bool_col = 
true Actual does not match expected result: HBASE KEYRANGE 3:5 HBASE KEYRANGE 5: HBASE KEYRANGE :3 NODE 0: Expected: HBASE KEYRANGE 3:7 HBASE KEYRANGE 7: HBASE KEYRANGE :3 NODE 0: Stacktrace java.lang.AssertionError: section SCANRANGELOCATIONS of query: select * from functional_hbase.stringids where id > '5' and tinyint_col = 5 Actual does not match expected result: HBASE KEYRANGE 5\0: NODE 0: Expected: HBASE KEYRANGE 5\0:7 HBASE KEYRANGE 7: NODE 0: section SCANRANGELOCATIONS of query: select * from functional_hbase.stringids where id >= '5' and tinyint_col = 5 Actual does not match expected result: HBASE KEYRANGE 5: ^^ NODE 0: Expected: HBASE KEYRANGE 5:7 HBASE KEYRANGE 7: NODE 0: section SCANRANGELOCATIONS of query: select * from functional_hbase.stringids where id > '4' and id <= '5' and tinyint_col = 5 Actual does not match expected result: HBASE KEYRANGE 4\0:5 ^^ HBASE KEYRANGE 5:5\0 NODE 0: Expected: HBASE KEYRANGE 4\0:5\0 NODE 0: section SCANRANGELOCATIONS of query: select * from functional_hbase.stringids where id >= '4' and id <= '5' and tinyint_col = 5 Actual does not match expected result: HBASE KEYRANGE 4:5 HBASE KEYRANGE 5:5\0 NODE 0: Expected: HBASE KEYRANGE 4:5\0 NODE 0: section SCANRANGELOCATIONS of query: select * from functional_hbase.stringids where string_col = '4' and tinyint_col = 5 and id >= '4' and id <= '5' Actual does not match expected result: HBASE KEYRANGE 4:5 HBASE KEYRANGE 5:5\0 NODE 0: Expected: HBASE KEYRANGE 4:5\0 NODE 0: section SCANRANGELOCATIONS of query: select * from functional_hbase.stringids where string_col = '4' and tinyint_col = 5 and id >= concat('', '4') and id <= concat('5', '') Actual does not match expected result: HBASE KEYRANGE 4:5 HBASE KEYRANGE 5:5\0 NODE 0: Expected: HBASE KEYRANGE 4:5\0 NODE 0: section SCANRANGELOCATIONS of query: select * from functional_hbase.alltypesagg where bigint_col is not null and bool_col = true Actual does not match expected result: HBASE KEYRANGE 3:5 HBASE KEYRANGE 5: HBASE KEYRANGE 
:3 NODE 0: Expected: HBASE KEYRANGE 3:7 HBASE KEYRANGE 7: HBASE KEYRANGE :3 NODE 0: at org.junit.Assert.fail(Assert.java:88) at org.apache.impala.planner.PlannerTestBase.runPlannerTestFile(PlannerTestBase.java:857) at org.apache.impala.planner.PlannerTestBase.runPlannerTestFile(PlannerTestBase.java:862) at org.apache.impala.planner.PlannerTest.testHbase(PlannerTest.java:126) {noformat}
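The KEYRANGE notation in the diff follows the usual HBase convention of half-open `[start, stop)` scan ranges over string row keys: `id > X` starts at `X` plus a NUL byte (the smallest key strictly after `X`), and `id <= Y` stops at `Y` plus a NUL byte so that `Y` itself falls inside the exclusive stop key. A small illustrative sketch of that derivation (not the Impala planner's actual code; the failure above is about where those ranges get split at region boundaries, not about this arithmetic):

```python
def hbase_key_range(lower, lower_inclusive, upper, upper_inclusive):
    """Derive an HBase [start, stop) scan range from string-key
    predicates. An empty bound means unbounded, matching ranges
    like 'HBASE KEYRANGE 5:' in the diff above."""
    # 'id > X' excludes X, so start just past it with a NUL byte.
    start = lower if (lower_inclusive or not lower) else lower + "\0"
    # 'id <= Y' includes Y, so the exclusive stop is Y + NUL.
    stop = upper + "\0" if (upper_inclusive and upper) else upper
    return start, stop
```

For example, `id > '4' and id <= '5'` yields `('4\0', '5\0')`, i.e. the expected `HBASE KEYRANGE 4\0:5\0` above.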
[jira] [Created] (IMPALA-7798) session-expiry-test passed with a minidump in an ASAN build
David Knupp created IMPALA-7798: --- Summary: session-expiry-test passed with a minidump in an ASAN build Key: IMPALA-7798 URL: https://issues.apache.org/jira/browse/IMPALA-7798 Project: IMPALA Issue Type: Bug Affects Versions: Impala 3.1.0 Reporter: David Knupp Noticed this from a recent ASF master ASAN test *Standard Error* {noformat} Operating system: Linux 0.0.0 Linux 3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 20 20:32:50 UTC 2017 x86_64 CPU: amd64 family 6 model 79 stepping 1 16 CPUs GPU: UNKNOWN Crash reason: SIGABRT Crash address: 0x7d10001d333 Process uptime: not available Thread 54 (crashed) 0 libc-2.17.so + 0x351f7 rax = 0x rdx = 0x0006 rcx = 0x rbx = 0x60c000236818 rsi = 0x0001d36d rdi = 0x0001d333 rbp = 0x0841aff0 rsp = 0x7f5dba815dd8 r8 = 0x7f5dba818570r9 = 0x7f5dc8a44000 r10 = 0x0008 r11 = 0x0206 r12 = 0x60c000236740 r13 = 0x6110001801c0 r14 = 0x7f5dba815f80 r15 = 0x0febb7502bf0 rip = 0x7f65ef5b31f7 Found by: given as instruction pointer in context 1 libc-2.17.so + 0x368e8 rsp = 0x7f5dba815de0 rip = 0x7f65ef5b48e8 Found by: stack scanning 2 session-expiry-test!__interceptor___tls_get_addr [sanitizer_common_interceptors.inc : 4723 + 0xb] rsp = 0x7f5dba815ed8 rip = 0x0165aa2b Found by: stack scanning 3 0x60c000236818 rbx = 0x60c000236818 rbp = 0x0841aff0 rsp = 0x7f5dba815ef8 rip = 0x60c000236818 Found by: call frame info 4 libstdc++.so.6.0.20 + 0x5fd1d rsp = 0x7f5dba815f10 rip = 0x7f65f00d5d1d Found by: stack scanning 5 session-expiry-test + 0x136f0d0 rsp = 0x7f5dba815f30 rip = 0x0176f0d0 Found by: stack scanning 6 libstdc++.so.6.0.20 + 0x5dd86 rsp = 0x7f5dba815f40 rip = 0x7f65f00d3d86 Found by: stack scanning 7 libstdc++.so.6.0.20 + 0x5ddd1 rsp = 0x7f5dba815f50 rip = 0x7f65f00d3dd1 Found by: stack scanning 8 libstdc++.so.6.0.20 + 0x5dfe8 rsp = 0x7f5dba815f60 rip = 0x7f65f00d3fe8 Found by: stack scanning 9 session-expiry-test!void boost::throw_exception(boost::lock_error const&) [throw_exception.hpp : 69 + 0x22] rsp = 0x7f5dba815f80 rip = 0x0176ef8a Found 
by: stack scanning 10 session-expiry-test!_fini + 0x8470 rsp = 0x7f5dba815f90 rip = 0x04ce7670 Found by: stack scanning 11 session-expiry-test + 0x136ee70 rsp = 0x7f5dba815f98 rip = 0x0176ee70 Found by: stack scanning 12 session-expiry-test!_fini + 0x3d2b60 rsp = 0x7f5dba816018 rip = 0x050b1d60 Found by: stack scanning 13 session-expiry-test!__asan_handle_no_return [asan_rtl.cc : 670 + 0xa] rsp = 0x7f5dba816060 rip = 0x0171fd38 Found by: stack scanning 14 session-expiry-test!boost::mutex::lock() [mutex.hpp : 119 + 0xd] rbx = 0x7f5dba8160a0 rbp = 0x7f5dba8160c0 rsp = 0x7f5dba8160a0 r12 = 0x7f5dba816190 rip = 0x0176fb73 Found by: call frame info 15 0x60700019a868 rbx = 0x0176fb73 rbp = 0x0759e368 rsp = 0x7f5dba8160d0 r12 = 0x41b58ab3 r13 = 0x04ce76dd r14 = 0x0176fa90 r15 = 0x01d11846 rip = 0x60700019a868 Found by: call frame info 16 session-expiry-test!_fini + 0x57c5b rsp = 0x7f5dba8160f0 rip = 0x04d36e5b Found by: stack scanning 17 session-expiry-test + 0x179c460 rsp = 0x7f5dba8160f8 rip = 0x01b9c460 Found by: stack scanning 18 session-expiry-test!boost::unique_lock::~unique_lock() [lock_types.hpp : 329 + 0x5] rsp = 0x7f5dba816160 rip = 0x0176dece Found by: stack scanning 19 session-expiry-test!impala::StatsMetric::Update(double const&) [collection-metrics.h : 150 + 0x8] rsp = 0x7f5dba8161a0 rip = 0x01d6d1a2 Found by: stack scanning 20 session-expiry-test!_fini + 0x78173 rsp = 0x7f5dba8161b0 rip = 0x04d57373 Found by: stack scanning 21 session-expiry-test + 0x196d100 rsp = 0x7f5dba8161b8 rip = 0x01d6d100 Found by: stack scanning 22 session-expiry-test!std::vector >::end() [stl_vector.h : 566 + 0xb] rsp = 0x7f5dba8161c0 rip = 0x01d53c5e Found by: stack scanning 23 session-expiry-test + 0x1953bc0 rsp = 0x7f5dba8161d8 rip = 0x01d53bc0 Found by: stack scanning 24 session-expiry-test!impala::Statestore::SendTopicUpdate(impala::Statestore::Subscriber*, impala::Statestore::UpdateKind, bool*) [statestore.cc : 753 + 0x9] {noformat} Note however: {noformat} 20:02:50 Start 49: 
session-expiry-test 20:0
[jira] [Resolved] (IMPALA-7796) TestAutomaticCatalogInvalidation custom cluster suite failing for both local and V1
[ https://issues.apache.org/jira/browse/IMPALA-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-7796. - Resolution: Duplicate > TestAutomaticCatalogInvalidation custom cluster suite failing for both local > and V1 > --- > > Key: IMPALA-7796 > URL: https://issues.apache.org/jira/browse/IMPALA-7796 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.1.0 >Reporter: David Knupp >Assignee: Vuk Ercegovac >Priority: Critical > > Both variants exceed max wait time. > v1 catalog: > {noformat} > custom_cluster/test_automatic_invalidation.py:69: in test_v1_catalog > self._run_test(cursor) > custom_cluster/test_automatic_invalidation.py:64: in _run_test > assert time.time() < max_wait_time > E assert 1541000646.910642 < 1541000646.673253 > E+ where 1541000646.910642 = () > E+where = time.time > {noformat} > Local catalog > {noformat} > custom_cluster/test_automatic_invalidation.py:76: in test_local_catalog > self._run_test(cursor) > custom_cluster/test_automatic_invalidation.py:64: in _run_test > assert time.time() < max_wait_time > E assert 1541000679.388713 < 1541000679.148656 > E+ where 1541000679.388713 = () > E+where = time.time > {noformat} > Additionally, the v1 catalog test seemed to experience some connectivity > issues: > {noformat} > -- 2018-10-31 08:44:18,118 INFO MainThread: num_known_live_backends has > reached value: 3 > -- connecting to: localhost:21000 > -- connecting to localhost:21050 with impyla > Conn > -- 2018-10-31 08:44:18,214 INFO MainThread: Closing active operation > -- 2018-10-31 08:44:18,215 ERRORMainThread: Failed to open transport > (tries_left=3) > Traceback (most recent call last): > File > "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py", > line 940, in _execute > return func(request) > File > 
"/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py", > line 175, in OpenSession > return self.recv_OpenSession() > File > "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py", > line 186, in recv_OpenSession > (fname, mtype, rseqid) = self._iprot.readMessageBegin() > File > "/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", > line 126, in readMessageBegin > sz = self.readI32() > File > "/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", > line 206, in readI32 > buff = self.trans.readAll(4) > File > "/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py", > line 58, in readAll > chunk = self.read(sz - have) > File > "/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py", > line 159, in read > self.__rbuf = StringIO(self.__trans.read(max(sz, self.__rbuf_size))) > File > "/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSocket.py", > line 120, in read > message='TSocket read 0 bytes') > TTransportException: TSocket read 0 bytes > -- 2018-10-31 08:44:19,133 INFO MainThread: Starting new HTTP connection > (1): impala-ec2-centos74-m5-4xlarge-ondemand-02bb.vpc.cloudera.com > -- 2018-10-31 08:44:20,150 INFO MainThread: Starting new HTTP connection > (1): impala-ec2-centos74-m5-4xlarge-ondemand-02bb.vpc.cloudera.com > {noformat} -- This message was sent by 
Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-7796) TestAutomaticCatalogInvalidation custom cluster suite failing for both local and V1
David Knupp created IMPALA-7796: --- Summary: TestAutomaticCatalogInvalidation custom cluster suite failing for both local and V1 Key: IMPALA-7796 URL: https://issues.apache.org/jira/browse/IMPALA-7796 Project: IMPALA Issue Type: Bug Components: Frontend Affects Versions: Impala 3.1.0 Reporter: David Knupp Assignee: Vuk Ercegovac Both variants exceed max wait time. v1 catalog: {noformat} custom_cluster/test_automatic_invalidation.py:69: in test_v1_catalog self._run_test(cursor) custom_cluster/test_automatic_invalidation.py:64: in _run_test assert time.time() < max_wait_time E assert 1541000646.910642 < 1541000646.673253 E+ where 1541000646.910642 = () E+where = time.time {noformat} Local catalog {noformat} custom_cluster/test_automatic_invalidation.py:76: in test_local_catalog self._run_test(cursor) custom_cluster/test_automatic_invalidation.py:64: in _run_test assert time.time() < max_wait_time E assert 1541000679.388713 < 1541000679.148656 E+ where 1541000679.388713 = () E+where = time.time {noformat} Additionally, the v1 catalog test seemed to experience some connectivity issues: {noformat} -- 2018-10-31 08:44:18,118 INFO MainThread: num_known_live_backends has reached value: 3 -- connecting to: localhost:21000 -- connecting to localhost:21050 with impyla Conn -- 2018-10-31 08:44:18,214 INFO MainThread: Closing active operation -- 2018-10-31 08:44:18,215 ERRORMainThread: Failed to open transport (tries_left=3) Traceback (most recent call last): File "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py", line 940, in _execute return func(request) File "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py", line 175, in OpenSession return self.recv_OpenSession() File 
"/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py", line 186, in recv_OpenSession (fname, mtype, rseqid) = self._iprot.readMessageBegin() File "/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 126, in readMessageBegin sz = self.readI32() File "/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 206, in readI32 buff = self.trans.readAll(4) File "/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py", line 58, in readAll chunk = self.read(sz - have) File "/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py", line 159, in read self.__rbuf = StringIO(self.__trans.read(max(sz, self.__rbuf_size))) File "/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSocket.py", line 120, in read message='TSocket read 0 bytes') TTransportException: TSocket read 0 bytes -- 2018-10-31 08:44:19,133 INFO MainThread: Starting new HTTP connection (1): impala-ec2-centos74-m5-4xlarge-ondemand-02bb.vpc.cloudera.com -- 2018-10-31 08:44:20,150 INFO MainThread: Starting new HTTP connection (1): impala-ec2-centos74-m5-4xlarge-ondemand-02bb.vpc.cloudera.com {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
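The failing assertion compares `time.time()` against a precomputed `max_wait_time`, so any scheduling hiccup that pushes the final check a fraction of a second past the deadline fails the test, as in the two quarter-second misses above. A hedged sketch of the more robust polling shape (illustrative helper, not the test's actual code):

```python
import time

def wait_until(predicate, timeout_s, interval_s=0.05):
    """Poll until predicate() holds or the deadline passes. Returning
    success/failure from the poll itself, rather than asserting the
    wall clock against a precomputed deadline, avoids failing when
    the condition became true just as the deadline was crossed."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        if predicate():
            return True
        time.sleep(interval_s)
    # One final check at/after the deadline closes the race window.
    return predicate()
```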
[jira] [Resolved] (IMPALA-7783) test_default_timezone failing on real cluster
[ https://issues.apache.org/jira/browse/IMPALA-7783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-7783. - Resolution: Fixed > test_default_timezone failing on real cluster > - > > Key: IMPALA-7783 > URL: https://issues.apache.org/jira/browse/IMPALA-7783 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.1.0 >Reporter: David Knupp >Priority: Major > > shell/test_shell_commandline.py/test_default_timezone is failing due to > issues in asserting zoneinfo/tzname > {noformat} > shell/test_shell_commandline.py:715: in test_default_timezone > assert os.path.isfile("/usr/share/zoneinfo/" + tzname) > E assert (('/usr/share/zoneinfo/' + > 'SystemV/PST8PDT')) > E+ where = '/data0/jenkins/workspace/Quasar-Executor/testing/inf...Impala-cdh-cluster-test-runner/infra/python/env/lib64/python2.7/posixpath.pyc'>.isfile > E+where '/data0/jenkins/workspace/Quasar-Executor/testing/inf...Impala-cdh-cluster-test-runner/infra/python/env/lib64/python2.7/posixpath.pyc'> > = os.path {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-7783) test_default_timezone failing on real cluster
David Knupp created IMPALA-7783: --- Summary: test_default_timezone failing on real cluster Key: IMPALA-7783 URL: https://issues.apache.org/jira/browse/IMPALA-7783 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 3.1.0 Reporter: David Knupp shell/test_shell_commandline.py/test_default_timezone is failing due to issues in asserting zoneinfo/tzname {noformat} shell/test_shell_commandline.py:715: in test_default_timezone assert os.path.isfile("/usr/share/zoneinfo/" + tzname) E assert (('/usr/share/zoneinfo/' + 'SystemV/PST8PDT')) E+ where = .isfile E+where = os.path {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IMPALA-7758) chars_formats dependent tables are created using the wrong LOCATION
[ https://issues.apache.org/jira/browse/IMPALA-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-7758. - Resolution: Fixed Fix Version/s: Impala 3.1.0 > chars_formats dependent tables are created using the wrong LOCATION > --- > > Key: IMPALA-7758 > URL: https://issues.apache.org/jira/browse/IMPALA-7758 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.1.0 >Reporter: David Knupp >Assignee: David Knupp >Priority: Major > Fix For: Impala 3.1.0 > > > In testdata/bin/load-dependent-tables.sql, the LOCATION clause when creating > the various chars_formats tables (e.g. text) use: > {noformat} > LOCATION '${hiveconf:hive.metastore.warehouse.dir}/chars_formats_text' > {noformat} > ...which resolves to {{/user/hive/warehouse/chars_formats_text}} > However, the actual test warehouse root dir is {{/test-warehouse}}, not > {{/user/hive/warehouse}}. > {noformat} > $ hdfs dfs -cat /test-warehouse/chars_formats_text/chars-formats.txt > abcde,88db79c70974e02deb3f01cfdcc5daae2078f21517d1021994f12685c0144addae3ce0dbd6a540b55b88af68486251fa6f0c8f9f94b3b1b4bc64c69714e281f388db79c70974,variable > length > abc > ,8d3fffddf79e9a232ffd19f9ccaa4d6b37a6a243dbe0f23137b108a043d9da13121a9b505c804956b22e93c7f93969f4a7ba8ddea45bf4aab0bebc8f814e09918d3fffddf79e,abc > abcdef,68f8c4575da360c32abb46689e58193a0eeaa905ae6f4a5e6c702a6ae1db35a6f86f8222b7a5489d96eb0466c755b677a64160d074617096a8c6279038bc720468f8c4575da3,b2fe9d4638503a57f93396098f24103a20588631727d0f0b5016715a3f6f2616628f09b1f63b23e484396edf949d9a1c307dbe11f23b971afd75b0f639d8a3f1 > {noformat} > versus... > {noformat} > $ hdfs dfs -cat /user/hive/warehouse/chars_formats_text/chars-formats.txt > cat: `/user/hive/warehouse/chars_formats_text/chars-formats.txt': No such > file or directory > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IMPALA-1780) Catch exceptions thrown by UDFs
[ https://issues.apache.org/jira/browse/IMPALA-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-1780. - Resolution: Won't Fix > Catch exceptions thrown by UDFs > --- > > Key: IMPALA-1780 > URL: https://issues.apache.org/jira/browse/IMPALA-1780 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.1.1, Impala 2.3.0 >Reporter: Henry Robinson >Priority: Major > Labels: crash, downgraded > > Catch exceptions thrown by UDFs so Impala doesn't crash. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IMPALA-3959) data loading jenkins jobs don't save test logs
[ https://issues.apache.org/jira/browse/IMPALA-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-3959. - Resolution: Invalid > data loading jenkins jobs don't save test logs > -- > > Key: IMPALA-3959 > URL: https://issues.apache.org/jira/browse/IMPALA-3959 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 2.7.0 >Reporter: Michael Brown >Priority: Critical > > Even though data loading jobs run BE, FE, and core EE tests, the logs for > these tests are not saved. (The only artifacts saved are those used by > snapshot consumers: the snapshot, metastore snapshot, and git hash). This is > a problem when there are flaky tests that fail there that we haven't seen > fail elsewhere: we have no forensic evidence to search through for clues. > Example: > http://sandbox.jenkins.cloudera.com/job/impala-asf-master-core-data-load/29/ > {noformat} > 22:55:09 99% tests passed, 1 tests failed out of 78 > 22:55:09 > 22:55:09 The following tests FAILED: > 22:55:09 13 - kudu-scan-node-test (OTHER_FAULT) > 22:55:09 Errors while running CTest > 22:55:09 make: *** [test] Error 8 > {noformat} > This kudu scan node test failed, but we have no other info on it, because we > have no artifacts. > Part of the problem is that the data load job has a separate entry point, so > everything built up in {{Impala-aux/jenkins/build.sh}} to handle archiving > doesn't exist for {{Impala-aux/jenkins/build-data-load.sh}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (IMPALA-6814) query_test.test_queries.TestQueriesTextTables.test_strict_mode failing on remote clusters
[ https://issues.apache.org/jira/browse/IMPALA-6814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp closed IMPALA-6814. --- Resolution: Cannot Reproduce Apparently not. The actual line was apparently {{row_regex: .*Error parsing row: file: $NAMENODE/.* before offset: \d+}}, and $NAMENODE was resolving to "localhost." Some other change somewhere must have fixed that, because it's now resolving properly. > query_test.test_queries.TestQueriesTextTables.test_strict_mode failing on > remote clusters > - > > Key: IMPALA-6814 > URL: https://issues.apache.org/jira/browse/IMPALA-6814 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 2.12.0 >Reporter: David Knupp >Assignee: David Knupp >Priority: Critical > > It looks like {{localhost}} is hardcoded in the test verification. > > *Stacktrace* > query_test/test_queries.py:161: in test_strict_mode > self.run_test_case('QueryTest/strict-mode', vector) > common/impala_test_suite.py:427: in run_test_case > self.__verify_results_and_errors(vector, test_section, result, use_db) > common/impala_test_suite.py:300: in __verify_results_and_errors > replace_filenames_with_placeholder) > common/test_result_verifier.py:317: in verify_raw_results > verify_errors(expected_errors, actual_errors) > common/test_result_verifier.py:274: in verify_errors > VERIFIER_MAP['VERIFY_IS_EQUAL'](expected, actual) > common/test_result_verifier.py:231: in verify_query_result_is_equal > assert expected_results == actual_results > E assert Comparing QueryTestResults (expected vs actual): > [...] 
> E row_regex: .*Error parsing row: file: > hdfs://{color:#ff}*localhost*{color}:20500/.* before offset: \d+ != > 'Error parsing row: file: > hdfs://**:8020/test-warehouse/overflow/overflow.txt, before offset: > 454' > E row_regex: .*Error parsing row: file: > hdfs://{color:#ff}*localhost*{color}:20500/.* before offset: \d+ != > 'Error parsing row: file: > hdfs://**:8020/test-warehouse/overflow/overflow.txt, before offset: > 454' -- This message was sent by Atlassian JIRA (v7.6.3#76005)
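Expected lines tagged `row_regex` are matched as regular expressions after substituting placeholders such as `$NAMENODE`; the failure above was that the placeholder resolved to `localhost` while the cluster reported its real hostname. A hedged sketch of that substitution step (illustrative, not the actual common/test_result_verifier.py code):

```python
import re

def verify_row_regex(pattern, actual_line, namenode):
    """Resolve $NAMENODE to the cluster's actual filesystem prefix
    before compiling the expected pattern. If namenode resolves to
    the wrong host (e.g. localhost), the match fails as in the
    diff above."""
    resolved = pattern.replace("$NAMENODE", re.escape(namenode))
    return re.match(resolved, actual_line) is not None
```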
[jira] [Created] (IMPALA-7758) chars_formats dependent tables are created using the wrong LOCATION
David Knupp created IMPALA-7758: --- Summary: chars_formats dependent tables are created using the wrong LOCATION Key: IMPALA-7758 URL: https://issues.apache.org/jira/browse/IMPALA-7758 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 3.1.0 Reporter: David Knupp In testdata/bin/load-dependent-tables.sql, the LOCATION clause when creating the various chars_formats tables (e.g. text) use: {noformat} LOCATION '${hiveconf:hive.metastore.warehouse.dir}/chars_formats_text' {noformat} ...which resolves to {{/user/hive/warehouse/chars_formats_text}} However, the actual test warehouse root dir is {{/test-warehouse}}, not {{/user/hive/warehouse}}. {noformat} $ hdfs dfs -cat /test-warehouse/chars_formats_text/chars-formats.txt abcde,88db79c70974e02deb3f01cfdcc5daae2078f21517d1021994f12685c0144addae3ce0dbd6a540b55b88af68486251fa6f0c8f9f94b3b1b4bc64c69714e281f388db79c70974,variable length abc ,8d3fffddf79e9a232ffd19f9ccaa4d6b37a6a243dbe0f23137b108a043d9da13121a9b505c804956b22e93c7f93969f4a7ba8ddea45bf4aab0bebc8f814e09918d3fffddf79e,abc abcdef,68f8c4575da360c32abb46689e58193a0eeaa905ae6f4a5e6c702a6ae1db35a6f86f8222b7a5489d96eb0466c755b677a64160d074617096a8c6279038bc720468f8c4575da3,b2fe9d4638503a57f93396098f24103a20588631727d0f0b5016715a3f6f2616628f09b1f63b23e484396edf949d9a1c307dbe11f23b971afd75b0f639d8a3f1 {noformat} versus... {noformat} $ hdfs dfs -cat /user/hive/warehouse/chars_formats_text/chars-formats.txt cat: `/user/hive/warehouse/chars_formats_text/chars-formats.txt': No such file or directory {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
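The root cause is Hive's `${hiveconf:...}` variable substitution: the LOCATION resolves to whatever `hive.metastore.warehouse.dir` is set to when the script runs, which defaults to `/user/hive/warehouse` rather than the test warehouse. A small sketch of how that expansion behaves (illustrative helper, not Hive's implementation):

```python
import re

def resolve_hiveconf(sql, conf):
    """Expand ${hiveconf:key} references in a SQL fragment the way
    Hive does when executing a script, looking each key up in the
    supplied configuration dict."""
    return re.sub(r"\$\{hiveconf:([^}]+)\}",
                  lambda m: conf[m.group(1)], sql)
```

Running the load script with the warehouse dir overridden to `/test-warehouse` would make the LOCATION land in the right place.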
[jira] [Closed] (IMPALA-7584) test_set fails when run against external cluster
[ https://issues.apache.org/jira/browse/IMPALA-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp closed IMPALA-7584. --- Resolution: Fixed Change submitted. https://gerrit.cloudera.org/c/11476/ > test_set fails when run against external cluster > > > Key: IMPALA-7584 > URL: https://issues.apache.org/jira/browse/IMPALA-7584 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: David Knupp >Priority: Major > > Similar to IMPALA-6810, test_set fails: > {noformat} > E AssertionError: Unexpected exception string. Expected: Rejected query > from pool default-pool: minimum memory reservation > E Not found in actual: ImpalaBeeswaxException: Query aborted:Rejected query > from pool root.jenkins: minimum memory reservation is greater than memory > available to the query for buffer reservations. > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IMPALA-7399) Add a script for generating junit XML type output for arbitrary build steps
[ https://issues.apache.org/jira/browse/IMPALA-7399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-7399. - Resolution: Fixed > Add a script for generating junit XML type output for arbitrary build steps > --- > > Key: IMPALA-7399 > URL: https://issues.apache.org/jira/browse/IMPALA-7399 > Project: IMPALA > Issue Type: New Feature > Components: Infrastructure >Affects Versions: Not Applicable >Reporter: David Knupp >Priority: Major > > Junit XML has become a de facto standard for outputting automated test > results. Jenkins consumes junit XML output to generate final test reports. > This makes triaging failed builds much easier. > While Impala's test frameworks already produce junit XML, it would be nice to > produce junit XML for earlier build steps that aren't formally tests, > e.g., compilation and data loading. This will make it easier to diagnose > these failures on jobs that run on jenkins.impala.io (as opposed to requiring > users to read through raw console output). > This script is also being used as a starting point for setting up a formal > internal python library for Impala development that can be installed via > {{pip install -e }}. We expect other packages to > be added to this library over time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-7399) Add a script for generating junit XML type output for arbitrary build steps
David Knupp created IMPALA-7399: --- Summary: Add a script for generating junit XML type output for arbitrary build steps Key: IMPALA-7399 URL: https://issues.apache.org/jira/browse/IMPALA-7399 Project: IMPALA Issue Type: New Feature Components: Infrastructure Affects Versions: Not Applicable Reporter: David Knupp Junit XML has become a de facto standard for outputting automated test results. Jenkins consumes junit XML output to generate final test reports. This makes triaging failed builds much easier. While Impala's test frameworks already produce junit XML, it would be nice to produce junit XML for earlier build steps that aren't formally tests, e.g., compilation and data loading. This will make it easier to diagnose these failures on jobs that run on jenkins.impala.io (as opposed to requiring users to read through raw console output). This script is also being used as a starting point for setting up a formal internal python library for Impala development that can be installed via {{pip install -e }}. We expect other packages to be added to this library over time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
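The shape of such a wrapper is straightforward: run (or record) a build step, then emit a `testsuite`/`testcase` tree that Jenkins can ingest. A minimal sketch under assumed names (this is not the actual Impala script's API):

```python
import xml.etree.ElementTree as ET

def make_junit_xml(suite_name, step_name, elapsed_s, error_msg=None):
    """Emit a one-testcase junit XML document for an arbitrary build
    step, marking it as an error if error_msg is given."""
    suite = ET.Element("testsuite", name=suite_name, tests="1",
                       failures="0", errors="1" if error_msg else "0")
    case = ET.SubElement(suite, "testcase", name=step_name,
                         time="%.3f" % elapsed_s)
    if error_msg:
        # Jenkins renders the <error> element's message in the report.
        ET.SubElement(case, "error", message=error_msg)
    return ET.tostring(suite, encoding="unicode")
```

A build wrapper would write this string to something like `$WORKSPACE/junit/<step>.xml` for the Jenkins junit publisher to pick up.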
[jira] [Created] (IMPALA-7379) test_random_rpc_timeout failed on exhaustive build: Debug webpage did not become available in expected time
David Knupp created IMPALA-7379: --- Summary: test_random_rpc_timeout failed on exhaustive build: Debug webpage did not become available in expected time Key: IMPALA-7379 URL: https://issues.apache.org/jira/browse/IMPALA-7379 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.1.0 Reporter: David Knupp Assignee: Michael Ho *Stacktrace* {noformat} custom_cluster/test_rpc_timeout.py:131: in test_random_rpc_timeout self.execute_query_verify_metrics(self.TEST_QUERY, 10) custom_cluster/test_rpc_timeout.py:51: in execute_query_verify_metrics v.wait_for_metric("impala-server.num-fragments-in-flight", 0) verifiers/metric_verifier.py:62: in wait_for_metric self.impalad_service.wait_for_metric_value(metric_name, expected_value, timeout) common/impala_service.py:132: in wait_for_metric_value json.dumps(self.read_debug_webpage('memz?json')), common/impala_service.py:63: in read_debug_webpage return self.open_debug_webpage(page_name, timeout=timeout, interval=interval).read() common/impala_service.py:60: in open_debug_webpage assert 0, 'Debug webpage did not become available in expected time.' E AssertionError: Debug webpage did not become available in expected time. 
{noformat} *Standard Error* {noformat} 10:27:58 MainThread: Starting State Store logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/statestored.INFO 10:27:58 MainThread: Starting Catalog Service logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/catalogd.INFO 10:27:59 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad.INFO 10:28:00 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO 10:28:01 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO 10:28:04 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 10:28:04 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0458.vpc.cloudera.com:25000 10:28:04 MainThread: Waiting for num_known_live_backends=3. Current value: 1 10:28:05 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0458.vpc.cloudera.com:25000 10:28:05 MainThread: Waiting for num_known_live_backends=3. Current value: 2 10:28:06 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0458.vpc.cloudera.com:25000 10:28:06 MainThread: num_known_live_backends has reached value: 3 10:28:06 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0458.vpc.cloudera.com:25001 10:28:06 MainThread: num_known_live_backends has reached value: 3 10:28:06 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0458.vpc.cloudera.com:25002 10:28:06 MainThread: num_known_live_backends has reached value: 3 10:28:06 MainThread: Impala Cluster Running with 3 nodes (3 coordinators, 3 executors). 
MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) MainThread: Getting metric: statestore.live-backends from impala-ec2-centos74-m5-4xlarge-ondemand-0458.vpc.cloudera.com:25010 MainThread: Metric 'statestore.live-backends' has reached desired value: 4 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0458.vpc.cloudera.com:25000 MainThread: num_known_live_backends has reached value: 3 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0458.vpc.cloudera.com:25001 MainThread: num_known_live_backends has reached value: 3 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0458.vpc.cloudera.com:25002 MainThread: num_known_live_backends has reached value: 3 -- connecting to: localhost:21000 -- executing against localhost:21000 select count(c2.string_col) from functional.alltypestiny join functional.alltypessmall c2; -- executing against localhost:21000 select count(c2.string_col) from functional.alltypestiny join functional.alltypessmall c2; -- executing against localhost:21000 select count(c2.string_col) from functional.alltypestiny join functional.alltypessmall c2; -- executing against localhost:21000 select count(c2.string_col) from functional.alltypestiny join functional.alltypessmall c2; -- executing against localhost:21000 select count(c2.string_col) from functional.alltypestiny join functional.alltypessmall c2; -- executing against localhost:21000 select count(c2.string_col) from functional.alltypestiny join functiona
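The wait_for_metric_value call that timed out above follows a standard poll-until-deadline pattern. A rough Python sketch of that pattern (the function name and signature are illustrative, not the actual impala_service.py API):

```python
import time

def wait_for_value(read_fn, expected, timeout=10, interval=0.1):
    """Poll read_fn() until it returns expected or the timeout expires."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        value = read_fn()
        if value == expected:
            return value
        time.sleep(interval)
    # This is the branch that produced the assertion in the stacktrace.
    raise AssertionError(
        "value did not reach %r within %s seconds" % (expected, timeout))

# Usage with a canned reading sequence standing in for the metrics page:
readings = iter([3, 1, 0])
result = wait_for_value(lambda: next(readings), 0, timeout=5, interval=0)
```

In the failing test, the reads themselves failed because the debug webpage never came up, so the harness asserted before the metric could be compared at all.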
[jira] [Created] (IMPALA-7378) test_strict_mode failed on an ASAN build: expected "Error converting column: 5 to DOUBLE"
David Knupp created IMPALA-7378: --- Summary: test_strict_mode failed on an ASAN build: expected "Error converting column: 5 to DOUBLE" Key: IMPALA-7378 URL: https://issues.apache.org/jira/browse/IMPALA-7378 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.1.0 Reporter: David Knupp Assignee: Tim Armstrong *Stacktrace* {noformat} query_test/test_queries.py:159: in test_strict_mode self.run_test_case('QueryTest/strict-mode-abort', vector) common/impala_test_suite.py:420: in run_test_case assert False, "Expected exception: %s" % expected_str E AssertionError: Expected exception: Error converting column: 5 to DOUBLE {noformat} *Standard Error* {noformat} -- executing against localhost:21000 use functional; SET strict_mode=1; SET batch_size=0; SET num_nodes=0; SET disable_codegen_rows_threshold=0; SET disable_codegen=False; SET abort_on_error=0; SET exec_single_node_rows_threshold=0; -- executing against localhost:21000 select * from overflow; -- executing against localhost:21000 use functional; SET strict_mode=1; SET batch_size=0; SET num_nodes=0; SET disable_codegen_rows_threshold=0; SET disable_codegen=False; SET abort_on_error=1; SET exec_single_node_rows_threshold=0; -- executing against localhost:21000 select tinyint_col from overflow; -- executing against localhost:21000 select smallint_col from overflow; -- executing against localhost:21000 select int_col from overflow; -- executing against localhost:21000 select bigint_col from overflow; -- executing against localhost:21000 select float_col from overflow; -- executing against localhost:21000 select double_col from overflow; {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-7371) TestInsertQueries.test_insert fails on S3 with 0 rows returned
David Knupp created IMPALA-7371: --- Summary: TestInsertQueries.test_insert fails on S3 with 0 rows returned Key: IMPALA-7371 URL: https://issues.apache.org/jira/browse/IMPALA-7371 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.1.0 Reporter: David Knupp Stacktrace {noformat} query_test/test_insert.py:118: in test_insert multiple_impalad=vector.get_value('exec_option')['sync_ddl'] == 1) /data/jenkins/workspace/impala-cdh6.0.x-core-s3/repos/Impala/tests/common/impala_test_suite.py:426: in run_test_case self.__verify_results_and_errors(vector, test_section, result, use_db) /data/jenkins/workspace/impala-cdh6.0.x-core-s3/repos/Impala/tests/common/impala_test_suite.py:299: in __verify_results_and_errors replace_filenames_with_placeholder) /data/jenkins/workspace/impala-cdh6.0.x-core-s3/repos/Impala/tests/common/test_result_verifier.py:434: in verify_raw_results VERIFIER_MAP[verifier](expected, actual) /data/jenkins/workspace/impala-cdh6.0.x-core-s3/repos/Impala/tests/common/test_result_verifier.py:261: in verify_query_result_is_equal assert expected_results == actual_results E assert Comparing QueryTestResults (expected vs actual): E 75,false,0,0,0,0,0,0,'04/01/09','0' != None E 76,true,1,1,1,10,1.10023841858,10.1,'04/01/09','1' != None E 77,false,2,2,2,20,2.20047683716,20.2,'04/01/09','2' != None E 78,true,3,3,3,30,3.29952316284,30.3,'04/01/09','3' != None E 79,false,4,4,4,40,4.40095367432,40.4,'04/01/09','4' != None E 80,true,5,5,5,50,5.5,50.5,'04/01/09','5' != None E 81,false,6,6,6,60,6.59904632568,60.6,'04/01/09','6' != None E 82,true,7,7,7,70,7.69809265137,70.7,'04/01/09','7' != None E 83,false,8,8,8,80,8.80190734863,80.8,'04/01/09','8' != None E 84,true,9,9,9,90,9.89618530273,90.91,'04/01/09','9' != None E 85,false,0,0,0,0,0,0,'04/02/09','0' != None E 86,true,1,1,1,10,1.10023841858,10.1,'04/02/09','1' != None E 87,false,2,2,2,20,2.20047683716,20.2,'04/02/09','2' != None E 88,true,3,3,3,30,3.29952316284,30.3,'04/02/09','3' != None E 
89,false,4,4,4,40,4.40095367432,40.4,'04/02/09','4' != None E 90,true,5,5,5,50,5.5,50.5,'04/02/09','5' != None E 91,false,6,6,6,60,6.59904632568,60.6,'04/02/09','6' != None E 92,true,7,7,7,70,7.69809265137,70.7,'04/02/09','7' != None E 93,false,8,8,8,80,8.80190734863,80.8,'04/02/09','8' != None E 94,true,9,9,9,90,9.89618530273,90.91,'04/02/09','9' != None E 95,false,0,0,0,0,0,0,'04/03/09','0' != None E 96,true,1,1,1,10,1.10023841858,10.1,'04/03/09','1' != None E 97,false,2,2,2,20,2.20047683716,20.2,'04/03/09','2' != None E 98,true,3,3,3,30,3.29952316284,30.3,'04/03/09','3' != None E 99,false,4,4,4,40,4.40095367432,40.4,'04/03/09','4' != None E Number of rows returned (expected vs actual): 25 != 0 {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-7155) Make it possible to designate a large number of arbitrary tests for targeted test runs
David Knupp created IMPALA-7155: --- Summary: Make it possible to designate a large number of arbitrary tests for targeted test runs Key: IMPALA-7155 URL: https://issues.apache.org/jira/browse/IMPALA-7155 Project: IMPALA Issue Type: Improvement Components: Infrastructure Affects Versions: Impala 3.0 Reporter: David Knupp It's already possible to specify an arbitrary list of test modules, test classes, and/or test functions as command line parameters when running the Impala mini-cluster tests. It's also possible to opt out of running specific tests by applying any of a variety of skipif markers. What we don't have is a comprehensive way for tests to be opted in to a targeted test run, other than by naming each one as a command line parameter. This becomes extremely unwieldy beyond a certain number of tests. In fact, we don't have a general concept of targeted test runs at all. The approach to date has been to always run as many tests as possible, except for those tests specifically marked for skipping. This is an OK way to make sure tests don't get overlooked, but it also results in many tests frequently being run in contexts in which they don't necessarily apply, e.g. against S3, or against actual deployed clusters, which can lead to false negatives. There are different ways that we could group together a disparate array of tests into a targeted run. We could come up with a permanent series of new pytest markers/decorators for opting in, as opposed to opting out, of a given test run. An initial pass would then need to be made to apply the new decorators as needed to all of the existing tests. One could then invoke something like "impala-pytest -m cluster_tests" as needed. Another approach might be to define test runs in special files (probably yaml). The file would include a list of which tests to run, possibly along with other test parameters, e.g. "run this list of tests, but only on parquet, and skip tests that require LZO compression." 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
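The opt-in idea described above reduces to a simple selection step. A minimal Python sketch, with hypothetical test and run names, of picking only the tests opted in to a named targeted run (real tests would register via pytest markers rather than a dict):

```python
def select_tests(all_tests, run_name):
    """Pick the tests that opted in to a named targeted run.

    all_tests: mapping of test name -> set of opt-in run names
    run_name:  the targeted run being executed, e.g. 'cluster_tests'
    """
    return sorted(name for name, runs in all_tests.items()
                  if run_name in runs)

# Hypothetical registry of opt-in declarations:
tests = {
    "test_insert":   {"cluster_tests", "s3"},
    "test_spilling": {"local_only"},
    "test_kudu":     {"cluster_tests"},
}
picked = select_tests(tests, "cluster_tests")
```

With pytest markers, the same selection comes for free from `pytest -m cluster_tests`; the yaml-file approach would instead load the opt-in sets from a run-definition file.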
[jira] [Created] (IMPALA-7088) Parallel data load breaks load-data.py if loading data on a real cluster
David Knupp created IMPALA-7088: --- Summary: Parallel data load breaks load-data.py if loading data on a real cluster Key: IMPALA-7088 URL: https://issues.apache.org/jira/browse/IMPALA-7088 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 3.0 Reporter: David Knupp Impala/bin/load-data.py is most commonly used to load test data onto a simulated standalone cluster running on the local host. However, with the correct inputs, it can also be used to load data onto an actual remote cluster. A recent enhancement in the load-data.py script to parallelize parts of the data loading process -- https://github.com/apache/impala/commit/d481cd48 -- has introduced a regression in the latter use case: From *$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log*: {noformat} Created table functional_hbase.widetable_1000_cols Took 0.7121 seconds 09:48:01 Beginning execution of hive SQL: /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql Traceback (most recent call last): File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 494, in if __name__ == "__main__": main() File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 468, in main hive_exec_query_files_parallel(thread_pool, hive_load_text_files) File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 299, in hive_exec_query_files_parallel exec_query_files_parallel(thread_pool, query_files, 'hive') File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 290, in exec_query_files_parallel for result in thread_pool.imap_unordered(execution_function, query_files): File 
"/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next raise value TypeError: coercing to Unicode: need string or buffer, NoneType found {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IMPALA-6317) Expose -cmake_only flag to buildall.sh
[ https://issues.apache.org/jira/browse/IMPALA-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-6317. - Resolution: Fixed > Expose -cmake_only flag to buildall.sh > -- > > Key: IMPALA-6317 > URL: https://issues.apache.org/jira/browse/IMPALA-6317 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Affects Versions: Impala 2.11.0 >Reporter: David Knupp >Assignee: David Knupp >Priority: Minor > > Impala/bin/make_impala.sh has a {{-cmake_only}} command line option: > {noformat} > -cmake_only) > CMAKE_ONLY=1 > {noformat} > Passing this flag means that makefiles only will be generated during the > build. However, this flag is not provided in buildall.sh (the caller of > make_impala.sh) which effectively renders it useless. > It turns out that if one has no intention of running the Impala cluster > locally (e.g., as when trying to build just enough of the toolchain and dev > environment to run the data load scripts for loading data onto a remote > cluster) then being able to only generate makefiles is a useful thing. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IMPALA-6600) py.test error "Replacing crashed slave gw1" in test_spilling
[ https://issues.apache.org/jira/browse/IMPALA-6600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-6600. - Resolution: Duplicate > py.test error "Replacing crashed slave gw1" in test_spilling > > > Key: IMPALA-6600 > URL: https://issues.apache.org/jira/browse/IMPALA-6600 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.0, Impala 2.12.0 >Reporter: Lars Volker >Assignee: David Knupp >Priority: Major > Labels: broken-build, flaky > > I saw a build fail with "Replacing crashed slave gw1". Here's the failing > part of the log: > {noformat} > 12:18:33 [gw0] PASSED > query_test/test_tablesample.py::TestTableSample::test_tablesample[repeatable: > False | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': > 0} | table_format: parquet/none] > 12:18:33 [gw1] node down: Not properly terminated > 12:18:33 [gw1] FAILED > query_test/test_spilling.py::TestSpillingDebugActionDimensions::test_spilling[exec_option: > {'debug_action': None, 'default_spillable_buffer_size': '256k'} | > table_format: parquet/none] > 12:18:33 Replacing crashed slave gw1 > 12:18:34 > 12:18:34 unittests/test_file_parser.py::TestTestFileParser::test_valid_parse > {noformat} > Here is the summary: > {noformat} > 12:44:14 === FAILURES > === > 12:44:14 _ query_test/test_spilling.py > __ > 12:44:14 [gw1] linux2 -- Python 2.6.6 > /data/jenkins/workspace/impala-asf-master-core-local/repos/Impala/bin/../infra/python/env/bin/python > 12:44:14 Slave 'gw1' crashed while running > "query_test/test_spilling.py::TestSpillingDebugActionDimensions::()::test_spilling[exec_option: > {'debug_action': None, 'default_spillable_buffer_size': '256k'} | > table_format: parquet/none]" > 12:44:14 == 1 failed, 1494 passed, 404 skipped, 9 xfailed in 10374.68 > seconds === > {noformat} > [~dknupp] - I’m 
assigning this to you thinking you might have an idea what’s > going on here; feel free to find another person or assign back to me if > you're swamped. > I’ve seen this happen in a private Jenkins run. Please ping me if you would > like access to the build artifacts. > I've also seen a similar error message in IMPALA-5724 and in [this GRPC issue > on Github|https://github.com/grpc/grpc/issues/3577]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-7055) test_avro_writer failing on upstream Jenkins:
David Knupp created IMPALA-7055: --- Summary: test_avro_writer failing on upstream Jenkins: Key: IMPALA-7055 URL: https://issues.apache.org/jira/browse/IMPALA-7055 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.0 Reporter: David Knupp This failure occurred while verifying https://gerrit.cloudera.org/c/10455/, but it is not related to that patch. The failing build is https://jenkins.impala.io/job/gerrit-verify-dryrun/2511/. Test appears to be (from [avro-writer.test|https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/avro-writer.test]): {noformat} QUERY SET ALLOW_UNSUPPORTED_FORMATS=0; insert into __avro_write select 1, "b", 2.2; CATCH Writing to table format AVRO is not supported. Use query option ALLOW_UNSUPPORTED_FORMATS {noformat} Error output: {noformat} 01:50:18 ] FAIL query_test/test_compressed_formats.py::TestTableWriters::()::test_avro_writer[exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 0} | table_format: text/none] 01:50:18 ] === FAILURES === 01:50:18 ] TestTableWriters.test_avro_writer[exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 0} | table_format: text/none] 01:50:18 ] [gw9] linux2 -- Python 2.7.12 /home/ubuntu/Impala/bin/../infra/python/env/bin/python 01:50:18 ] query_test/test_compressed_formats.py:189: in test_avro_writer 01:50:18 ] self.run_test_case('QueryTest/avro-writer', vector) 01:50:18 ] common/impala_test_suite.py:420: in run_test_case 01:50:18 ] assert False, "Expected exception: %s" % expected_str 01:50:18 ] E AssertionError: Expected exception: Writing to table format AVRO is not supported. 
Use query option ALLOW_UNSUPPORTED_FORMATS 01:50:18 ] Captured stderr setup - 01:50:18 ] -- connecting to: localhost:21000 01:50:18 ] - Captured stderr call - 01:50:18 ] -- executing against localhost:21000 01:50:18 ] use functional; 01:50:18 ] 01:50:18 ] SET batch_size=0; 01:50:18 ] SET num_nodes=0; 01:50:18 ] SET disable_codegen_rows_threshold=5000; 01:50:18 ] SET disable_codegen=False; 01:50:18 ] SET abort_on_error=1; 01:50:18 ] SET exec_single_node_rows_threshold=0; 01:50:18 ] -- executing against localhost:21000 01:50:18 ] drop table if exists __avro_write; 01:50:18 ] 01:50:18 ] -- executing against localhost:21000 01:50:18 ] SET COMPRESSION_CODEC=NONE; 01:50:18 ] 01:50:18 ] -- executing against localhost:21000 01:50:18 ] 01:50:18 ] create table __avro_write (i int, s string, d double) 01:50:18 ] stored as AVRO 01:50:18 ] TBLPROPERTIES ('avro.schema.literal'='{ 01:50:18 ] "name": "my_record", 01:50:18 ] "type": "record", 01:50:18 ] "fields": [ 01:50:18 ] {"name":"i", "type":["int", "null"]}, 01:50:18 ] {"name":"s", "type":["string", "null"]}, 01:50:18 ] {"name":"d", "type":["double", "null"]}]}'); 01:50:18 ] 01:50:18 ] -- executing against localhost:21000 01:50:18 ] SET COMPRESSION_CODEC=""; 01:50:18 ] 01:50:18 ] -- executing against localhost:21000 01:50:18 ] SET COMPRESSION_CODEC=NONE; 01:50:18 ] 01:50:18 ] -- executing against localhost:21000 01:50:18 ] 01:50:18 ] SET ALLOW_UNSUPPORTED_FORMATS=1; 01:50:18 ] 01:50:18 ] -- executing against localhost:21000 01:50:18 ] 01:50:18 ] insert into __avro_write select 0, "a", 1.1; 01:50:18 ] 01:50:18 ] -- executing against localhost:21000 01:50:18 ] SET COMPRESSION_CODEC=""; 01:50:18 ] 01:50:18 ] -- executing against localhost:21000 01:50:18 ] SET ALLOW_UNSUPPORTED_FORMATS="0"; 01:50:18 ] 01:50:18 ] -- executing against localhost:21000 01:50:18 ] SET COMPRESSION_CODEC=SNAPPY; 01:50:18 ] 01:50:18 ] -- executing against localhost:21000 01:50:18 ] 01:50:18 ] SET ALLOW_UNSUPPORTED_FORMATS=1; 01:50:18 ] 01:50:18 ] -- 
executing against localhost:21000 01:50:18 ] 01:50:18 ] insert into __avro_write select 1, "b", 2.2; 01:50:18 ] 01:50:18 ] -- executing against localhost:21000 01:50:18 ] SET COMPRESSION_CODEC=""; 01:50:18 ] 01:50:18 ] -- executing against localhost:21000 01:50:18 ] SET ALLOW_UNSUPPORTED_FORMATS="0"; 01:50:18 ] 01:50:18 ] -- executing against localhost:21000 01:50:18 ] select * from __avro_write; 01:50:18 ] 01:50:18 ] -- executing against localhost:21000 01:50:18 ] SET ALLOW_UNSUPPORTED_FORMATS=0; 01:50:18 ] 01:50:18 ] -- executing against localhost:21000 01:50:18 ] 01:50:18 ] insert into __avro_write sel
[jira] [Resolved] (IMPALA-4464) Remove remote_data_load.py
[ https://issues.apache.org/jira/browse/IMPALA-4464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-4464. - Resolution: Fixed > Remove remote_data_load.py > -- > > Key: IMPALA-4464 > URL: https://issues.apache.org/jira/browse/IMPALA-4464 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Affects Versions: Impala 2.8.0 >Reporter: David Knupp >Assignee: David Knupp >Priority: Major > Labels: remote_cluster_test > > A patch was recently submitted that allows data load and end-to-end tests to run > against a remote cluster. At its core was this file: > https://github.com/apache/incubator-impala/blob/master/bin/remote_data_load.py > However, while this script relies on several changes to existing build and > test scripts, nothing else in turn relies on it. In retrospect, it does not > make sense to have this script in the Impala repo if nothing can use it > externally. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-6938) Build failing because failed to assign HBase regions during data load
David Knupp created IMPALA-6938: --- Summary: Build failing because failed to assign HBase regions during data load Key: IMPALA-6938 URL: https://issues.apache.org/jira/browse/IMPALA-6938 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 3.1.0 Reporter: David Knupp 00:53:41 Splitting HBase (logging to /data/jenkins/workspace/impala-cdh6.x-core/repos/Impala/logs/data_loading/create-hbase.log)... 00:55:29 FAILED (Took: 1 min 48 sec) 00:55:29 '/data/jenkins/workspace/impala-cdh6.x-core/repos/Impala/testdata/bin/split-hbase.sh' failed. Tail of log: 00:55:29 18/04/25 00:55:28 INFO datagenerator.HBaseTestDataRegionAssigment: functional_hbase.alltypesagg,3,1524642858707.b0a6c361d408d230442311281afbefc8. 3 -> localhost:16202, expecting localhost,16202,1524639822962 [...] 00:55:29 18/04/25 00:55:28 INFO datagenerator.HBaseTestDataRegionAssigment: functional_hbase.alltypesagg,7,1524642862231.a7e1c97240f425f98cddaa1e9070651d. 7 -> localhost:16203, expecting localhost,16203,1524639824558 00:55:29 18/04/25 00:55:28 INFO datagenerator.HBaseTestDataRegionAssigment: functional_hbase.alltypesagg,9,1524642862231.4f6b1fc8c0104c6b7ef782dfa3d3d616. 9 -> localhost:16203, expecting localhost,16203,1524639824558 00:55:29 Exception in thread "main" java.lang.IllegalStateException: Failed to assign regions to servers after 6 millis. 00:55:29at org.apache.impala.datagenerator.HBaseTestDataRegionAssigment.performAssigment(HBaseTestDataRegionAssigment.java:198) 00:55:29at org.apache.impala.datagenerator.HBaseTestDataRegionAssigment.main(HBaseTestDataRegionAssigment.java:330) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-6935) test_analytic_fns failed during exhaustive build on RHEL7: AnalysisException: Couldn't evaluate LEAD/LAG offset
David Knupp created IMPALA-6935: --- Summary: test_analytic_fns failed during exhaustive build on RHEL7: AnalysisException: Couldn't evaluate LEAD/LAG offset Key: IMPALA-6935 URL: https://issues.apache.org/jira/browse/IMPALA-6935 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 2.13.0 Reporter: David Knupp Stacktrace {noformat} query_test/test_queries.py:53: in test_analytic_fns self.run_test_case('QueryTest/analytic-fns', vector) common/impala_test_suite.py:398: in run_test_case result = self.__execute_query(target_impalad_client, query, user=user) common/impala_test_suite.py:613: in __execute_query return impalad_client.execute(query, user=user) common/impala_connection.py:160: in execute return self.__beeswax_client.execute(sql_stmt, user=user) beeswax/impala_beeswax.py:173: in execute handle = self.__execute_query(query_string.strip(), user=user) beeswax/impala_beeswax.py:339: in __execute_query handle = self.execute_query_async(query_string, user=user) beeswax/impala_beeswax.py:335: in execute_query_async return self.__do_rpc(lambda: self.imp_service.query(query,)) beeswax/impala_beeswax.py:460: in __do_rpc raise ImpalaBeeswaxException(self.__build_error_message(b), b) E ImpalaBeeswaxException: ImpalaBeeswaxException: EINNER EXCEPTION: EMESSAGE: IllegalStateException: Failed analysis after expr substitution. E CAUSED BY: AnalysisException: Couldn't evaluate LEAD/LAG offset: couldn't execute expr 87 {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-6933) test_kudu.TestCreateExternalTable on S3 failing with "AlreadyExistsException: Database already exists"
David Knupp created IMPALA-6933: --- Summary: test_kudu.TestCreateExternalTable on S3 failing with "AlreadyExistsException: Database already exists" Key: IMPALA-6933 URL: https://issues.apache.org/jira/browse/IMPALA-6933 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 2.13.0 Reporter: David Knupp Error Message {noformat} test setup failure {noformat} Stacktrace {noformat} conftest.py:347: in conn with __unique_conn(db_name=db_name, timeout=timeout) as conn: /usr/lib64/python2.6/contextlib.py:16: in __enter__ return self.gen.next() conftest.py:380: in __unique_conn cur.execute("CREATE DATABASE %s" % db_name) ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:302: in execute configuration=configuration) ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:343: in execute_async self._execute_async(op) ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:362: in _execute_async operation_fn() ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:340: in op async=True) ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:1027: in execute return self._operation('ExecuteStatement', req) ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:957: in _operation resp = self._rpc(kind, request) ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:925: in _rpc err_if_rpc_not_ok(response) ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:704: in err_if_rpc_not_ok raise HiveServer2Error(resp.status.errorMessage) E HiveServer2Error: ImpalaRuntimeException: Error making 'createDatabase' RPC to Hive Metastore: E CAUSED BY: AlreadyExistsException: Database f0mraw already exists {noformat} Tests affected: * query_test.test_kudu.TestCreateExternalTable.test_unsupported_binary_col * query_test.test_kudu.TestCreateExternalTable.test_drop_external_table * query_test.test_kudu.TestCreateExternalTable.test_explicit_name 
* query_test.test_kudu.TestCreateExternalTable.test_explicit_name_preference * query_test.test_kudu.TestCreateExternalTable.test_explicit_name_doesnt_exist * query_test.test_kudu.TestCreateExternalTable.test_explicit_name_doesnt_exist_but_implicit_does * query_test.test_kudu.TestCreateExternalTable.test_table_without_partitioning * query_test.test_kudu.TestCreateExternalTable.test_column_name_case * query_test.test_kudu.TestCreateExternalTable.test_conflicting_column_name -- This message was sent by Atlassian JIRA (v7.6.3#76005)
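The "Database f0mraw already exists" collision above suggests the fixture's short random database names can repeat across concurrent runs. One lower-collision approach (an illustrative sketch, not the actual conftest.py fixture) derives the name from a UUID:

```python
import uuid

def unique_db_name(prefix="test"):
    # 12 hex chars drawn from a UUID4 make cross-run collisions
    # negligible, unlike a six-character suffix such as "f0mraw".
    return "%s_%s" % (prefix, uuid.uuid4().hex[:12])

a, b = unique_db_name(), unique_db_name()
```

An alternative is to keep short names but issue `CREATE DATABASE IF NOT EXISTS` and retry on collision, at the cost of possibly reusing a database left behind by another run.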
[jira] [Created] (IMPALA-6931) TestQueryExpiration.test_query_expiration fails on ASAN with unexpected number of expired queries
David Knupp created IMPALA-6931: --- Summary: TestQueryExpiration.test_query_expiration fails on ASAN with unexpected number of expired queries Key: IMPALA-6931 URL: https://issues.apache.org/jira/browse/IMPALA-6931 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 2.12.0 Reporter: David Knupp Assignee: Vuk Ercegovac Stacktrace {noformat} custom_cluster/test_query_expiration.py:108: in test_query_expiration client.QUERY_STATES['EXCEPTION']) custom_cluster/test_query_expiration.py:184: in __expect_client_state assert expected_state == actual_state E assert 5 == 4 {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-6928) test_bloom_filters failing on ASAN build: did not find "Runtime Filter Published" in profile
David Knupp created IMPALA-6928: --- Summary: test_bloom_filters failing on ASAN build: did not find "Runtime Filter Published" in profile Key: IMPALA-6928 URL: https://issues.apache.org/jira/browse/IMPALA-6928 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 2.12.0 Reporter: David Knupp Assignee: Thomas Tauber-Marshall Stacktrace {noformat} query_test/test_runtime_filters.py:81: in test_bloom_filters self.run_test_case('QueryTest/bloom_filters', vector) common/impala_test_suite.py:444: in run_test_case verify_runtime_profile(test_section['RUNTIME_PROFILE'], result.runtime_profile) common/test_result_verifier.py:560: in verify_runtime_profile actual)) E AssertionError: Did not find matches for lines in runtime profile: E EXPECTED LINES: E row_regex: .*1 of 1 Runtime Filter Published.* E E ACTUAL PROFILE: E Query (id=a64a18654d28e0c3:e6220f6c): E DEBUG MODE WARNING: Query profile created while running a DEBUG build of Impala. Use RELEASE builds to measure query performance. 
E Summary: E Session ID: 244e6109f4226b2b:39160855c64ad4a1 E Session Type: BEESWAX E Start Time: 2018-04-23 23:31:59.326883000 E End Time: E Query Type: QUERY E Query State: FINISHED E Query Status: OK E Impala Version: impalad version 2.12.0-cdh5.15.0 DEBUG (build 3d60947b813429cd1db59f9a342498982d341de9) E User: jenkins E Connected User: jenkins E Delegated User: E Network Address: 127.0.0.1:55776 E Default Db: functional E Sql Statement: with l as (select * from tpch.lineitem UNION ALL select * from tpch.lineitem) E select STRAIGHT_JOIN count(*) from (select * from tpch.lineitem a LIMIT 1) a E join (select * from l LIMIT 200) b on a.l_orderkey = -b.l_orderkey E Coordinator: ec2-m2-4xlarge-centos-6-4-0f06.vpc.cloudera.com:22000 E Query Options (set by configuration): ABORT_ON_ERROR=1,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,RUNTIME_FILTER_WAIT_TIME_MS=3,RUNTIME_FILTER_MIN_SIZE=65536,DISABLE_CODEGEN_ROWS_THRESHOLD=0 E Query Options (set by configuration and planner): ABORT_ON_ERROR=1,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,RUNTIME_FILTER_WAIT_TIME_MS=3,MT_DOP=0,RUNTIME_FILTER_MIN_SIZE=65536,DISABLE_CODEGEN_ROWS_THRESHOLD=0 E Plan: E E Max Per-Host Resource Reservation: Memory=19.00MB E Per-Host Resource Estimates: Memory=557.00MB E E F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 E | Per-Host Resources: mem-estimate=28.00MB mem-reservation=18.00MB runtime-filters-memory=1.00MB E PLAN-ROOT SINK E | mem-estimate=0B mem-reservation=0B E | E 05:AGGREGATE [FINALIZE] E | output: count(*) E | mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB E | tuple-ids=7 row-size=8B cardinality=1 E | E 04:HASH JOIN [INNER JOIN, BROADCAST] E | hash predicates: a.l_orderkey = -1 * l_orderkey E | fk/pk conjuncts: assumed fk/pk E | runtime filters: RF000[bloom] <- -1 * l_orderkey E | mem-estimate=17.00MB mem-reservation=17.00MB spill-buffer=1.00MB E | tuple-ids=0,4 row-size=16B cardinality=1 E | E |--08:EXCHANGE [UNPARTITIONED] E | | mem-estimate=0B mem-reservation=0B E | | 
tuple-ids=4 row-size=8B cardinality=200 E | | E | F05:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 E | Per-Host Resources: mem-estimate=0B mem-reservation=0B E | 07:EXCHANGE [UNPARTITIONED] E | | limit: 200 E | | mem-estimate=0B mem-reservation=0B E | | tuple-ids=4 row-size=8B cardinality=200 E | | E | F04:PLAN FRAGMENT [RANDOM] hosts=3 instances=3 E | Per-Host Resources: mem-estimate=264.00MB mem-reservation=0B E | 01:UNION E | | pass-through-operands: all E | | limit: 200 E | | mem-estimate=0B mem-reservation=0B E | | tuple-ids=4 row-size=8B cardinality=200 E | | E | |--03:SCAN HDFS [tpch.lineitem, RANDOM] E | | partitions=1/1 files=1 size=718.94MB E | | stored statistics: E | | table: rows=6001215 size=718.94MB E | | columns: all E | | extrapolated-rows=disabled E | | mem-estimate=264.00MB mem-reservation=0B E | | tuple-ids=3 row-size=8B cardinality=6001215 E | | E | 02:SCAN HDFS [tpch.lineitem, RANDOM] E | partitions=1/1 files=1 size=718.94MB E | stored statistics: E | table: rows=6001215 size=718.94MB E | columns: all E | extrapolated-rows=disabled E | mem-estimate=264.00MB mem-reservation=0B E | tuple-ids=2 row-size=8B cardinality=6001215 E | E 06:EXCHANGE [UNPARTITIONED] E | limit: 1 E | mem-estimate=0B mem-reservation=0B E | tuple-ids=0 row-size=8B cardinality=1 E | E F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3 E Per-Host Resources: mem-estimate=265.
[jira] [Created] (IMPALA-6925) 'load-data functional-query exhaustive' failed: exception in load-data worker thread
David Knupp created IMPALA-6925: --- Summary: 'load-data functional-query exhaustive' failed: exception in load-data worker thread Key: IMPALA-6925 URL: https://issues.apache.org/jira/browse/IMPALA-6925 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 2.13.0 Reporter: David Knupp {noformat} 17:10:41 Error executing hive SQL: /data/jenkins/workspace/impala-cdh5-trunk-exhaustive-integration/repos/Impala/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-seq-snap-record.sql See: /data/jenkins/workspace/impala-cdh5-trunk-exhaustive-integration/repos/Impala/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-seq-snap-record.sql.log Exception in thread Thread-6 (most likely raised during interpreter shutdown): Traceback (most recent call last): File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner File "/usr/lib64/python2.6/threading.py", line 484, in run File "/usr/lib64/python2.6/multiprocessing/pool.py", line 68, in worker File "/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-integration/repos/Impala/bin/load-data.py", line 146, in exec_hive_query_from_file_beeline : 'NoneType' object has no attribute 'info' {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-6922) test_kudu_insert on exhaustive build
David Knupp created IMPALA-6922: --- Summary: test_kudu_insert on exhaustive build Key: IMPALA-6922 URL: https://issues.apache.org/jira/browse/IMPALA-6922 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.1.0 Reporter: David Knupp Assignee: Thomas Tauber-Marshall Error Message {noformat} query_test/test_kudu.py:84: in test_kudu_insert self.run_test_case('QueryTest/kudu_insert', vector, use_db=unique_database) common/impala_test_suite.py:455: in run_test_case pytest.config.option.update_results, result_section='DML_RESULTS') common/test_result_verifier.py:404: in verify_raw_results VERIFIER_MAP[verifier](expected, actual) common/test_result_verifier.py:231: in verify_query_result_is_equal assert expected_results == actual_results E assert Comparing QueryTestResults (expected vs actual): E 1,1,1,'one',true,1,1,1,1987-05-19 00:00:00,0.1,1.00,1 != None E Number of rows returned (expected vs actual): 1 != 0 {noformat} Stacktrace {noformat} query_test/test_kudu.py:84: in test_kudu_insert self.run_test_case('QueryTest/kudu_insert', vector, use_db=unique_database) common/impala_test_suite.py:455: in run_test_case pytest.config.option.update_results, result_section='DML_RESULTS') common/test_result_verifier.py:404: in verify_raw_results VERIFIER_MAP[verifier](expected, actual) common/test_result_verifier.py:231: in verify_query_result_is_equal assert expected_results == actual_results E assert Comparing QueryTestResults (expected vs actual): E 1,1,1,'one',true,1,1,1,1987-05-19 00:00:00,0.1,1.00,1 != None E Number of rows returned (expected vs actual): 1 != 0 {noformat} Standard Error {noformat} SET sync_ddl=False; -- executing against localhost:21000 DROP DATABASE IF EXISTS `test_kudu_insert_70eff904` CASCADE; SET sync_ddl=False; -- executing against localhost:21000 CREATE DATABASE `test_kudu_insert_70eff904`; MainThread: Created database "test_kudu_insert_70eff904" for test ID 
"query_test/test_kudu.py::TestKuduOperations::()::test_kudu_insert[exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: text/none]" -- executing against localhost:21000 use test_kudu_insert_70eff904; SET batch_size=0; SET num_nodes=0; SET disable_codegen_rows_threshold=0; SET disable_codegen=False; SET abort_on_error=1; SET exec_single_node_rows_threshold=0; -- executing against localhost:21000 create table tdata (id int primary key, valf float null, vali bigint null, valv string null, valb boolean null, valt tinyint null, vals smallint null, vald double null, ts timestamp, decimal4 decimal(9,9) null, decimal8 decimal(18,2) null, decimal16 decimal(38, 0) null) PARTITION BY RANGE (PARTITION VALUES < 10, PARTITION 10 <= VALUES < 30, PARTITION 30 <= VALUES) STORED AS KUDU; -- executing against localhost:21000 insert into tdata values (1, 1, 1, 'one', true, 1, 1, 1, cast('1987-05-19 00:00:00' as timestamp), 0.1, 1.00, 1); -- executing against localhost:21000 select * from tdata limit 1000; MainThread: Comparing QueryTestResults (expected vs actual): 1,1,1,'one',true,1,1,1,1987-05-19 00:00:00,0.1,1.00,1 != None Number of rows returned (expected vs actual): 1 != 0 {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-6921) AnalysisException: Failed to load metadata for table: 'tpch_kudu.ctas_cancel' during data load
David Knupp created IMPALA-6921: --- Summary: AnalysisException: Failed to load metadata for table: 'tpch_kudu.ctas_cancel' during data load Key: IMPALA-6921 URL: https://issues.apache.org/jira/browse/IMPALA-6921 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 2.13.0 Reporter: David Knupp This exception seems to be consistently thrown during the data load phase. It appears in compute-table-stats.log. {noformat} 2018-04-22 06:50:48,764 Thread-8: Failed on table tpch_kudu.ctas_cancel Traceback (most recent call last): File "/data/jenkins/workspace/impala-cdh6.0.0_beta1-core/repos/Impala/tests/util/compute_table_stats.py", line 40, in compute_stats_table result = impala_client.execute(statement) File "/data/jenkins/workspace/impala-cdh6.0.0_beta1-core/repos/Impala/tests/beeswax/impala_beeswax.py", line 173, in execute handle = self.__execute_query(query_string.strip(), user=user) File "/data/jenkins/workspace/impala-cdh6.0.0_beta1-core/repos/Impala/tests/beeswax/impala_beeswax.py", line 339, in __execute_query handle = self.execute_query_async(query_string, user=user) File "/data/jenkins/workspace/impala-cdh6.0.0_beta1-core/repos/Impala/tests/beeswax/impala_beeswax.py", line 335, in execute_query_async return self.__do_rpc(lambda: self.imp_service.query(query,)) File "/data/jenkins/workspace/impala-cdh6.0.0_beta1-core/repos/Impala/tests/beeswax/impala_beeswax.py", line 460, in __do_rpc raise ImpalaBeeswaxException(self.__build_error_message(b), b) ImpalaBeeswaxException: ImpalaBeeswaxException: INNER EXCEPTION: MESSAGE: AnalysisException: Failed to load metadata for table: 'tpch_kudu.ctas_cancel' CAUSED BY: TableLoadingException: Error loading metadata for Kudu table impala::tpch_kudu.ctas_cancel CAUSED BY: ImpalaRuntimeException: Error opening Kudu table 'impala::tpch_kudu.ctas_cancel', Kudu error: The table does not exist: table_name: "impala::tpch_kudu.ctas_cancel" {noformat} ctas_cancel is a table that gets used by 
query_test/test_cancellation.py. This doesn't seem to break anything (data loading completes and tests pass), but it's vexing that part of our standard data load process produces exceptions in any log file. Please feel free to mark this as invalid if this is not really an issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-6914) test_mem_limit in test_admission_controller timed out waiting for query to end
David Knupp created IMPALA-6914: --- Summary: test_mem_limit in test_admission_controller timed out waiting for query to end Key: IMPALA-6914 URL: https://issues.apache.org/jira/browse/IMPALA-6914 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.0 Reporter: David Knupp Error Message {noformat} AssertionError: Timed out waiting 60 seconds for query end assert (1524166375.299542 - 1524166315.228723) < 60 + where 1524166375.299542 = time() {noformat} Stacktrace {noformat} custom_cluster/test_admission_controller.py:943: in test_mem_limit {'request_pool': self.pool_name, 'mem_limit': query_mem_limit}) custom_cluster/test_admission_controller.py:837: in run_admission_test self.end_admitted_queries(num_to_end) custom_cluster/test_admission_controller.py:622: in end_admitted_queries assert (time() - start_time < STRESS_TIMEOUT),\ E AssertionError: Timed out waiting 60 seconds for query end E assert (1524166375.299542 - 1524166315.228723) < 60 E+ where 1524166375.299542 = time() {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-6911) test_duplicate_partitions failing on recent S3 build
David Knupp created IMPALA-6911: --- Summary: test_duplicate_partitions failing on recent S3 build Key: IMPALA-6911 URL: https://issues.apache.org/jira/browse/IMPALA-6911 Project: IMPALA Issue Type: Bug Components: Frontend Affects Versions: Impala 3.0 Reporter: David Knupp {noformat} Stacktrace metadata/test_recover_partitions.py:255: in test_duplicate_partitions assert old_length + 1 == len(result.data),\ E AssertionError: ALTER TABLE test_duplicate_partitions_4b6ed438.test_recover_partitions RECOVER PARTITIONS failed to handle duplicate partition key values. E assert (3 + 1) == 3 E+ where 3 = len(['1\tp1\t-1\t1\t2B\tNOT CACHED\tNOT CACHED\tTEXT\tfalse\ts3a://impala-cdh5-s3-test/test-warehouse/test_duplicate_parti...st-warehouse/test_duplicate_partitions_4b6ed438.db/test_recover_partitions/i=1/p=p4', 'Total\t\t-1\t1\t2B\t0B\t\t\t\t']) E+where ['1\tp1\t-1\t1\t2B\tNOT CACHED\tNOT CACHED\tTEXT\tfalse\ts3a://impala-cdh5-s3-test/test-warehouse/test_duplicate_parti...st-warehouse/test_duplicate_partitions_4b6ed438.db/test_recover_partitions/i=1/p=p4', 'Total\t\t-1\t1\t2B\t0B\t\t\t\t'] = .data {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-6910) test_seq_writer (in test_compressed_formats) failed on S3 build: "SdkClientException: Data read has a different length than the expected"
David Knupp created IMPALA-6910: --- Summary: test_seq_writer (in test_compressed_formats) failed on S3 build: "SdkClientException: Data read has a different length than the expected" Key: IMPALA-6910 URL: https://issues.apache.org/jira/browse/IMPALA-6910 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.0 Reporter: David Knupp Assignee: Sailesh Mukil {noformat} Stacktrace query_test/test_compressed_formats.py:149: in test_seq_writer self.run_test_case('QueryTest/seq-writer', vector, unique_database) common/impala_test_suite.py:397: in run_test_case result = self.__execute_query(target_impalad_client, query, user=user) common/impala_test_suite.py:612: in __execute_query return impalad_client.execute(query, user=user) common/impala_connection.py:160: in execute return self.__beeswax_client.execute(sql_stmt, user=user) beeswax/impala_beeswax.py:173: in execute handle = self.__execute_query(query_string.strip(), user=user) beeswax/impala_beeswax.py:341: in __execute_query self.wait_for_completion(handle) beeswax/impala_beeswax.py:361: in wait_for_completion raise ImpalaBeeswaxException("Query aborted:" + error_log, None) E ImpalaBeeswaxException: ImpalaBeeswaxException: EQuery aborted:Disk I/O error: Error reading from HDFS file: s3a://impala-cdh5-s3-test/test-warehouse/tpcds.store_sales_parquet/ss_sold_date_sk=2452585/a5482dcb946b6c98-7543e0dd0004_95929617_data.0.parq E Error(255): Unknown error 255 E Root cause: SdkClientException: Data read has a different length than the expected: dataLength=8576; expectedLength=17785; includeSkipped=true; in.getClass()=class com.amazonaws.services.s3.AmazonS3Client$2; markedSupported=false; marked=0; resetSinceLastMarked=false; markCount=0; resetCount=0 {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-6906) test_admission_controller.TestAdmissionController.test_memory_rejection on S3
David Knupp created IMPALA-6906: --- Summary: test_admission_controller.TestAdmissionController.test_memory_rejection on S3 Key: IMPALA-6906 URL: https://issues.apache.org/jira/browse/IMPALA-6906 Project: IMPALA Issue Type: Bug Components: Frontend Affects Versions: Impala 2.13.0, Impala 3.1.0 Reporter: David Knupp Assignee: Tim Armstrong {noformat} Stacktrace custom_cluster/test_admission_controller.py:402: in test_memory_rejection self.run_test_case('QueryTest/admission-reject-mem-estimate', vector) common/impala_test_suite.py:444: in run_test_case verify_runtime_profile(test_section['RUNTIME_PROFILE'], result.runtime_profile) common/test_result_verifier.py:560: in verify_runtime_profile actual)) E AssertionError: Did not find matches for lines in runtime profile: E EXPECTED LINES: E row_regex: .*Per-Host Resource Estimates: Memory=90.00MB.* E E ACTUAL PROFILE: E Query (id=9f4cdd224745b688:b89cfdc3): E DEBUG MODE WARNING: Query profile created while running a DEBUG build of Impala. Use RELEASE builds to measure query performance. 
E Summary: E Session ID: 2e4d7150362474cb:8dcbaecea87bb80 E Session Type: BEESWAX E Start Time: 2018-04-21 02:01:35.027023000 E End Time: E Query Type: QUERY E Query State: FINISHED E Query Status: OK E Impala Version: impalad version 3.0.0-SNAPSHOT DEBUG (build b68e06997c1f49f6b723d78e217efddec4f56f3a) E User: jenkins E Connected User: jenkins E Delegated User: E Network Address: 127.0.0.1:33892 E Default Db: functional E Sql Statement: select min(l_comment) from tpch_parquet.lineitem E Coordinator: E Query Options (set by configuration): ABORT_ON_ERROR=1,NUM_NODES=1,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,DISABLE_CODEGEN_ROWS_THRESHOLD=5000,MAX_MEM_ESTIMATE_FOR_ADMISSION=10485760 E Query Options (set by configuration and planner): ABORT_ON_ERROR=1,NUM_NODES=1,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,MT_DOP=0,DISABLE_CODEGEN_ROWS_THRESHOLD=5000,MAX_MEM_ESTIMATE_FOR_ADMISSION=10485760 E Plan: E E Max Per-Host Resource Reservation: Memory=0B E Per-Host Resource Estimates: Memory=50.00MB {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-6902) query_test.test_udfs.TestUdfExecution.test_native_functions_race failed during core/thrift build
David Knupp created IMPALA-6902: --- Summary: query_test.test_udfs.TestUdfExecution.test_native_functions_race failed during core/thrift build Key: IMPALA-6902 URL: https://issues.apache.org/jira/browse/IMPALA-6902 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 2.12.0 Reporter: David Knupp Assignee: Vuk Ercegovac Assigning to Vuk, who authored this test as part of the patch for IMPALA-6488: https://gerrit.cloudera.org/c/9626/ I'm not sure that this is really the same failure though, so I'm not reopening that earlier JIRA. If I'm mistaken, please feel free to reopen/reassign as necessary. Stacktrace {noformat} query_test/test_udfs.py:377: in test_native_functions_race assert len(errors) == 0 E assert 1 == 0 E + where 1 = len([ImpalaBeeswaxException()]) Standard Output ImpalaBeeswaxException: INNER EXCEPTION: MESSAGE: ImpalaRuntimeException: Error making 'alterDatabase' RPC to Hive Metastore: CAUSED BY: NoSuchObjectException: test_native_functions_race_fc9680e5: Transaction rolled back due to failure during commit{noformat} Standard Error {noformat} SET sync_ddl=False; -- executing against localhost:21000 DROP DATABASE IF EXISTS `test_native_functions_race_fc9680e5` CASCADE; SET sync_ddl=False; -- executing against localhost:21000 CREATE DATABASE `test_native_functions_race_fc9680e5`; MainThread: Created database "test_native_functions_race_fc9680e5" for test ID "query_test/test_udfs.py::TestUdfExecution::()::test_native_functions_race[exec_option: {'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'exec_single_node_rows_threshold': 0, 'enable_expr_rewrites': True} | table_format: text/none]" -- connecting to: localhost:21000 -- executing against localhost:21000 create function test_native_functions_race_fc9680e5.use_it(string) returns string LOCATION '/test-warehouse/libTestUdfs.so' SYMBOL='_Z8IdentityPN10impala_udf15FunctionContextERKNS_9StringValE'; {noformat} From catalogd log: {noformat} I0420 14:54:00.014191 19585
jni-util.cc:230] org.apache.impala.common.ImpalaRuntimeException: Error making 'alterDatabase' RPC to Hive Metastore: at org.apache.impala.service.CatalogOpExecutor.applyAlterDatabase(CatalogOpExecutor.java:2770) at org.apache.impala.service.CatalogOpExecutor.dropFunction(CatalogOpExecutor.java:1521) at org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:307) at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:146) Caused by: NoSuchObjectException(message:test_native_functions_race_fc9680e5: Transaction rolled back due to failure during commit) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_database_result$alter_database_resultStandardScheme.read(ThriftHiveMetastore.java:20111) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_database_result$alter_database_resultStandardScheme.read(ThriftHiveMetastore.java:20088) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_database_result.read(ThriftHiveMetastore.java:20030) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_alter_database(ThriftHiveMetastore.java:814) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.alter_database(ThriftHiveMetastore.java:800) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alterDatabase(HiveMetaStoreClient.java:1420) at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:101) at com.sun.proxy.$Proxy5.alterDatabase(Unknown Source) at org.apache.impala.service.CatalogOpExecutor.applyAlterDatabase(CatalogOpExecutor.java:2768) ... 3 more I0420 14:54:01.581200 3230 catalog-server.cc:245] A catalog update with 5 entries is assembled. 
Catalog version: 14385 Last sent catalog version: 14377 I0420 14:54:01.582358 3225 catalog-server.cc:480] Collected deletion: FUNCTION:TFunctionName(db_name:test_native_functions_race_fc9680e5, function_name:other)(other(FLOAT)), v ersion=14387, original size=310, compressed size=265 I0420 14:54:01.582433 3225 catalog-server.cc:480] Collected deletion: FUNCTION:TFunctionName(db_name:test_native_functions_race_fc9680e5, function_name:other)(other(FLOAT)), v ersion=14389, original size=310, compressed size=265 I0420 14:54:01.582499 3225 catalog-server.cc:480] Collected deletion: FUNCTION:TFunctionName(db_name:test_native_functions_race_fc9680e5, function_name:other)(other(FLOAT)), v ersion=14391, original size=310, compressed size=265 I0420 14:54:01.582567 3225 catalog-server.cc:480] Colle
[jira] [Resolved] (IMPALA-6761) delimited-text-parser-test fails in ASAN build
[ https://issues.apache.org/jira/browse/IMPALA-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-6761. - Resolution: Fixed Fix Version/s: Impala 2.13.0 > delimited-text-parser-test fails in ASAN build > -- > > Key: IMPALA-6761 > URL: https://issues.apache.org/jira/browse/IMPALA-6761 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.12.0 >Reporter: Michael Ho >Assignee: Zach Amsden >Priority: Blocker > Labels: broken-build > Fix For: Impala 2.13.0 > > > Hi [~zamsden], could this be related to your recent change to fix IMPALA-6389 > ? > {noformat} > 03:26:07 [ RUN ] DelimitedTextParser.SpecialDelimiters > 03:26:07 = > 03:26:07 ==14342==ERROR: AddressSanitizer: stack-buffer-overflow on address > 0x7fff33da29c1 at pc 0x0141f344 bp 0x7fff33da1d20 sp 0x7fff33da1d18 > 03:26:07 READ of size 1 at 0x7fff33da29c1 thread T0 > 03:26:07 #0 0x141f343 in > impala::DelimitedTextParser::ReturnCurrentColumn() const > /data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/src/exec/delimited-text-parser.h:114:39 > 03:26:07 #1 0x141bf49 in impala::Status > impala::DelimitedTextParser::AddColumn(long, char**, int*, > impala::FieldLocation*) > /data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/src/exec/delimited-text-parser.inline.h:62:7 > 03:26:07 #2 0x1419517 in > impala::DelimitedTextParser::ParseFieldLocations(int, long, char**, > char**, impala::FieldLocation*, int*, int*, char**) > /data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/src/exec/delimited-text-parser.cc:194:43 > 03:26:07 #3 0x13f8ed7 in > impala::Validate(impala::DelimitedTextParser*, std::string const&, int, > char, int, int) > /data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/src/exec/delimited-text-parser-test.cc:57:15 > 03:26:07 #4 0x13fb274 in > impala::DelimitedTextParser_SpecialDelimiters_Test::TestBody() > 
/data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/src/exec/delimited-text-parser-test.cc:211:3 > 03:26:07 #5 0x3f3fc52 in void > testing::internal::HandleExceptionsInMethodIfSupported void>(testing::Test*, void (testing::Test::*)(), char const*) > (/data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/build/debug/exec/delimited-text-parser-test+0x3f3fc52) > 03:26:07 #6 0x3f375a9 in testing::Test::Run() > (/data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/build/debug/exec/delimited-text-parser-test+0x3f375a9) > 03:26:07 #7 0x3f376f7 in testing::TestInfo::Run() > (/data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/build/debug/exec/delimited-text-parser-test+0x3f376f7) > 03:26:07 #8 0x3f377d4 in testing::TestCase::Run() > (/data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/build/debug/exec/delimited-text-parser-test+0x3f377d4) > 03:26:07 #9 0x3f38a57 in testing::internal::UnitTestImpl::RunAllTests() > (/data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/build/debug/exec/delimited-text-parser-test+0x3f38a57) > 03:26:07 #10 0x3f38d32 in testing::UnitTest::Run() > (/data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/build/debug/exec/delimited-text-parser-test+0x3f38d32) > 03:26:07 #11 0x13fb927 in main > /data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/src/exec/delimited-text-parser-test.cc:221:192 > 03:26:07 #12 0x7fdc3ec02cdc in __libc_start_main > (/lib64/libc.so.6+0x1ecdc) > 03:26:07 #13 0x13064a0 in _start > (/data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/build/debug/exec/delimited-text-parser-test+0x13064a0) > 03:26:07 > 03:26:07 Address 0x7fff33da29c1 is located in stack of thread T0 at offset 33 > in frame > 03:26:07 #0 0x13fa74f in > impala::DelimitedTextParser_SpecialDelimiters_Test::TestBody() > /data/jenkins/workspace/impala-asf-2.x-core-asan/repos/Impala/be/src/exec/delimited-text-parser-test.cc:149 > 03:26:07 > 03:26:07 This frame 
has 56 object(s): > 03:26:07 [32, 33) 'is_materialized_col' <== Memory access at offset 33 > overflows this variable > 03:26:07 [48, 208) 'tuple_delim_parser' > 03:26:07 [272, 432) 'nul_delim_parser' > 03:26:07 [496, 656) 'nul_field_parser' > 03:26:07 [720, 728) 'ref.tmp' > 03:26:07 [752, 753) 'ref.tmp4' > 03:26:07 [768, 776) 'ref.tmp5' > 03:26:07 [800, 801) 'ref.tmp6' > 03:26:07 [816, 824) 'ref.tmp7' > 03:26:07 [848, 849) 'ref.tmp8' > 03:26:07 [864, 872) 'ref.tmp9' > 03:26:07 [896, 897) 'ref.tmp10' > 03:26:07 [912, 920) 'ref.tmp11' > 03:26:07 [944, 945) 'ref.tmp12' > 03:26:07 [960, 968) 'ref.tmp13' > 03:26:07
[jira] [Created] (IMPALA-6814) query_test.test_queries.TestQueriesTextTables.test_strict_mode failing on remote clusters
David Knupp created IMPALA-6814: --- Summary: query_test.test_queries.TestQueriesTextTables.test_strict_mode failing on remote clusters Key: IMPALA-6814 URL: https://issues.apache.org/jira/browse/IMPALA-6814 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 2.12.0 Reporter: David Knupp It looks like {{localhost}} is hardcoded in the test verification. h3. Stacktrace query_test/test_queries.py:161: in test_strict_mode self.run_test_case('QueryTest/strict-mode', vector) common/impala_test_suite.py:427: in run_test_case self.__verify_results_and_errors(vector, test_section, result, use_db) common/impala_test_suite.py:300: in __verify_results_and_errors replace_filenames_with_placeholder) common/test_result_verifier.py:317: in verify_raw_results verify_errors(expected_errors, actual_errors) common/test_result_verifier.py:274: in verify_errors VERIFIER_MAP['VERIFY_IS_EQUAL'](expected, actual) common/test_result_verifier.py:231: in verify_query_result_is_equal assert expected_results == actual_results E assert Comparing QueryTestResults (expected vs actual): [...] E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before offset: \d+ != 'Error parsing row: file: hdfs://impala-ubuntu1404-cluster-1.vpc.cloudera.com:8020/test-warehouse/overflow/overflow.txt, before offset: 454' E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before offset: \d+ != 'Error parsing row: file: hdfs://impala-ubuntu1404-cluster-1.vpc.cloudera.com:8020/test-warehouse/overflow/overflow.txt, before offset: 454' -- This message was sent by Atlassian JIRA (v7.6.3#76005)
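One plausible fix direction (a sketch only, not the committed patch) is to make the expected-error regex host-agnostic, matching any namenode host:port instead of the hardcoded {{localhost:20500}}:

```python
import re

# Sketch: a host-agnostic version of the expected-error pattern.
# The exact pattern the test file should use is an assumption here;
# [^/]+ accepts any host:port in the hdfs:// URI.
pattern = re.compile(
    r".*Error parsing row: file: hdfs://[^/]+/.* before offset: \d+")

# The actual message from the remote-cluster run quoted above.
actual = ("Error parsing row: file: "
          "hdfs://impala-ubuntu1404-cluster-1.vpc.cloudera.com:8020"
          "/test-warehouse/overflow/overflow.txt, before offset: 454")
assert pattern.match(actual)
```

With the host left unanchored, the same expected line would pass both on the mini-cluster (localhost:20500) and on remote clusters.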
[jira] [Created] (IMPALA-6810) query_test::test_runtime_filters.py::test_row_filters fails when run against an external cluster
David Knupp created IMPALA-6810: --- Summary: query_test::test_runtime_filters.py::test_row_filters fails when run against an external cluster Key: IMPALA-6810 URL: https://issues.apache.org/jira/browse/IMPALA-6810 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 2.12.0 Reporter: David Knupp Assignee: Alexander Behm Presumably this test has been passing when run against the local mini-cluster. When run against an external cluster, however, the test fails with an AssertionError because the exception string is different than expected. The expected string is: _ImpalaBeeswaxException: INNER EXCEPTION: MESSAGE: Rejected query from pool {color:red}default-pool{color}: minimum memory reservation is greater than memory available to the query for buffer reservations. Increase the buffer_pool_limit to 290.00 MB. See the query profile for more information about the per-node memory requirements._ The actual string is: ImpalaBeeswaxException: INNER EXCEPTION: MESSAGE: Rejected query from pool {color:red}root.jenkins{color}: minimum memory reservation is greater than memory available to the query for buffer reservations. Increase the buffer_pool_limit to 290.00 MB. See the query profile for more information about the per-node memory requirements. {noformat} Stacktrace query_test/test_runtime_filters.py:168: in test_row_filters test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS' : str(WAIT_TIME_MS)}) common/impala_test_suite.py:401: in run_test_case self.__verify_exceptions(test_section['CATCH'], str(e), use_db) common/impala_test_suite.py:279: in __verify_exceptions (expected_str, actual_str) E AssertionError: Unexpected exception string. Expected: ImpalaBeeswaxException: INNER EXCEPTION: MESSAGE: Rejected query from pool default-pool: minimum memory reservation is greater than memory available to the query for buffer reservations. Increase the buffer_pool_limit to 290.00 MB. 
See the query profile for more information about the per-node memory requirements. E Not found in actual: ImpalaBeeswaxException: INNER EXCEPTION: MESSAGE: Rejected query from pool root.jenkins: minimum memory reservation is greater than memory available to the query for buffer reservations. Increase the buffer_pool_limit to 290.00 MB. See the query profile for more information about the per-node memory requirements. {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
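One way to make the test pool-agnostic (a sketch; the actual fix may differ) is to match the rejection message with any pool name rather than the literal {{default-pool}}:

```python
import re

# Sketch: accept any pool name (\S+) in the expected rejection message,
# so default-pool (mini-cluster) and root.jenkins (external cluster)
# both satisfy the check.
expected = re.compile(
    r"Rejected query from pool \S+: minimum memory reservation is "
    r"greater than memory available")

# Message seen when running against an external cluster.
actual = ("ImpalaBeeswaxException: INNER EXCEPTION: MESSAGE: Rejected query "
          "from pool root.jenkins: minimum memory reservation is greater "
          "than memory available to the query for buffer reservations.")
assert expected.search(actual)
```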
[jira] [Resolved] (IMPALA-6753) Update external Hadoop ecosystem versions
[ https://issues.apache.org/jira/browse/IMPALA-6753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-6753. - Resolution: Fixed > Update external Hadoop ecosystem versions > - > > Key: IMPALA-6753 > URL: https://issues.apache.org/jira/browse/IMPALA-6753 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Reporter: David Knupp >Priority: Major > > Analogous to IMPALA-6272 > Updating the external Hadoop components on the mini-cluster. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-6808) Impala python code should be installed into infra/python/env as packages
David Knupp created IMPALA-6808: --- Summary: Impala python code should be installed into infra/python/env as packages Key: IMPALA-6808 URL: https://issues.apache.org/jira/browse/IMPALA-6808 Project: IMPALA Issue Type: Improvement Components: Clients, Infrastructure Affects Versions: Impala 3.0, Impala 2.12.0 Reporter: David Knupp Assignee: David Knupp Impala/infra/python/env is the environment where necessary upstream python libraries and packages get installed -- e.g., the packages listed in https://github.com/apache/impala/blob/master/infra/python/deps/requirements.txt and other similar files. Impala's own internal python code (like the impala-shell, or the common test libraries that we rely upon) should be made available the same way -- as actual packages installed into the environment -- rather than resorting to PYTHONPATH/sys.path.append(foo) sleight-of-hand, performed by scripts such as bin/set-pythonpath.sh. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-6753) Update external Hadoop ecosystem versions
David Knupp created IMPALA-6753: --- Summary: Update external Hadoop ecosystem versions Key: IMPALA-6753 URL: https://issues.apache.org/jira/browse/IMPALA-6753 Project: IMPALA Issue Type: Improvement Components: Infrastructure Reporter: David Knupp Analogous to IMPALA-6272 Updating the external Hadoop components on the mini-cluster. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (IMPALA-6716) ImpalaShell should not rely on global access to parsed command line options
[ https://issues.apache.org/jira/browse/IMPALA-6716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp closed IMPALA-6716. --- Resolution: Fixed Fix Version/s: Impala 3.0 > ImpalaShell should not rely on global access to parsed command line options > --- > > Key: IMPALA-6716 > URL: https://issues.apache.org/jira/browse/IMPALA-6716 > Project: IMPALA > Issue Type: Bug > Components: Clients >Affects Versions: Impala 3.0, Impala 2.12.0 >Reporter: David Knupp >Assignee: David Knupp >Priority: Major > Fix For: Impala 3.0 > > > A recent patch to address a problem with line breaks in LDAP passwords > (IMPALA-6610) can, in rare instances (e.g., when running the shell as an > installed python package), result in an exception being thrown if the call to > {{_connect()}} fails. > {noformat} > $ impala-shell -i foo > Starting Impala Shell without Kerberos authentication > Traceback (most recent call last): > File "/home/systest/shellenv/bin/impala-shell", line 11, in > load_entry_point('impala-shell', 'console_scripts', 'impala-shell')() > File > "/home/systest/Impala/shell/packaging/staging/impala_shell/impala_shell.py", > line 1588, in main > shell = ImpalaShell(options, query_options) > File > "/home/systest/Impala/shell/packaging/staging/impala_shell/impala_shell.py", > line 209, in __init__ > self.do_connect(options.impalad) > File > "/home/systest/Impala/shell/packaging/staging/impala_shell/impala_shell.py", > line 755, in do_connect > self._connect() > File > "/home/systest/Impala/shell/packaging/staging/impala_shell/impala_shell.py", > line 821, in _connect > if options.ldap_password_cmd and \ > NameError: global name 'options' is not defined > {noformat} > The error is actually in the connection failure handling code: > https://github.com/apache/impala/blob/master/shell/impala_shell.py#L821 > The problem is that the shell instance should not assume continued access to > the options returned from {{parser.parse_args()}}. In most cases, we store
those values directly as member variables of the shell. We should do the same > with all LDAP-related values, and then access those member variables. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-6716) ImpalaShell should not rely on global access to parsed command line options
David Knupp created IMPALA-6716: --- Summary: ImpalaShell should not rely on global access to parsed command line options Key: IMPALA-6716 URL: https://issues.apache.org/jira/browse/IMPALA-6716 Project: IMPALA Issue Type: Bug Components: Clients Affects Versions: Impala 3.0, Impala 2.12.0 Reporter: David Knupp A recent patch to address a problem with line breaks in LDAP passwords (IMPALA-6610) can, in rare instances, result in an exception being thrown if the call to {{_connect()}} fails. {noformat} $ impala-shell -i foo Starting Impala Shell without Kerberos authentication Traceback (most recent call last): File "/home/systest/shellenv/bin/impala-shell", line 11, in load_entry_point('impala-shell', 'console_scripts', 'impala-shell')() File "/home/systest/Impala/shell/packaging/staging/impala_shell/impala_shell.py", line 1588, in main shell = ImpalaShell(options, query_options) File "/home/systest/Impala/shell/packaging/staging/impala_shell/impala_shell.py", line 209, in __init__ self.do_connect(options.impalad) File "/home/systest/Impala/shell/packaging/staging/impala_shell/impala_shell.py", line 755, in do_connect self._connect() File "/home/systest/Impala/shell/packaging/staging/impala_shell/impala_shell.py", line 821, in _connect if options.ldap_password_cmd and \ NameError: global name 'options' is not defined {noformat} The error is actually in the connection failure handling code. The problem is that the shell instance should not assume continued access to the options returned from {{parser.parse_args()}}. In most cases, we store those values directly as member variables of the shell. We should do the same with all LDAP-related values, and then access those member variables. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
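A minimal sketch of the fix this report suggests: copy the LDAP-related values out of the parsed options into member variables at construction time, so the failure-handling path in {{_connect()}} never touches the module-level {{options}} name. Attribute and flag names below are illustrative, not Impala's actual ones.

```python
# Sketch of the suggested fix: store parsed option values on the shell
# instance instead of relying on a global 'options' namespace.
import argparse

class ImpalaShell(object):
    def __init__(self, options):
        # Keep what we need as member variables; drop the global reference.
        self.impalad = options.impalad
        self.use_ldap = options.use_ldap
        self.ldap_password_cmd = options.ldap_password_cmd

    def _connect(self):
        # References self.*, so it works even after the parsed namespace
        # is no longer reachable (e.g. when run as an installed package).
        if self.use_ldap and self.ldap_password_cmd:
            return "re-prompting via ldap_password_cmd"
        return "connecting"

parser = argparse.ArgumentParser()
parser.add_argument("-i", "--impalad", default="localhost:21000")
parser.add_argument("-l", "--use_ldap", action="store_true")
parser.add_argument("--ldap_password_cmd", default="")
opts = parser.parse_args([])

shell = ImpalaShell(opts)
del opts  # even with the namespace gone, the shell still works
print(shell._connect())  # -> connecting
```

The NameError in the traceback disappears because every value the error path needs was captured in `__init__`.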
[jira] [Created] (IMPALA-6702) Consider using standard pip client to download python dependencies
David Knupp created IMPALA-6702: --- Summary: Consider using standard pip client to download python dependencies Key: IMPALA-6702 URL: https://issues.apache.org/jira/browse/IMPALA-6702 Project: IMPALA Issue Type: Improvement Components: Infrastructure Affects Versions: Impala 3.0, Impala 2.12.0 Reporter: David Knupp Impala currently uses a hand-rolled client to download python dependencies: [https://github.com/apache/impala/blob/master/infra/python/deps/pip_download.py] This client skips the install step, adds automatic retries, and avoids trying to use wheel packages. However, the standard pip client does all of these things as well. Sometimes, upstream changes to PyPI can cause this custom client to break, most recently in IMPALA-6682 and IMPALA-6695. Perhaps Impala should consider dropping pip_download.py in favor of using the public pip client. (Kudu-python presents a problem though – see below.) A quick test did show that there were some minor differences. Using pip_download.py vs. pip v9.0.2 to process the various requirements.txt files at [https://github.com/apache/impala/tree/master/infra/python/deps]. The pip command used was: {noformat} $ pip download --dest=$IMPALA_HOME/infra/python/deps --no-binary=:all: --no-deps -r $IMPALA_HOME/infra/python/deps/*requirements.txt {noformat} For requirements.txt: * pip v9.0.2 ** Ignores readline: markers 'sys_platform == "darwin"' don't match your environment ** Downloads prettytable-0.7.2.zip ** Downloads pyparsing-2.0.3.zip * pip_download.py ** Downloads readline-6.2.4.1.tar.gz ** Downloads prettytable-0.7.2.tar.bz2 ** Downloads pyparsing-2.0.3.tar.gz For compiled-requirements.txt * pip v9.0.2 ** Downloads Cython-0.23.4.zip ** Downloads numpy-1.10.4.zip * pip_download.py ** Downloads Cython-0.23.4.tar.gz ** Downloads numpy-1.10.4.tar.gz For adls-requirements.txt * no difference Unfortunately, the kudu-requirements.txt, which only contains one dependency ({{kudu-python==1.2.0}}), is problematic. 
Even when using the {{download}} command with pip, setup.py tries to install the package: {noformat} Using cached kudu-python-1.2.0.tar.gz Saved ./kudu-python-1.2.0.tar.gz Complete output from command python setup.py egg_info: Cannot find installed kudu client. Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-N1Mu7y/kudu-python/ {noformat} We would need to either figure out why this is happening (maybe it's a bug in the Kudu setup.py file), or else find a workaround for this one package. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
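A hedged sketch of what dropping pip_download.py in favor of the standard pip client might look like. The flags are the ones from the command quoted in this report plus pip's own {{--retries}} option; the kudu-python skip list stands in for the per-package workaround the report says would still be needed. Nothing here is Impala's actual build code.

```python
# Build (but don't run) standard-pip replacements for pip_download.py.
import os

DEPS_DIR = os.path.join(os.environ.get("IMPALA_HOME", "."),
                        "infra", "python", "deps")
# setup.py egg_info fails without a Kudu client installed, so this one
# requirements file still needs special handling.
SKIP = {"kudu-requirements.txt"}

def pip_download_cmd(req_file):
    """Return the pip invocation for one requirements file. Returning the
    argv list (instead of running it) keeps the sketch testable offline."""
    return [
        "pip", "download",
        "--dest=" + DEPS_DIR,
        "--no-binary=:all:",   # source distributions only, like pip_download.py
        "--no-deps",
        "--retries", "5",      # pip retries failed connections on its own
        "-r", os.path.join(DEPS_DIR, req_file),
    ]

for req in ("requirements.txt", "compiled-requirements.txt",
            "adls-requirements.txt", "kudu-requirements.txt"):
    if req in SKIP:
        print("skipping %s (needs a workaround)" % req)
        continue
    print(" ".join(pip_download_cmd(req)))
    # subprocess.check_call(pip_download_cmd(req)) would run it for real
```

The built-in retries would replace the hand-rolled retry loop, and `--no-binary=:all:` preserves the source-tarball-only behavior noted in the comparison above.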
[jira] [Created] (IMPALA-6570) Remote cluster test fails on SLES12/SP3 -- "Kudu not supported on this operating system"
David Knupp created IMPALA-6570: --- Summary: Remote cluster test fails on SLES12/SP3 -- "Kudu not supported on this operating system" Key: IMPALA-6570 URL: https://issues.apache.org/jira/browse/IMPALA-6570 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 2.12.0 Reporter: David Knupp Assignee: Thomas Tauber-Marshall The cluster has Kudu installed. When trying to load the standard data for the Impala functional test suite onto the cluster, I see: {noformat} INSERT INTO TABLE tpch_kudu.lineitem SELECT * FROM tpch.lineitem Data Loading from Impala failed with error: ImpalaBeeswaxException: INNER EXCEPTION: MESSAGE: [Errno 104] Connection reset by peer Traceback (most recent call last): File "/data/jenkins/workspace/impala_private_remote_cluster_tests/impala_remote_cluster_tests/Impala-local/bin/load-data.py", line 179, in exec_impala_query_from_file result = impala_client.execute(query) File "/data/jenkins/workspace/impala_private_remote_cluster_tests/impala_remote_cluster_tests/Impala-local/tests/beeswax/impala_beeswax.py", line 173, in execute handle = self.__execute_query(query_string.strip(), user=user) File "/data/jenkins/workspace/impala_private_remote_cluster_tests/impala_remote_cluster_tests/Impala-local/tests/beeswax/impala_beeswax.py", line 341, in __execute_query self.wait_for_completion(handle) File "/data/jenkins/workspace/impala_private_remote_cluster_tests/impala_remote_cluster_tests/Impala-local/tests/beeswax/impala_beeswax.py", line 353, in wait_for_completion query_state = self.get_state(query_handle) File "/data/jenkins/workspace/impala_private_remote_cluster_tests/impala_remote_cluster_tests/Impala-local/tests/beeswax/impala_beeswax.py", line 370, in get_state return self.__do_rpc(lambda: self.imp_service.get_state(query_handle)) File "/data/jenkins/workspace/impala_private_remote_cluster_tests/impala_remote_cluster_tests/Impala-local/tests/beeswax/impala_beeswax.py", line 467, in __do_rpc raise 
ImpalaBeeswaxException(self.__build_error_message(u), u) ImpalaBeeswaxException: ImpalaBeeswaxException: INNER EXCEPTION: MESSAGE: [Errno 104] Connection reset by peer {noformat} In /var/log/impalad/impalad.INFO, I see: {noformat} I0222 21:47:03.208997 9522 init.cc:241] Process ID: 9522 I0222 21:47:04.432262 9522 status.cc:53] Kudu is not supported on this operating system. @ 0x961162 impala::Status::Status() @ 0x9645c5 impala::CheckKuduAvailability() @ 0x964602 impala::KuduIsAvailable() @ 0x95c761 impala::InitCommonRuntime() @ 0xbce0b3 ImpaladMain() @ 0x8e8523 main @ 0x7f70fdce96e5 __libc_start_main @ 0x92f529 _start {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IMPALA-6486) INVALIDATE METADATA may hang after statestore restart
[ https://issues.apache.org/jira/browse/IMPALA-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-6486. - Resolution: Fixed Fix Version/s: Impala 2.12.0 Confirmed via manual test that the patch was successful. > INVALIDATE METADATA may hang after statestore restart > - > > Key: IMPALA-6486 > URL: https://issues.apache.org/jira/browse/IMPALA-6486 > Project: IMPALA > Issue Type: Bug > Components: Catalog >Affects Versions: Impala 2.11.0 >Reporter: Dimitris Tsirogiannis >Assignee: Dimitris Tsirogiannis >Priority: Blocker > Labels: hang > Fix For: Impala 2.12.0 > > > In some cases, INVALIDATE METADATA may hang if it is run after a statestore > restart. This was caused by the fix for IMPALA-5058. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-6350) "split-hbase.sh failed during data load for CDH5.14.0 exhaustive"
David Knupp created IMPALA-6350: --- Summary: "split-hbase.sh failed during data load for CDH5.14.0 exhaustive" Key: IMPALA-6350 URL: https://issues.apache.org/jira/browse/IMPALA-6350 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 2.11.0 Reporter: David Knupp This affects test setup, but not the product. {noformat} 21:24:39 Splitting HBase (logging to /data/jenkins/workspace/impala-cdh5-2.11.0_5.14.0-exhaustive/repos/Impala/logs/data_loading/create-hbase.log)... 21:27:32 FAILED (Took: 2 min 53 sec) 21:27:32 '/data/jenkins/workspace/impala-cdh5-2.11.0_5.14.0-exhaustive/repos/Impala/testdata/bin/split-hbase.sh' failed. Tail of log: 21:27:32at org.apache.hadoop.hbase.client.HBaseAdmin.split(HBaseAdmin.java:2733) 21:27:32at org.apache.hadoop.hbase.client.HBaseAdmin.splitRegion(HBaseAdmin.java:2693) 21:27:32at org.apache.hadoop.hbase.client.HBaseAdmin.split(HBaseAdmin.java:2714) 21:27:32at org.apache.hadoop.hbase.client.HBaseAdmin.split(HBaseAdmin.java:2703) 21:27:32at org.apache.impala.datagenerator.HBaseTestDataRegionAssigment.performAssigment(HBaseTestDataRegionAssigment.java:112) 21:27:32at org.apache.impala.datagenerator.HBaseTestDataRegionAssigment.main(HBaseTestDataRegionAssigment.java:312) 21:27:32 Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.NotServingRegionException): org.apache.hadoop.hbase.NotServingRegionException: Region functional_hbase.alltypessmall,1,1513574812814.1d8e718f14ccea2c2f5c6724f023558d. 
is not online on localhost,16201,1513565166378 21:27:32at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2997) 21:27:32at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1055) 21:27:32at org.apache.hadoop.hbase.regionserver.RSRpcServices.splitRegion(RSRpcServices.java:1853) 21:27:32at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:22247) 21:27:32at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2191) 21:27:32at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112) 21:27:32at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:183) 21:27:32at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:163) 21:27:32 21:27:32at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1272) 21:27:32at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227) 21:27:32at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336) 21:27:32at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.splitRegion(AdminProtos.java:23173) 21:27:32at org.apache.hadoop.hbase.protobuf.ProtobufUtil.split(ProtobufUtil.java:1908) 21:27:32... 6 more {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Closed] (IMPALA-4641) Loading tpch nested test data to a remote cluster silently fails
[ https://issues.apache.org/jira/browse/IMPALA-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp closed IMPALA-4641. --- Resolution: Cannot Reproduce > Loading tpch nested test data to a remote cluster silently fails > > > Key: IMPALA-4641 > URL: https://issues.apache.org/jira/browse/IMPALA-4641 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Affects Versions: Impala 2.7.0 >Reporter: David Knupp >Priority: Critical > Labels: test-infra > > Running the Impala data load scripts doesn't always produce the same results > on a remote cluster as on the local mini-cluster. In this case, > {{tpch_nested_parquet}} data is never loaded. > {noformat} > [impala-debian78-test-cluster-4.vpc.cloudera.com:21000] > show table stats > tpch_nested_parquet.supplier; > Query: show table stats tpch_nested_parquet.supplier > +---++--+--+---+-+---+-+ > | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | > Incremental stats | Location > | > +---++--+--+---+-+---+-+ > | 0 | 1 | 356B | NOT CACHED | NOT CACHED| PARQUET | false >| > hdfs://impala-debian78-test-cluster-1.vpc.cloudera.com:8020/user/hive/warehouse/tpch_nested_parquet.db/supplier > | > +---++--+--+---+-+---+-+ > Fetched 1 row(s) in 0.01s > {noformat} > Compare this to the local minicluster, after running data load. > {noformat} > [localhost.localdomain:21000] > show table stats tpch_nested_parquet.supplier; > Query: show table stats tpch_nested_parquet.supplier > +---++-+--+---+-+---+---+ > | #Rows | #Files | Size| Bytes Cached | Cache Replication | Format | > Incremental stats | Location > | > +---++-+--+---+-+---+---+ > | 1 | 1 | 43.00MB | NOT CACHED | NOT CACHED| PARQUET | > false | > hdfs://localhost:20500/test-warehouse/tpch_nested_parquet.db/supplier | > +---++-+--+---+-+---+---+ > Fetched 1 row(s) in 4.90s > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-6341) Cascade of test failures during ASAN build, probably related to Rpc exception: N6apache6thrift9transport19TTransportExceptionE
David Knupp created IMPALA-6341: --- Summary: Cascade of test failures during ASAN build, probably related to Rpc exception: N6apache6thrift9transport19TTransportExceptionE Key: IMPALA-6341 URL: https://issues.apache.org/jira/browse/IMPALA-6341 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 2.12.0 Reporter: David Knupp Priority: Critical Several build failures were seen during a recent ASAN test run, probably stemming from initial error: {noformat} query_test/test_queries.py:62: in test_analytic_fns self.run_test_case('QueryTest/analytic-fns', vector) common/impala_test_suite.py:395: in run_test_case result = self.__execute_query(target_impalad_client, query, user=user) common/impala_test_suite.py:610: in __execute_query return impalad_client.execute(query, user=user) common/impala_connection.py:160: in execute return self.__beeswax_client.execute(sql_stmt, user=user) beeswax/impala_beeswax.py:173: in execute handle = self.__execute_query(query_string.strip(), user=user) beeswax/impala_beeswax.py:339: in __execute_query handle = self.execute_query_async(query_string, user=user) beeswax/impala_beeswax.py:335: in execute_query_async return self.__do_rpc(lambda: self.imp_service.query(query,)) beeswax/impala_beeswax.py:460: in __do_rpc raise ImpalaBeeswaxException(self.__build_error_message(b), b) E ImpalaBeeswaxException: ImpalaBeeswaxException: EINNER EXCEPTION: EMESSAGE: ExecQueryFInstances rpc query_id=834130b14e576a27:6b6389e0 failed: RPC Error: Client for ec2-m2-4xlarge-centos-6-4-1d24.vpc.cloudera.com:22002 hit an unexpected exception: ECONNRESET, type: N6apache6thrift9transport19TTransportExceptionE, rpc: N6impala26TExecQueryFInstancesResultE, send: done {noformat} Possibly the same as IMPALA-5692 or IMPALA-5999? Not sure because there was no minidump this time. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-6338) test_profile_fragment_instances failing on Isilon build
David Knupp created IMPALA-6338: --- Summary: test_profile_fragment_instances failing on Isilon build Key: IMPALA-6338 URL: https://issues.apache.org/jira/browse/IMPALA-6338 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 2.12.0 Reporter: David Knupp Priority: Critical Stack trace: {noformat} query_test/test_observability.py:123: in test_profile_fragment_instances assert results.runtime_profile.count("HDFS_SCAN_NODE") == 12 E assert 11 == 12 E+ where 11 = ('HDFS_SCAN_NODE') E+where = 'Query (id=ae4cee91aafc5c6c:11b545c6):\n DEBUG MODE WARNING: Query profile created while running a DEBUG buil...ontextSwitches: 0 (0)\n - TotalRawHdfsReadTime(*): 5s784ms\n - TotalReadThroughput: 17.33 MB/sec\n'.count E+ where 'Query (id=ae4cee91aafc5c6c:11b545c6):\n DEBUG MODE WARNING: Query profile created while running a DEBUG buil...ontextSwitches: 0 (0)\n - TotalRawHdfsReadTime(*): 5s784ms\n - TotalReadThroughput: 17.33 MB/sec\n' = .runtime_profile {noformat} Query: {noformat} with l as (select * from tpch.lineitem UNION ALL select * from tpch.lineitem) select STRAIGHT_JOIN count(*) from (select * from tpch.lineitem a LIMIT 1) a join (select * from l LIMIT 200) b on a.l_orderkey = -b.l_orderkey; {noformat} Summary: {noformat} Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. 
Peak Mem Detail 05:AGGREGATE 1 0.000ns 0.000ns 1 1 28.00 KB 10.00 MB FINALIZE 04:HASH JOIN 1 15.000ms 15.000ms 0 1 141.06 MB 17.00 MB INNER JOIN, BROADCAST |--08:EXCHANGE1 4s153ms 4s153ms 2.00M 2.00M 0 0 UNPARTITIONED | 07:EXCHANGE1 3s783ms 3s783ms 2.00M 2.00M 0 0 UNPARTITIONED | 01:UNION 3 17.000ms 28.001ms 3.03M 2.00M 0 0 | |--03:SCAN HDFS3 0.000ns 0.000ns 0 6.00M 0 176.00 MB tpch.lineitem | 02:SCAN HDFS 3 6s133ms 6s948ms 3.03M 6.00M 24.02 MB 176.00 MB tpch.lineitem 06:EXCHANGE 1 5s655ms 5s655ms 1 1 0 0 UNPARTITIONED 00:SCAN HDFS 3 4s077ms 6s207ms 2 1 16.05 MB 176.00 MB tpch.lineitem a {noformat} Plan: {noformat} Max Per-Host Resource Reservation: Memory=17.00MB Per-Host Resource Estimates: Memory=379.00MB F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 | Per-Host Resources: mem-estimate=27.00MB mem-reservation=17.00MB PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B | 05:AGGREGATE [FINALIZE] | output: count(*) | mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB | tuple-ids=7 row-size=8B cardinality=1 | 04:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: a.l_orderkey = -1 * l_orderkey | fk/pk conjuncts: assumed fk/pk | runtime filters: RF000[bloom] <- -1 * l_orderkey | mem-estimate=17.00MB mem-reservation=17.00MB spill-buffer=1.00MB | tuple-ids=0,4 row-size=16B cardinality=1 | |--08:EXCHANGE [UNPARTITIONED] | | mem-estimate=0B mem-reservation=0B | | tuple-ids=4 row-size=8B cardinality=200 | | | F05:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 | Per-Host Resources: mem-estimate=0B mem-reservation=0B | 07:EXCHANGE [UNPARTITIONED] | | limit: 200 | | mem-estimate=0B mem-reservation=0B | | tuple-ids=4 row-size=8B cardinality=200 | | | F04:PLAN FRAGMENT [RANDOM] hosts=3 instances=3 | Per-Host Resources: mem-estimate=176.00MB mem-reservation=0B | 01:UNION | | pass-through-operands: all | | limit: 200 | | mem-estimate=0B mem-reservation=0B | | tuple-ids=4 row-size=8B cardinality=200 | | | |--03:SCAN HDFS [tpch.lineitem, RANDOM] | | 
partitions=1/1 files=1 size=718.94MB | | stored statistics: | | table: rows=6001215 size=718.94MB | | columns: all | | extrapolated-rows=disabled | | mem-estimate=176.00MB mem-reservation=0B | | tuple-ids=3 row-size=8B cardinality=6001215 | | | 02:SCAN HDFS [tpch.lineitem, RANDOM] | partitions=1/1 files=1 size=718.94MB | stored statistics: | table: rows=6001215 size=718.94MB | columns: all | extrapolated-rows=disabled | mem-estimate=176.00MB mem-reservation=0B | tuple-ids=2 row-size=8B cardinality=6001215 | 06:EXCHANGE [UNPARTITIONED] | limit: 1 | mem-estimate=0B mem-reservation=0B | tuple-ids=0 row-size=8B cardinality=1 | F00:P
[jira] [Created] (IMPALA-6334) test_compute_stats_tablesample failing on Isilon builds
David Knupp created IMPALA-6334: --- Summary: test_compute_stats_tablesample failing on Isilon builds Key: IMPALA-6334 URL: https://issues.apache.org/jira/browse/IMPALA-6334 Project: IMPALA Issue Type: Bug Affects Versions: Impala 2.12.0 Reporter: David Knupp Priority: Critical MainThread: Comparing QueryTestResults (expected vs actual): Expected: {noformat} 3660,3660,12,regex:.*B,'NOT CACHED','NOT CACHED','TEXT','false','hdfs://10.17.95.12:8020/test-warehouse/test_compute_stats_tablesample_16dd5daf.db/alltypesnopart' {noformat} Actual: {noformat} 3661,3661,12,'238.68KB','NOT CACHED','NOT CACHED','TEXT','false','hdfs://10.17.95.12:8020/test-warehouse/test_compute_stats_tablesample_16dd5daf.db/alltypesnopart' {noformat} Stacktrace {noformat} self = vector = unique_database = 'test_compute_stats_tablesample_16dd5daf' @CustomClusterTestSuite.with_args(impalad_args=('--enable_stats_extrapolation=true')) def test_compute_stats_tablesample(self, vector, unique_database): # Create a partitioned and unpartitioned text table. Use the existing files from # functional.alltypes as data because those have a known, stable file size. This # test is sensitive to changes in file sizes across test runs because the sampling # is file based. Creating test tables with INSERT does not guarantee that the same # file sample is selected across test runs, even with REPEATABLE. # Create partitioned test table. External to avoid dropping files from alltypes. part_test_tbl = unique_database + ".alltypes" self.client.execute( "create external table %s like functional.alltypes" % part_test_tbl) alltypes_loc = self._get_table_location("functional.alltypes", vector) for m in xrange(1, 13): part_loc = path.join(alltypes_loc, "year=2009/month=%s" % m) self.client.execute( "alter table %s add partition (year=2009,month=%s) location '%s'" % (part_test_tbl, m, part_loc)) # Create unpartitioned test table. 
nopart_test_tbl = unique_database + ".alltypesnopart" self.client.execute("drop table if exists %s" % nopart_test_tbl) self.client.execute( "create table %s like functional.alltypesnopart" % nopart_test_tbl) nopart_test_tbl_loc = self._get_table_location(nopart_test_tbl, vector) # Remove NameNode prefix and first '/' because PyWebHdfs expects that if nopart_test_tbl_loc.startswith(NAMENODE): nopart_test_tbl_loc = nopart_test_tbl_loc[len(NAMENODE)+1:] for m in xrange(1, 13): src_part_loc = alltypes_loc + "/year=2009/month=%s" % m # Remove NameNode prefix and first '/' because PyWebHdfs expects that if src_part_loc.startswith(NAMENODE): src_part_loc = src_part_loc[len(NAMENODE)+1:] file_names = self.filesystem_client.ls(src_part_loc) for f in file_names: self.filesystem_client.copy(path.join(src_part_loc, f), path.join(nopart_test_tbl_loc, f)) self.client.execute("refresh %s" % nopart_test_tbl) > self.run_test_case('QueryTest/compute-stats-tablesample', vector, > unique_database) custom_cluster/test_stats_extrapolation.py:84: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ common/impala_test_suite.py:424: in run_test_case self.__verify_results_and_errors(vector, test_section, result, use_db) common/impala_test_suite.py:297: in __verify_results_and_errors replace_filenames_with_placeholder) common/test_result_verifier.py:404: in verify_raw_results VERIFIER_MAP[verifier](expected, actual) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ expected_results = actual_results = def verify_query_result_is_equal(expected_results, actual_results): assert_args_not_none(expected_results, actual_results) > assert expected_results == actual_results E assert Comparing QueryTestResults (expected vs actual): E 3660,3660,12,regex:.*B,'NOT CACHED','NOT CACHED','TEXT','false','hdfs://10.17.95.12:8020/test-warehouse/test_compute_stats_tablesample_16dd5daf.db/alltypesnopart' != 3661,3661,12,'238.68KB','NOT CACHED','NOT 
CACHED','TEXT','false','hdfs://10.17.95.12:8020/test-warehouse/test_compute_stats_tablesample_16dd5daf.db/alltypesnopart' common/test_result_verifier.py:231: AssertionError {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IMPALA-6317) Expose -cmake_only flag to buildall.sh
David Knupp created IMPALA-6317: --- Summary: Expose -cmake_only flag to buildall.sh Key: IMPALA-6317 URL: https://issues.apache.org/jira/browse/IMPALA-6317 Project: IMPALA Issue Type: Improvement Components: Infrastructure Affects Versions: Impala 2.11.0 Reporter: David Knupp Priority: Minor Impala/bin/make_impala.sh has a {{-cmake_only}} command line option: {noformat} -cmake_only) CMAKE_ONLY=1 {noformat} Passing this flag means that only makefiles will be generated during the build. However, this flag is not exposed in buildall.sh (the caller of make_impala.sh), which effectively renders it useless. If one has no intention of running the Impala cluster locally (e.g., when building just enough of the toolchain and dev environment to run the data load scripts against a remote cluster), being able to generate only the makefiles is useful. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
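A hedged sketch of how buildall.sh might forward a {{-cmake_only}} flag down to make_impala.sh, mirroring the case statement quoted in the report. The real buildall.sh option parsing may differ; this only illustrates the plumbing.

```shell
# Hypothetical flag-forwarding in buildall.sh (illustrative names only).
CMAKE_ONLY=""

parse_args() {
  for arg in "$@"; do
    case "$arg" in
      -cmake_only) CMAKE_ONLY="-cmake_only" ;;  # generate makefiles only
    esac
  done
}

parse_args -cmake_only
# Later, where buildall.sh invokes the compile step, pass the flag through:
echo "bin/make_impala.sh ${CMAKE_ONLY}"
```

With this in place, `./buildall.sh -cmake_only` would stop after cmake generates the makefiles, which is all that remote-cluster data loading needs.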
[jira] [Resolved] (IMPALA-6306) Impalad becomes unreachable trying to load tpch_nested_parquet data
[ https://issues.apache.org/jira/browse/IMPALA-6306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp resolved IMPALA-6306. - Resolution: Not A Bug > Impalad becomes unreachable trying to load tpch_nested_parquet data > --- > > Key: IMPALA-6306 > URL: https://issues.apache.org/jira/browse/IMPALA-6306 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 2.11.0 >Reporter: David Knupp > > I've been trying (unsuccessfully) to create the tpch_nested_parquet database > on a remote cluster using Impala's standard data load scripts. I finally > confirmed that I can complete the data load process if I simply comment out > the lines from {{testdata/bin/create-load-data.sh}} that calls load_nested.py: > {noformat} > # run-step "Loading nested data" load-nested.log \ > # ${IMPALA_HOME}/testdata/bin/load_nested.py ${LOAD_NESTED_ARGS:-} > {noformat} > With the all of the other data completely loaded, I tried to run > load_nested.py by hand, and saw this error: > {noformat} > systest@remote-joe:~$ Impala/testdata/bin/load_nested.py --cm-host > impala-dataload-testing-1.vpc.cloudera.com > 2017-12-10 13:45:12,663 INFO:db_connection[234]:Creating database > tpch_nested_parquet > 2017-12-10 13:45:12,965 INFO:load_nested[98]:Creating temp orders (chunk 1 of > 1) > 2017-12-10 13:45:33,724 INFO:load_nested[128]:Creating temp customers (chunk > 1 of 1) > Traceback (most recent call last): > File "Impala/testdata/bin/load_nested.py", line 320, in > load() > File "Impala/testdata/bin/load_nested.py", line 130, in load > impala.execute("CREATE TABLE tmp_customer_string AS " + tmp_customer_sql) > File "/data1/systest/Impala/tests/comparison/db_connection.py", line 206, > in execute > return self._cursor.execute(sql, *args, **kwargs) > File > "/data1/systest/Impala/infra/python/env/local/lib/python2.7/site-packages/impala/hiveserver2.py", > line 304, in execute > self._wait_to_finish() # make execute synchronous > File > 
"/data1/systest/Impala/infra/python/env/local/lib/python2.7/site-packages/impala/hiveserver2.py", > line 380, in _wait_to_finish > raise OperationalError(resp.errorMessage) > impala.error.OperationalError: Cancelled due to unreachable impalad(s): > impala-dataload-testing-2.vpc.cloudera.com:22000 > {noformat} > From the impalad log: > {noformat} > I1210 13:45:12.356262 17040 Frontend.java:909] Compiling query: DESCRIBE > tpch_nested_parquet.part > I1210 13:45:12.358700 17040 Frontend.java:948] Compiled query. > I1210 13:45:12.358832 17040 jni-util.cc:211] > org.apache.impala.common.AnalysisException: Could not resolve path: > 'tpch_nested_parquet.part' > at org.apache.impala.analysis.Analyzer.resolvePath(Analyzer.java:800) > at org.apache.impala.analysis.Analyzer.resolvePath(Analyzer.java:753) > at > org.apache.impala.analysis.DescribeTableStmt.analyze(DescribeTableStmt.java:106) > at > org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:388) > at > org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:369) > at org.apache.impala.service.Frontend.analyzeStmt(Frontend.java:920) > at > org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1069) > at > org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:156) > I1210 13:45:12.371624 17040 status.cc:125] AnalysisException: Could not > resolve path: 'tpch_nested_parquet.part' > @ 0x9597f9 impala::Status::Status() > @ 0xc9df62 impala::JniUtil::GetJniExceptionMsg() > @ 0xba2a7b impala::Frontend::GetExecRequest() > @ 0xbc0558 impala::ImpalaServer::ExecuteInternal() > @ 0xbc6858 impala::ImpalaServer::Execute() > @ 0xc2244e impala::ImpalaServer::ExecuteStatement() > @ 0x10a8326 > apache::hive::service::cli::thrift::TCLIServiceProcessor::process_ExecuteStatement() > @ 0x10a1f44 > apache::hive::service::cli::thrift::TCLIServiceProcessor::dispatchCall() > @ 0x929ecc apache::thrift::TDispatchProcessor::process() > @ 0xafa43f > 
apache::thrift::server::TAcceptQueueServer::Task::run() > @ 0xaf4d35 impala::ThriftThread::RunRunnable() > @ 0xaf5b12 > boost::detail::function::void_function_obj_invoker0<>::invoke() > @ 0xd10b63 impala::Thread::SuperviseThread() > @ 0xd112a4 boost::detail::thread_data<>::run() > @ 0x128afda (unknown) > @ 0x7f2a6a7a8e25 start_thread > @ 0x7f2a6a4d634d __clone > I1210 13:45:12.371713 17040 impala-server.cc:992] UnregisterQuery(): > query_id=2748b77529da2004:7602cd6d > I1210 13