[Impala-ASF-CR] IMPALA-9438 Implement atomic operations for aarch64
huangtianhua...@gmail.com has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15316 Change subject: IMPALA-9438 Implement atomic operations for aarch64 .. IMPALA-9438 Implement atomic operations for aarch64 Change-Id: I84e907c1cd9b09d3329e6c836d492dba5f49f5ae --- A be/src/gutil/atomicops-internals-arm64.h M be/src/gutil/atomicops.h 2 files changed, 370 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/15316/1 -- To view, visit http://gerrit.cloudera.org:8080/15316 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I84e907c1cd9b09d3329e6c836d492dba5f49f5ae Gerrit-Change-Number: 15316 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward
[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15132 ) Change subject: IMPALA-3343: Make impala-shell compatible with python 3. .. Patch Set 17: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5361/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15132 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462 Gerrit-Change-Number: 15132 Gerrit-PatchSet: 17 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 28 Feb 2020 02:33:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15284 ) Change subject: IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15284 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4153968551acd58b25c7923c2ebf75ee29a7e76b Gerrit-Change-Number: 15284 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 28 Feb 2020 02:03:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6689: Speed up point lookup for Kudu primary key
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15250 ) Change subject: IMPALA-6689: Speed up point lookup for Kudu primary key .. Patch Set 13: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4631cd4d1a528a1152b5cdcb268426f2ba1a0c08 Gerrit-Change-Number: 15250 Gerrit-PatchSet: 13 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 28 Feb 2020 02:03:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6689: Speed up point lookup for Kudu primary key
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15250 ) Change subject: IMPALA-6689: Speed up point lookup for Kudu primary key .. IMPALA-6689: Speed up point lookup for Kudu primary key If all primary key columns of the Kudu table are in equivalence predicates pushed down to Kudu, Kudu will return at most one row. In this case, we can adjust the cardinality estimation to speed up point lookup. This patch sets the input and output cardinality to 1 if the number of primary key columns in equivalence predicates pushed down to Kudu equals the total number of primary key columns of the Kudu table, which enables the small query optimization. Testing: - Added test cases in the following PlannerTest files: small-query-opt.test, disable-codegen.test and kudu.test. - Passed all FE tests, including new test cases. Change-Id: I4631cd4d1a528a1152b5cdcb268426f2ba1a0c08 Reviewed-on: http://gerrit.cloudera.org:8080/15250 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-planner/queries/PlannerTest/small-query-opt.test 4 files changed, 180 insertions(+), 4 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/15250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I4631cd4d1a528a1152b5cdcb268426f2ba1a0c08 Gerrit-Change-Number: 15250 Gerrit-PatchSet: 14 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Wenzhe Zhou
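The cardinality rule described in the IMPALA-6689 commit message can be sketched roughly as follows. This is only an illustration in Python, not the code in KuduScanNode.java; the function name, its parameters, and the fallback to a statistics-based estimate are all assumptions made for the example.

```python
def estimate_kudu_scan_cardinality(primary_key_cols, equality_predicate_cols,
                                   stats_cardinality):
    """Hypothetical helper: return 1 for a point lookup, else the stats estimate.

    primary_key_cols: names of all primary key columns of the table.
    equality_predicate_cols: columns that have an equality predicate pushed down.
    stats_cardinality: the estimate that would be used otherwise (assumed input).
    """
    pk = set(primary_key_cols)
    # Primary key columns that are pinned to a single value by a predicate.
    pinned_pk = pk & set(equality_predicate_cols)
    # If every primary key column is pinned, at most one row can match.
    if pk and len(pinned_pk) == len(pk):
        return 1
    return stats_cardinality
```

With a two-column key ("id", "ts"), predicates on both columns give an estimate of 1, while a predicate on "id" alone leaves the original estimate unchanged.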
[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.
Hello Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15132 to look at the new patch set (#17). Change subject: IMPALA-3343: Make impala-shell compatible with python 3. .. IMPALA-3343: Make impala-shell compatible with python 3. This patch makes the impala-shell code cross-compatible with python 2 and python 3. The goal is to wind up with a version of the shell that will pass python e2e tests irrespective of the version of python used to launch the shell, under the assumption that the test framework itself will continue to run with python 2.7.x. There are a few isolated tests that weren't able to pass under both versions, and the reasons have been documented in comments in the tests themselves. Notable changes for reviewers to consider: - With regard to validating the patch, my assumption is that simply passing the existing set of e2e shell tests is sufficient to confirm that the shell is functioning properly. No new tests were added. - Many of the simpler changes derive from the fact that a few built-in functions and/or types have either been removed or otherwise changed in python 3.x, e.g., xrange and basestring are both gone, dict.iteritems() has been removed, dict.items() behaves differently, the unicode() function and the method str.decode() have both been removed, etc. Also, catching exceptions using "Exception, e" no longer works, and (as most know), using print() as a function is required now. - A new pytest command line option was added in conftest.py to enable a user to specify a path to an alternate impala-shell executable to test. It's possible to use this to point to an instance of the impala-shell that was installed as a standalone python package in a separate virtualenv. Example usage: USE_THRIFT11_GEN_PY=true impala-py.test --shell_executable=//bin/impala-shell -sv shell/test_shell_commandline.py The target virtualenv may be based on either python3 or python2. 
However, this has no effect on the version of python used to run the test framework, which remains tied to python 2.7.x for the foreseeable future. - $IMPALA_HOME/bin/set-pythonpath.sh was updated to properly use the thrift-11 gen-py files if USE_THRIFT11_GEN_PY is set to "true". This is required for testing a version of the impala-shell in a python3-based virtualenv. - thrift_sasl.py was updated to match the current public alpha, 0.4a1 - The wording of the header changed a bit to include the python version used to run the shell. Starting Impala Shell with no authentication using Python 3.7.5 Opened TCP connection to localhost:21000 ... OR Starting Impala Shell with LDAP-based authentication using Python 2.7.12 Opened TCP connection to localhost:21000 ... - By far, the biggest hassle has been juggling str versus unicode versus bytes data types. Python 2.x was fairly loose and inconsistent in how it dealt with strings. As a quick demo of what I mean: Python 2.7.12 (default, Nov 12 2018, 14:36:49) [GCC 5.4.0 20160609] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> d = 'like a duck' >>> d == str(d) == bytes(d) == unicode(d) == d.encode('utf-8') == d.decode('utf-8') True ...and yet there are weird unexpected gotchas. >>> d.decode('utf-8') == d.encode('utf-8') True >>> d.encode('utf-8') == bytearray(d, 'utf-8') True >>> d.decode('utf-8') == bytearray(d, 'utf-8') # fails the eq property? False As a result of this, the way we handled strings in the impala-shell code had become equally loose and inconsistent -- mainly in the form of frequent and liberal use of str.encode() and str.decode() -- but things still just worked. In python3, there's a much clearer distinction between strings and bytes, and as such, much tighter type consistency is expected by standard libs like subprocess, re, sqlparse, prettytable, etc., which are used throughout the shell. 
Even simple calls that worked in python 2.x: >>> import re >>> re.findall('foo', b'foobar') ['foo'] ...can throw exceptions in python 3.x: >>> import re >>> re.findall('foo', b'foobar') Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/data0/systest/venvs/py3/lib/python3.7/re.py", line 223, in findall return _compile(pattern, flags).findall(string) TypeError: cannot use a string pattern on a bytes-like object Exceptions like this resulted in many, if not most, shell tests failing under python 3. At first, I tried to go one-by-one to the site of each failure, and correct by checking instance type and re-encoding as necessary, but this only led to even more str.encode() calls littering the code, which just seemed like a code-smell. (Wiki "code smell" if you don't know the term.) What ultimately seemed like a better approach w
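The message above is cut off in the archive before it describes the approach the patch actually took, so the following is not that approach. It is only a minimal Python 3 sketch of the str-versus-bytes mismatch shown in the traceback, together with one common way to avoid it: decode bytes to text once, at the boundary, before handing data to a library such as re. The helper name find_all_text is hypothetical.

```python
import re


def find_all_text(pattern, data, encoding="utf-8"):
    """Hypothetical helper: run re.findall with a text pattern on str or bytes.

    Bytes input is decoded once here, so callers do not need to sprinkle
    encode()/decode() calls around every regular-expression use.
    """
    if isinstance(data, bytes):
        data = data.decode(encoding)
    return re.findall(pattern, data)


# Reproduce the failure from the traceback: on Python 3, a str pattern
# cannot be applied to a bytes object.
try:
    re.findall("foo", b"foobar")
    mixed_types_ok = True
except TypeError:
    mixed_types_ok = False
```

On Python 3, mixed_types_ok ends up False, while find_all_text("foo", b"foobar") returns the same matches as it does for the str input "foobar".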
[Impala-ASF-CR] IMPALA-9424: Add six to shell/ext-py
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15293 ) Change subject: IMPALA-9424: Add six to shell/ext-py .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15293 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I003e0008c138ee1f2c290775553d4cfc66e9b7fe Gerrit-Change-Number: 15293 Gerrit-PatchSet: 4 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 28 Feb 2020 01:37:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9424: Add six to shell/ext-py
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15293 ) Change subject: IMPALA-9424: Add six to shell/ext-py .. IMPALA-9424: Add six to shell/ext-py Change-Id: I003e0008c138ee1f2c290775553d4cfc66e9b7fe Reviewed-on: http://gerrit.cloudera.org:8080/15293 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M LICENSE.txt M bin/rat_exclude_files.txt M infra/python/deps/requirements.txt M shell/.gitignore A shell/ext-py/six-1.14.0/CHANGES A shell/ext-py/six-1.14.0/CONTRIBUTORS A shell/ext-py/six-1.14.0/LICENSE A shell/ext-py/six-1.14.0/MANIFEST.in A shell/ext-py/six-1.14.0/README.rst A shell/ext-py/six-1.14.0/setup.cfg A shell/ext-py/six-1.14.0/setup.py A shell/ext-py/six-1.14.0/six.py A shell/ext-py/six-1.14.0/test_six.py A shell/ext-py/six-1.14.0/tox.ini M shell/packaging/requirements.txt 15 files changed, 2,563 insertions(+), 2 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/15293 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I003e0008c138ee1f2c290775553d4cfc66e9b7fe Gerrit-Change-Number: 15293 Gerrit-PatchSet: 5 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15304 ) Change subject: IMPALA-9389: [DOCS] Support reading zstd text files .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/15304/4/docs/topics/impala_txtfile.xml File docs/topics/impala_txtfile.xml: http://gerrit.cloudera.org:8080/#/c/15304/4/docs/topics/impala_txtfile.xml@633 PS4, Line 633: Using bzip2, gzip, Snappy-Compressed, or zstd Text Files I saw the other code review https://gerrit.cloudera.org/c/15310/, do we need to add deflate here as well? http://gerrit.cloudera.org:8080/#/c/15304/4/docs/topics/impala_txtfile.xml@653 PS4, Line 653: or zstd-compressed text file is processed, the node doing the : work reads the entire file into memory and then decompresses it. Therefore, the node must : have enough memory to hold both the compressed and uncompressed data from the text file For text zstd decompression, we're using streaming, which doesn't load the whole file at once. It decompresses as it reads. Note that this is not true for Parquet: we're still using block decompression for Parquet files. -- To view, visit http://gerrit.cloudera.org:8080/15304 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce Gerrit-Change-Number: 15304 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Fri, 28 Feb 2020 00:00:25 + Gerrit-HasComments: Yes
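The difference between streaming decompression and reading the entire file into memory can be illustrated with a short Python sketch. It uses zlib from the standard library as a stand-in for zstd, and it is not Impala's decompressor code; the function name and the chunk size are assumptions chosen for the example.

```python
import io
import zlib


def stream_decompress(compressed_stream, chunk_size=64 * 1024):
    """Yield decompressed pieces while reading the input in small chunks.

    Only one compressed chunk and its decompressed output are held at a
    time, so the whole file never has to be loaded into memory at once.
    """
    decompressor = zlib.decompressobj()
    while True:
        chunk = compressed_stream.read(chunk_size)
        if not chunk:
            break
        data = decompressor.decompress(chunk)
        if data:
            yield data
    # Emit anything still buffered inside the decompressor.
    tail = decompressor.flush()
    if tail:
        yield tail
```

A consumer can process each yielded piece as it arrives, for example by iterating over stream_decompress(io.BytesIO(blob)), instead of waiting for the complete decompressed file.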
[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15304 ) Change subject: IMPALA-9389: [DOCS] Support reading zstd text files .. Patch Set 4: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/546/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/15304 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce Gerrit-Change-Number: 15304 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Thu, 27 Feb 2020 22:53:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9431 [DOCS] Remove Deflate not supported for text files
Abhishek Rawat has posted comments on this change. ( http://gerrit.cloudera.org:8080/15310 ) Change subject: IMPALA-9431 [DOCS] Remove Deflate not supported for text files .. Patch Set 1: (2 comments) We should also update text specific documentation page: https://impala.apache.org/docs/build/html/topics/impala_txtfile.html http://gerrit.cloudera.org:8080/#/c/15310/1/docs/topics/impala_file_formats.xml File docs/topics/impala_file_formats.xml: http://gerrit.cloudera.org:8080/#/c/15310/1/docs/topics/impala_file_formats.xml@a280 PS1, Line 280: Instead of removing we could probably add: Supported for Text, Avro, RC and Sequence files. http://gerrit.cloudera.org:8080/#/c/15310/1/docs/topics/impala_file_formats.xml@151 PS1, Line 151: LZO, gzip, bzip2, Snappy I think we should add Deflate here. -- To view, visit http://gerrit.cloudera.org:8080/15310 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e1205e4e408f2c20fd8642cccd6c74e7ba9eb40 Gerrit-Change-Number: 15310 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 27 Feb 2020 22:17:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15132 ) Change subject: IMPALA-3343: Make impala-shell compatible with python 3. .. Patch Set 15: (1 comment) http://gerrit.cloudera.org:8080/#/c/15132/15/tests/shell/test_shell_commandline.py File tests/shell/test_shell_commandline.py: http://gerrit.cloudera.org:8080/#/c/15132/15/tests/shell/test_shell_commandline.py@485 PS15, Line 485: if SHELL_IS_PYTHON_2: > The primary assert happens above this line, regardless of python version. I got confused by the scenario, but I checked this out and played around and understand now. I think this regression is kinda bad, this would affect anyone using non-ascii characters with HS2. I think if we don't fix it it's important to check that the fallback logic also works for HS2, since that's the behaviour in this patchset (so the test shouldn't be skipped either way). FWIW I played around to understand and it looks like the issue on python2 is fixed by this, at least when I'm running impala-shell.sh +column_names = [cname.decode('utf8') for cname in self.imp_client.get_column_names(self.last_query_handle)] -column_names = self.imp_client.get_column_names(self.last_query_handle) impala-shell.sh -q 'select "?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?,"' -- To view, visit http://gerrit.cloudera.org:8080/15132 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462 Gerrit-Change-Number: 15132 Gerrit-PatchSet: 15 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 27 Feb 2020 22:11:37 + Gerrit-HasComments: Yes
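The one-line fix quoted in the comment above decodes every column name with utf8. A slightly more defensive variant, sketched here in Python, decodes only values that are actually bytes, so the same helper is safe whether the client returns bytes or text. The helper names to_text and decode_column_names are hypothetical and do not come from the patch.

```python
def to_text(value, encoding="utf-8"):
    """Hypothetical helper: return text, decoding only when given bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def decode_column_names(raw_names):
    """Normalize a list of column names that may mix bytes and text."""
    return [to_text(name) for name in raw_names]
```

Given the UTF-8 bytes for "café" and the plain string "id", decode_column_names returns two text values, which avoids comparing or concatenating bytes with text later on.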
[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files
Hello Andrew Sherman, Abhishek Rawat, Xiaomeng Zhang, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15304 to look at the new patch set (#4). Change subject: IMPALA-9389: [DOCS] Support reading zstd text files .. IMPALA-9389: [DOCS] Support reading zstd text files In impala_txtfile.xml: - Line 650, changed to suggested "Impala can read compressed ..." - Line 676, zstd to .zst In impala_file_formats.xml: - Line 315, removed misleading sentence leaving "For Parquet and text files only" Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce --- M docs/topics/impala_file_formats.xml M docs/topics/impala_txtfile.xml 2 files changed, 26 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/15304/4 -- To view, visit http://gerrit.cloudera.org:8080/15304 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce Gerrit-Change-Number: 15304 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15304 ) Change subject: IMPALA-9389: [DOCS] Support reading zstd text files .. Patch Set 4: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/546/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/15304 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce Gerrit-Change-Number: 15304 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Thu, 27 Feb 2020 22:01:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15284 ) Change subject: IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5360/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15284 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4153968551acd58b25c7923c2ebf75ee29a7e76b Gerrit-Change-Number: 15284 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Thu, 27 Feb 2020 21:45:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6689: Speed up point lookup for Kudu primary key
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15250 ) Change subject: IMPALA-6689: Speed up point lookup for Kudu primary key .. Patch Set 13: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4631cd4d1a528a1152b5cdcb268426f2ba1a0c08 Gerrit-Change-Number: 15250 Gerrit-PatchSet: 13 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 27 Feb 2020 21:36:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6689: Speed up point lookup for Kudu primary key
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15250 ) Change subject: IMPALA-6689: Speed up point lookup for Kudu primary key .. Patch Set 13: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5425/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4631cd4d1a528a1152b5cdcb268426f2ba1a0c08 Gerrit-Change-Number: 15250 Gerrit-PatchSet: 13 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 27 Feb 2020 21:37:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15284 ) Change subject: IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5424/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/15284 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4153968551acd58b25c7923c2ebf75ee29a7e76b Gerrit-Change-Number: 15284 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Thu, 27 Feb 2020 21:35:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/15284 ) Change subject: IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/15284/4/tests/shell/test_shell_commandline.py File tests/shell/test_shell_commandline.py: http://gerrit.cloudera.org:8080/#/c/15284/4/tests/shell/test_shell_commandline.py@859 PS4, Line 859: 20 After fixing the above error so that this test is actually running again, it started failing in Jenkins runs. As far as I can tell, the problem is really just that the shell is genuinely slower than this timeout for large query files. For comparison, it takes about 7 seconds running locally for me. I dug into it, and of that about 4 seconds are spent in parse_query_text, which uses some sqlparse functions to split the query text into multiple queries. About 2 seconds are spent in sqlparse.split() and another 2 seconds are spent in strip_comments(). That seems like an unreasonable overhead for what it's accomplishing. For the sake of getting this patch in, I would prefer to just extend the timeout for now, but we should probably think about how we can improve this, since otherwise impala-shell has pretty bad performance for even moderately large queries. I filed IMPALA-9436 -- To view, visit http://gerrit.cloudera.org:8080/15284 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4153968551acd58b25c7923c2ebf75ee29a7e76b Gerrit-Change-Number: 15284 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Thu, 27 Feb 2020 21:34:41 + Gerrit-HasComments: Yes
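A per-stage timing breakdown like the one described above can be collected with a small harness. The Python sketch below uses only the standard library; the two stage functions are deliberately naive stand-ins, not sqlparse.split() or the shell's strip_comments(), and every name in it is hypothetical.

```python
import time


def time_stages(stages, text):
    """Run each (name, func) stage in order and record its wall-clock time."""
    timings = {}
    for name, func in stages:
        start = time.perf_counter()
        text = func(text)
        timings[name] = time.perf_counter() - start
    return text, timings


def naive_strip_comments(text):
    # Stand-in stage: drop whole lines that start with a SQL "--" comment.
    kept = [line for line in text.splitlines()
            if not line.lstrip().startswith("--")]
    return "\n".join(kept)


def naive_split(text):
    # Stand-in stage: split on ";" and drop empty statements.
    return [stmt.strip() for stmt in text.split(";") if stmt.strip()]


stages = [("strip_comments", naive_strip_comments), ("split", naive_split)]
```

Calling time_stages(stages, query_text) returns the parsed statements plus a dict of seconds per stage, which makes it easy to see which step dominates as the query file grows.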
[Impala-ASF-CR] IMPALA-9424: Add six to shell/ext-py
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15293 ) Change subject: IMPALA-9424: Add six to shell/ext-py .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5423/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15293 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I003e0008c138ee1f2c290775553d4cfc66e9b7fe Gerrit-Change-Number: 15293 Gerrit-PatchSet: 4 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 27 Feb 2020 21:10:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9424: Add six to shell/ext-py
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15293 ) Change subject: IMPALA-9424: Add six to shell/ext-py .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15293 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I003e0008c138ee1f2c290775553d4cfc66e9b7fe Gerrit-Change-Number: 15293 Gerrit-PatchSet: 4 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 27 Feb 2020 21:10:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/15284 ) Change subject: IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/15284/1/shell/make_shell_tarball.sh File shell/make_shell_tarball.sh: http://gerrit.cloudera.org:8080/#/c/15284/1/shell/make_shell_tarball.sh@123 PS1, Line 123: cp ${SHELL_HOME}/shell_exceptions.py ${TARBALL_ROOT}/lib > "${SHELL_HOME}/exceptions.py" Done http://gerrit.cloudera.org:8080/#/c/15284/1/shell/packaging/make_python_package.sh File shell/packaging/make_python_package.sh: http://gerrit.cloudera.org:8080/#/c/15284/1/shell/packaging/make_python_package.sh@59 PS1, Line 59: cp "${SHELL_HOME}/shell_exceptions.py" "${MODULE_LIB_DIR}" > "${SHELL_HOME}/exceptions.py" Done http://gerrit.cloudera.org:8080/#/c/15284/1/shell/util.py File shell/util.py: http://gerrit.cloudera.org:8080/#/c/15284/1/shell/util.py@1 PS1, Line 1: > Well, now that you've started this, I think this file should actually be ca So I ran into a problem with this because 'exceptions' is a built in python module already, so you can't do something like "from exceptions import RPCException". I solved it by naming the file "shell_exceptions" instead -- To view, visit http://gerrit.cloudera.org:8080/15284 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4153968551acd58b25c7923c2ebf75ee29a7e76b Gerrit-Change-Number: 15284 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Thu, 27 Feb 2020 20:59:17 + Gerrit-HasComments: Yes
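One way to spot this kind of name collision before picking a file name is to check the interpreter's list of built-in module names. The sketch below is an illustration only; the helper name is hypothetical. On Python 2 the name "exceptions" is expected to appear in that list, whereas Python 3 no longer ships an exceptions module, so the result depends on the interpreter.

```python
import sys


def shadows_builtin_module(module_name):
    """Hypothetical helper: True if the name is a module compiled into the interpreter.

    A local file with such a name cannot be imported by that name through
    the normal sys.path lookup, because the built-in module is found first.
    """
    return module_name in sys.builtin_module_names
```

For example, shadows_builtin_module("sys") is True on any interpreter, while a project-specific name such as "shell_exceptions" is not in the list.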
[Impala-ASF-CR] IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header
Hello David Knupp, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15284 to look at the new patch set (#4). Change subject: IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header .. IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header The 'Expect: 100-continue' http header allows http clients to send only the headers for their request, get a confirmation back from the server that the headers are valid, and only then send the body of the request, avoiding the overhead of sending large requests that will ultimately fail. This patch adds support for this in the HS2 HTTP server by having THttpServer look for the header and, if it's present and the request is validated, return a '100 Continue' response before reading the body of the request. It also adds support for using this header on large requests sent by impala-shell. Testing: - This case is covered by the existing test_large_sql, however that test was previously broken and passing spuriously. This patch fixes the test. 
Change-Id: I4153968551acd58b25c7923c2ebf75ee29a7e76b --- M be/src/transport/THttpServer.cpp M be/src/transport/THttpTransport.cpp M be/src/transport/THttpTransport.h M shell/THttpClient.py M shell/impala_client.py M shell/impala_shell.py M shell/make_shell_tarball.sh M shell/packaging/make_python_package.sh A shell/shell_exceptions.py M tests/shell/test_shell_commandline.py 10 files changed, 99 insertions(+), 52 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/15284/4 -- To view, visit http://gerrit.cloudera.org:8080/15284 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4153968551acd58b25c7923c2ebf75ee29a7e76b Gerrit-Change-Number: 15284 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall
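The server-side decision described in the IMPALA-9414 commit message can be sketched in a few lines of Python. This is not the THttpServer.cpp implementation; the function names, the simplified header parsing, and the request_is_valid flag (standing in for whatever validation the server performs) are assumptions made for the example.

```python
CONTINUE_RESPONSE = b"HTTP/1.1 100 Continue\r\n\r\n"


def parse_headers(raw_header_block):
    """Parse a raw HTTP header block into a dict with lower-cased names."""
    headers = {}
    # The first line is the request line, so skip it.
    for line in raw_header_block.split("\r\n")[1:]:
        if not line:
            continue
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def interim_response(headers, request_is_valid):
    """Return the interim reply to send before reading the body, or None.

    The '100 Continue' reply is sent only when the client asked for it and
    the headers passed validation; otherwise no interim reply is produced.
    """
    wants_continue = headers.get("expect", "").lower() == "100-continue"
    if wants_continue and request_is_valid:
        return CONTINUE_RESPONSE
    return None
```

A client that sends "Expect: 100-continue" can then wait for this interim reply before transmitting a large body, which is the saving the commit message describes.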
[Impala-ASF-CR] IMPALA-9431 [DOCS] Remove Deflate not supported for text files
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15310 ) Change subject: IMPALA-9431 [DOCS] Remove Deflate not supported for text files .. Patch Set 1: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/545/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/15310 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e1205e4e408f2c20fd8642cccd6c74e7ba9eb40 Gerrit-Change-Number: 15310 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 27 Feb 2020 20:03:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9228: ORC scanner reads rows into scratch batch
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15104 ) Change subject: IMPALA-9228: ORC scanner reads rows into scratch batch .. Patch Set 4: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/5359/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/15104 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56db0325dee283d73742ebbae412d19693fac0ca Gerrit-Change-Number: 15104 Gerrit-PatchSet: 4 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 27 Feb 2020 19:58:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9431 [DOCS] Remove Deflate not supported for text files
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15310 ) Change subject: IMPALA-9431 [DOCS] Remove Deflate not supported for text files .. Patch Set 1: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/545/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/15310 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e1205e4e408f2c20fd8642cccd6c74e7ba9eb40 Gerrit-Change-Number: 15310 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 27 Feb 2020 19:55:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9431 [DOCS] Remove Deflate not supported for text files
kh...@cloudera.com has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15310 Change subject: IMPALA-9431 [DOCS] Remove Deflate not supported for text files .. IMPALA-9431 [DOCS] Remove Deflate not supported for text files Removed "Not supported for text files" Change-Id: I9e1205e4e408f2c20fd8642cccd6c74e7ba9eb40 --- M docs/topics/impala_file_formats.xml 1 file changed, 0 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/15310/1 -- To view, visit http://gerrit.cloudera.org:8080/15310 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I9e1205e4e408f2c20fd8642cccd6c74e7ba9eb40 Gerrit-Change-Number: 15310 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward
[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files
Abhishek Rawat has posted comments on this change. ( http://gerrit.cloudera.org:8080/15304 ) Change subject: IMPALA-9389: [DOCS] Support reading zstd text files .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/15304/3/docs/topics/impala_file_formats.xml File docs/topics/impala_file_formats.xml: http://gerrit.cloudera.org:8080/#/c/15304/3/docs/topics/impala_file_formats.xml@315 PS3, Line 315: For Parquet and text files only. Impala can read zstd-encoded text files written by Hive "Impala can read zstd-encoded text files written by Hive (streaming) or compressed by the zStandard library (block)." I think this is slightly misleading since the above is entirely true for zstd compressed text files. Also streaming/block are internal details which we don't necessarily have to put in the documentation. For zstd compressed Parquet files, we support both reading and writing. This statement would be misleading since it seems we only support reading. Also, compressing parquet files requires page level compression and so if someone uses the zstd lib to compress a parquet file (and not doing page level compression) Impala cannot read/uncompress it. IMPALA-9201 is a related JIRA. I think I am happy with just having the following here: "For Parquet and text files only" In other parts of the documentation we already cover the fact that Impala can only read text compressed files and this is no different for the new zstd support. And it can read/write parquet compressed files. http://gerrit.cloudera.org:8080/#/c/15304/3/docs/topics/impala_txtfile.xml File docs/topics/impala_txtfile.xml: http://gerrit.cloudera.org:8080/#/c/15304/3/docs/topics/impala_txtfile.xml@650 PS3, Line 650: capability. Impala can read zstd-encoded text files written by Hive (streaming) or compressed I don't think it is necessary to document the details such as streaming/block. It doesn't help the documentation but only raises more questions. Also, I am not sure this is only true for zstd.
I would think this is true for other "text" compression formats also. And if that is the case we probably should just add a generic statement something like this: "Impala can read compressed text files written by Hive or compressed by the standard library implementation" @Xiaomeng could you please confirm this? I think this is true for all supported text compression codecs. http://gerrit.cloudera.org:8080/#/c/15304/3/docs/topics/impala_txtfile.xml@676 PS3, Line 676: .gz, .snappy, or zstd. The extensions I think the extension is '.zst' -- To view, visit http://gerrit.cloudera.org:8080/15304 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce Gerrit-Change-Number: 15304 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Thu, 27 Feb 2020 19:43:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9228: ORC scanner reads rows into scratch batch
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15104 to look at the new patch set (#4). Change subject: IMPALA-9228: ORC scanner reads rows into scratch batch .. IMPALA-9228: ORC scanner reads rows into scratch batch For performance reasons, this change enhances the ORC scanner to populate a scratch batch in a column-by-column manner using data from the column readers. Once this is done, the Parquet code is reused to apply runtime filters and conjuncts and to populate the outgoing row batch. This approach reduces the number of virtual function calls and takes advantage of the columnar orientation of the data to enhance scan performance. Additionally, introducing the scratch batch concept also opens the door for codegen runtime filtering and applying conjuncts. Testing: - Re-ran the full test suite to verify that no regression is introduced. - Checked the performance impact by running the TPCH workload on a scale 25 database using single_node_perf_run.py. The total query runtime decreased by 0-20% depending on how scan heavy the particular query was. The more scan heavy the query is, the more performance gain I observe.
Change-Id: I56db0325dee283d73742ebbae412d19693fac0ca --- M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/impala-ir.cc M be/src/exec/CMakeLists.txt R be/src/exec/hdfs-columnar-scanner-ir.cc A be/src/exec/hdfs-columnar-scanner.cc A be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-orc-scanner.h M be/src/exec/hdfs-scanner.h M be/src/exec/orc-column-readers.cc M be/src/exec/orc-column-readers.h M be/src/exec/parquet/CMakeLists.txt M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h R be/src/exec/scratch-tuple-batch.h 15 files changed, 464 insertions(+), 140 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/15104/4 -- To view, visit http://gerrit.cloudera.org:8080/15104 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I56db0325dee283d73742ebbae412d19693fac0ca Gerrit-Change-Number: 15104 Gerrit-PatchSet: 4 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
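The two-phase flow described in the IMPALA-9228 commit message above (fill a scratch batch one column at a time, then filter it into the outgoing batch) can be sketched roughly as follows. This is a simplified illustration, not the scanner code; fill_scratch_batch, apply_conjuncts, and the dictionary-per-row layout are hypothetical stand-ins for the real tuple memory layout.

```python
# Illustrative sketch only; the real scanner writes into tuple memory,
# not Python dictionaries, and these function names are assumptions.

def fill_scratch_batch(columns, num_rows):
    """Populate a scratch batch column by column.

    columns: mapping of column name -> list of decoded values, standing in
    for the output of one column reader each.
    """
    scratch = [dict() for _ in range(num_rows)]
    for name, values in columns.items():
        # One tight loop per column: a single column reader is consulted
        # for all rows before moving on to the next column.
        for row_idx in range(num_rows):
            scratch[row_idx][name] = values[row_idx]
    return scratch


def apply_conjuncts(scratch, conjuncts):
    """Keep only the rows that pass every predicate (filter or conjunct)."""
    return [row for row in scratch if all(pred(row) for pred in conjuncts)]


columns = {"id": [1, 2, 3, 4], "qty": [10, 0, 7, 0]}
scratch = fill_scratch_batch(columns, num_rows=4)
output_batch = apply_conjuncts(scratch, [lambda row: row["qty"] > 0])
```

The point of the split is that the per-column loop does not need to dispatch on column type for every value, and the filtering step can be shared between file formats.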
[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15304 ) Change subject: IMPALA-9389: [DOCS] Support reading zstd text files .. Patch Set 3: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/544/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/15304 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce Gerrit-Change-Number: 15304 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Thu, 27 Feb 2020 18:56:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15304 ) Change subject: IMPALA-9389: [DOCS] Support reading zstd text files .. Patch Set 3: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/544/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/15304 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce Gerrit-Change-Number: 15304 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Thu, 27 Feb 2020 18:47:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files
Hello Andrew Sherman, Abhishek Rawat, Xiaomeng Zhang, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15304 to look at the new patch set (#3). Change subject: IMPALA-9389: [DOCS] Support reading zstd text files .. IMPALA-9389: [DOCS] Support reading zstd text files In impala_txtfile.xml: - Line 633 (was 644), added zstd. - Line 650, described zstd file that can be read. - In table Text Format Support in Impala, added zstd to column Compression Codecs In impala_file_formats.xml: - In the table, added zstd to column Compression Codecs for file type Text - Updated definition list for Zstd term to cover text files. Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce --- M docs/topics/impala_file_formats.xml M docs/topics/impala_txtfile.xml 2 files changed, 27 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/15304/3 -- To view, visit http://gerrit.cloudera.org:8080/15304 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce Gerrit-Change-Number: 15304 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] PoC: IMPALA-9228: ORC scanner reads rows into scratch batch
Gabor Kaszab has restored this change. ( http://gerrit.cloudera.org:8080/15104 ) Change subject: PoC: IMPALA-9228: ORC scanner reads rows into scratch batch .. Restored -- To view, visit http://gerrit.cloudera.org:8080/15104 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: restore Gerrit-Change-Id: I56db0325dee283d73742ebbae412d19693fac0ca Gerrit-Change-Number: 15104 Gerrit-PatchSet: 2 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-6663: Expose current DDL metrics on WebUI
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13806 ) Change subject: IMPALA-6663: Expose current DDL metrics on WebUI .. Patch Set 15: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5421/ -- To view, visit http://gerrit.cloudera.org:8080/13806 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0ed76f134bad6d3b3d4dce132365a53a01e9512a Gerrit-Change-Number: 13806 Gerrit-PatchSet: 15 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 27 Feb 2020 18:15:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/15304 ) Change subject: IMPALA-9389: [DOCS] Support reading zstd text files .. Patch Set 2: (1 comment) I think (Abhishek and Xiaomeng to confirm) that zstd is most easily explained as being another form of text compression, like gzip, bzip2, or Snappy. http://gerrit.cloudera.org:8080/#/c/15304/2/docs/topics/impala_txtfile.xml File docs/topics/impala_txtfile.xml: http://gerrit.cloudera.org:8080/#/c/15304/2/docs/topics/impala_txtfile.xml@644 PS2, Line 644: Using gzip, bzip2, or Snappy-Compressed Text Files Maybe this is the right place to add doc for zstd? I think zstd is like these other formats. -- To view, visit http://gerrit.cloudera.org:8080/15304 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce Gerrit-Change-Number: 15304 Gerrit-PatchSet: 2 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Thu, 27 Feb 2020 17:32:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8690 (prep 3): Factor out common code for cache implementations
Joe McDonnell has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15179 ) Change subject: IMPALA-8690 (prep 3): Factor out common code for cache implementations .. IMPALA-8690 (prep 3): Factor out common code for cache implementations The existing cache implementation handles LRU/FIFO eviction algorithms, but it also implements several components that are useful for other cache implementations. Specifically, there is a simple hash table (HandleTable) and a simple sharding implementation (ShardedCache). These can be reused across other eviction algorithms by making them generic. This pulls them out of be/src/util/cache/cache.cc and into a new be/src/util/cache/cache-internal.h file. To make the HandleTable generic, this introduces a HandleBase class that contains common code between the implementations (such as the key, the value, the hash, etc). The HandleTable works on this base class, and the RLHandle now derives from HandleBase. To support this, Allocate/Free needs to treat the handle as an object (calling constructors/destructors) rather than treating it like a chunk of memory (or simple struct). ShardedCache is made generic by having cache implementations derive from a base CacheShard class that defines the appropriate methods needed by the sharding class. This is purely an interface, and the base class defines no functions. The existing CacheShard is renamed to RLCachedShard and derives from this class. 
Testing: - Release core run with a data cache enabled - ASAN core - The cache-test backend test continues to pass Change-Id: I67294244a3e8a2812f1482fe786bf7f8e6ce054e Reviewed-on: http://gerrit.cloudera.org:8080/15179 Reviewed-by: Joe McDonnell Tested-by: Impala Public Jenkins --- A be/src/util/cache/cache-internal.h M be/src/util/cache/cache.cc M bin/rat_exclude_files.txt 3 files changed, 426 insertions(+), 286 deletions(-) Approvals: Joe McDonnell: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/15179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I67294244a3e8a2812f1482fe786bf7f8e6ce054e Gerrit-Change-Number: 15179 Gerrit-PatchSet: 11 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall
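The refactoring described in the IMPALA-8690 (prep 3) commit message above, where a sharding wrapper works against an abstract shard interface so that different eviction policies can plug in, can be sketched in a few lines. This is an analogy in Python, not the C++ from cache-internal.h; the LruShard class and the factory-based constructor are assumptions made for illustration.

```python
# Illustrative sketch only; class names other than CacheShard and
# ShardedCache are hypothetical, and the real code is C++.
from abc import ABC, abstractmethod
from collections import OrderedDict


class CacheShard(ABC):
    """Pure interface: what the sharding layer needs from any shard."""

    @abstractmethod
    def insert(self, key, value):
        ...

    @abstractmethod
    def lookup(self, key):
        ...


class LruShard(CacheShard):
    """One possible eviction policy; others would implement the same interface."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._entries = OrderedDict()

    def insert(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        while len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # Evict least recently used.

    def lookup(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # Mark as recently used.
        return self._entries[key]


class ShardedCache:
    """Routes each key to one shard; knows nothing about eviction."""

    def __init__(self, make_shard, num_shards):
        self._shards = [make_shard() for _ in range(num_shards)]

    def _shard_for(self, key):
        return self._shards[hash(key) % len(self._shards)]

    def insert(self, key, value):
        self._shard_for(key).insert(key, value)

    def lookup(self, key):
        return self._shard_for(key).lookup(key)
```

With this shape, adding a new eviction algorithm means writing a new CacheShard subclass and passing a different factory to ShardedCache.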
[Impala-ASF-CR] IMPALA-8674: fix bug where REMOTE runtime filter always marked disabled
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15308 ) Change subject: IMPALA-8674: fix bug where REMOTE runtime filter always marked disabled .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5358/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15308 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I82a5a776103abd0a6d73336bebc65e22b4e13fef Gerrit-Change-Number: 15308 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 27 Feb 2020 16:43:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8674: fix bug where REMOTE runtime filter always marked disabled
Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15308 Change subject: IMPALA-8674: fix bug where REMOTE runtime filter always marked disabled .. IMPALA-8674: fix bug where REMOTE runtime filter always marked disabled When a runtime filter has a remote target, the coordinator disables the FilterState upon arrival of the last filter update to prevent further updates to that filter. As a consequence, such a runtime filter is always displayed as disabled in the runtime profile (the Enabled column is false in the Final filter table), when in reality the runtime filter is complete and was successfully published to all remote targets. The Enabled column should correctly distinguish between a failed runtime filter and a complete runtime filter. To do so, we add an all_publish_complete_ flag to the FilterState class and set it to true upon completion of the final runtime filter publish. If all_publish_complete_ is true, the runtime filter is marked as enabled. Testing: - Add row regex in runtime_filters.test, query 6, to verify REMOTE runtime filter is marked as enabled in final filter table - Run and pass test_runtime_filters.py - Run and pass core tests Change-Id: I82a5a776103abd0a6d73336bebc65e22b4e13fef --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-filter-state.h M be/src/runtime/coordinator.cc M testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test 4 files changed, 16 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/15308/1 -- To view, visit http://gerrit.cloudera.org:8080/15308 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I82a5a776103abd0a6d73336bebc65e22b4e13fef Gerrit-Change-Number: 15308 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto
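The distinction drawn in the IMPALA-8674 commit message above, between a filter that was disabled because it failed and one that was disabled only because it is finished, can be sketched as a small state object. This is an illustrative model rather than the coordinator code; apart from the all_publish_complete flag named in the commit message, the method and field names are hypothetical.

```python
# Illustrative sketch only; method names and the update counter are
# assumptions, not the FilterState class from coordinator-filter-state.h.

class FilterState:
    def __init__(self, pending_updates):
        self.pending_updates = pending_updates
        self.disabled = False
        self.all_publish_complete = False

    def receive_update(self):
        """Record one update; return True if it was the last one expected."""
        if self.disabled:
            return False  # Late or duplicate updates are ignored.
        self.pending_updates -= 1
        if self.pending_updates > 0:
            return False
        # Last update arrived: stop accepting further updates.
        self.disabled = True
        return True

    def mark_publish_complete(self):
        """Called once the final filter has been sent to all remote targets."""
        self.all_publish_complete = True

    def disable_on_failure(self):
        self.disabled = True

    def shown_as_enabled(self):
        """Value for the profile's Enabled column under this sketch."""
        return self.all_publish_complete or not self.disabled
```

Before the fix, the equivalent of shown_as_enabled looked only at the disabled flag, so a successfully published remote filter was indistinguishable from a failed one.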
[Impala-ASF-CR] IMPALA-6689: Speed up point lookup for Kudu primary key
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15250 ) Change subject: IMPALA-6689: Speed up point lookup for Kudu primary key .. Patch Set 12: Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5420/ -- To view, visit http://gerrit.cloudera.org:8080/15250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4631cd4d1a528a1152b5cdcb268426f2ba1a0c08 Gerrit-Change-Number: 15250 Gerrit-PatchSet: 12 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 27 Feb 2020 15:58:03 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 16: (3 comments) The code LGTM, only found a few nits. http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/codegen/codegen-fn-ptr.h File be/src/codegen/codegen-fn-ptr.h: http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/codegen/codegen-fn-ptr.h@38 PS16, Line 38: Decide memory order. I think mem_order_relaxed should be enough as we : /// only need atomicity and the pointers can be set independently of each other. nit: I think you can delete this part http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/exec/hdfs-scanner.h File be/src/exec/hdfs-scanner.h: http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/exec/hdfs-scanner.h@465 PS16, Line 465: / nit: we usually use /// for doc comments. http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/runtime/fragment-instance-state.cc File be/src/runtime/fragment-instance-state.cc: http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/runtime/fragment-instance-state.cc@358 PS16, Line 358: / nit: too many / -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 16 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 27 Feb 2020 14:02:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6663: Expose current DDL metrics on WebUI
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13806 ) Change subject: IMPALA-6663: Expose current DDL metrics on WebUI .. Patch Set 15: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5421/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/13806 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0ed76f134bad6d3b3d4dce132365a53a01e9512a Gerrit-Change-Number: 13806 Gerrit-PatchSet: 15 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 27 Feb 2020 13:47:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6663: Expose current DDL metrics on WebUI
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13806 ) Change subject: IMPALA-6663: Expose current DDL metrics on WebUI .. Patch Set 15: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13806 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0ed76f134bad6d3b3d4dce132365a53a01e9512a Gerrit-Change-Number: 13806 Gerrit-PatchSet: 15 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 27 Feb 2020 13:47:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6663: Expose current DDL metrics on WebUI
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/13806 ) Change subject: IMPALA-6663: Expose current DDL metrics on WebUI .. Patch Set 14: Code-Review+2 Thanks for applying the changes! -- To view, visit http://gerrit.cloudera.org:8080/13806 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0ed76f134bad6d3b3d4dce132365a53a01e9512a Gerrit-Change-Number: 13806 Gerrit-PatchSet: 14 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 27 Feb 2020 13:47:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/15051 ) Change subject: IMPALA-9226: Improve string allocations of the ORC scanner .. Patch Set 14: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f Gerrit-Change-Number: 15051 Gerrit-PatchSet: 14 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 27 Feb 2020 13:20:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15051 ) Change subject: IMPALA-9226: Improve string allocations of the ORC scanner .. Patch Set 14: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5357/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f Gerrit-Change-Number: 15051 Gerrit-PatchSet: 14 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 27 Feb 2020 11:48:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6689: Speed up point lookup for Kudu primary key
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/15250 ) Change subject: IMPALA-6689: Speed up point lookup for Kudu primary key .. Patch Set 12: Restarted the build as the failure was unrelated. -- To view, visit http://gerrit.cloudera.org:8080/15250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4631cd4d1a528a1152b5cdcb268426f2ba1a0c08 Gerrit-Change-Number: 15250 Gerrit-PatchSet: 12 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 27 Feb 2020 11:31:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6689: Speed up point lookup for Kudu primary key
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15250 ) Change subject: IMPALA-6689: Speed up point lookup for Kudu primary key .. Patch Set 12: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5420/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4631cd4d1a528a1152b5cdcb268426f2ba1a0c08 Gerrit-Change-Number: 15250 Gerrit-PatchSet: 12 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 27 Feb 2020 11:30:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9426 Download Python dependencies even skipping bootstrap toolchain
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15297 ) Change subject: IMPALA-9426 Download Python dependencies even skipping bootstrap toolchain .. Patch Set 1: Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5419/ -- To view, visit http://gerrit.cloudera.org:8080/15297 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I012314793ffb521001951ab7ec3d7a3ba737c405 Gerrit-Change-Number: 15297 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 27 Feb 2020 11:23:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner
Norbert Luksa has posted comments on this change. ( http://gerrit.cloudera.org:8080/15051 ) Change subject: IMPALA-9226: Improve string allocations of the ORC scanner .. Patch Set 14: (2 comments) http://gerrit.cloudera.org:8080/#/c/15051/12/be/src/exec/orc-column-readers.cc File be/src/exec/orc-column-readers.cc: http://gerrit.cloudera.org:8080/#/c/15051/12/be/src/exec/orc-column-readers.cc@180 PS12, Line 180: >( > I think that >= offsets.size() - 1 is needed, because of the offsets[index Done http://gerrit.cloudera.org:8080/#/c/15051/12/be/src/exec/orc-column-readers.cc@184 PS12, Line 184: src_ptr = blob_ + offsets[index]; : src_len = offsets[index + 1] - offsets[index]; > I think that we cannot trust completely in the length values at the moment Thanks Csaba for investigating, uploaded a PS for the above check, will open ORC jiras for both of your comments. -- To view, visit http://gerrit.cloudera.org:8080/15051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f Gerrit-Change-Number: 15051 Gerrit-PatchSet: 14 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 27 Feb 2020 11:02:05 + Gerrit-HasComments: Yes
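The bounds check discussed in the review comments above (rejecting an index that is >= offsets.size() - 1 because the code reads offsets[index + 1]) can be illustrated with a short sketch. This is a Python analogy under the assumption that the offsets array holds one more entry than there are strings; string_at is a hypothetical helper, not a function in orc-column-readers.cc.

```python
# Illustrative sketch only; assumes offsets has len(strings) + 1 entries,
# so string i occupies blob[offsets[i]:offsets[i + 1]].

def string_at(blob, offsets, index):
    """Return the index-th string stored back to back in blob."""
    # The last valid index is len(offsets) - 2, because offsets[index + 1]
    # is read below.
    if index < 0 or index >= len(offsets) - 1:
        raise IndexError("string index out of range: %d" % index)
    start, end = offsets[index], offsets[index + 1]
    # Do not trust the offsets blindly: a corrupt file could point outside
    # the blob or yield a negative length.
    if not (0 <= start <= end <= len(blob)):
        raise ValueError("corrupt offsets for string %d" % index)
    return blob[start:end]


blob = b"foobarbaz"
offsets = [0, 3, 6, 9]
middle = string_at(blob, offsets, 1)
```

Checking against len(offsets) - 1 rather than len(offsets) is what prevents the read of offsets[index + 1] from running off the end of the array.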
[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner
Norbert Luksa has uploaded a new patch set (#14). ( http://gerrit.cloudera.org:8080/15051 ) Change subject: IMPALA-9226: Improve string allocations of the ORC scanner .. IMPALA-9226: Improve string allocations of the ORC scanner

Currently the OrcColumnReader copies values from the orc::StringVectorBatch one-by-one. Since ORC 1.6, the blob which contains the pointed values is moved to the StringVectorBatch, so we can copy it.

Besides the above improvement, this commit also enables the LazyEncoding option for the ORC reader. This way, for stripes with DICTIONARY_ENCODING[_V2], EncodedStringVectorBatch contains the data in a dictionaryBlob from which the data can be acquired with the given indices and lengths.

Tests:
* Ran ORC scanner tests (query_tests/test_scanners.py::TestOrc) and tpch query tests.
* Tested performance on the tpch.lineitem table with scale=25, running queries that select the min of string columns. Some results:

col_name       | encoding   | before | after  | speedup
=======================================================
l_comment      | DIRECT     | 16.42s | 14.38s | 14%
l_shipinstruct | DICTIONARY | 5.26s  | 3.80s  | 32%
l_commitdate   | DICTIONARY | 5.46s  | 5.19s  | 5%
all string col | BOTH       | 39.06s | 32.18s | 21%

The queries were run on a desktop PC with MT_DOP and NUM_NODES set to 1.
* Also ran TPC-H queries on the TPC-H benchmark, where some queries' runtime improved by around 10-15%, while there was no regression for the others.
Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f --- M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-orc-scanner.h M be/src/exec/orc-column-readers.cc M be/src/exec/orc-column-readers.h 4 files changed, 135 insertions(+), 42 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/15051/14 -- To view, visit http://gerrit.cloudera.org:8080/15051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f Gerrit-Change-Number: 15051 Gerrit-PatchSet: 14 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 16: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5356/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 16 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 27 Feb 2020 09:54:02 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 16: Rebased and conflicts resolved. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 16 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 27 Feb 2020 09:08:16 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#16). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the function pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks.
Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_queries.py 46 files changed, 521 insertions(+), 229 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/16 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF 
Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 16 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
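The mechanism described in the commit message (atomic function pointers that start as nullptr and fall back to the interpreted path until the codegen thread sets them) might look roughly like the sketch below. It is an illustration under assumptions, not the contents of be/src/codegen/codegen-fn-ptr.h: CodegenFnPtrSketch, AddFn, InterpretedAdd, CodegendAdd and Add are hypothetical names, and the acquire/release memory ordering is an assumption the commit message does not state.

```cpp
#include <atomic>

// Hypothetical stand-in for the CodegenFnPtr wrapper named in the commit
// message; the real class may differ.
template <typename FnPtr>
class CodegenFnPtrSketch {
 public:
  // Returns nullptr until the codegen thread has published a function.
  FnPtr load() const { return ptr_.load(std::memory_order_acquire); }
  // Called by the codegen thread once the compiled function is ready.
  void store(FnPtr fn) { ptr_.store(fn, std::memory_order_release); }

 private:
  std::atomic<FnPtr> ptr_{nullptr};
};

using AddFn = int (*)(int, int);

// Interpreted implementation, always available.
int InterpretedAdd(int a, int b) { return a + b; }
// Placeholder for a function that codegen would produce.
int CodegendAdd(int a, int b) { return a + b; }

// Caller side: use the codegen'd function if it is set, otherwise fall back
// to the interpreted one.
int Add(const CodegenFnPtrSketch<AddFn>& fn_ptr, int a, int b) {
  AddFn fn = fn_ptr.load();
  return fn != nullptr ? fn(a, b) : InterpretedAdd(a, b);
}
```

Each call loads the pointer once and then uses that local copy, so a store from another thread between two calls only changes which implementation the next call picks.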
[Impala-ASF-CR] IMPALA-6360: Don't show full query statement on Impala webUI by default. Added the ‘query stmt size’ flag to impala-server.cc with default value of 250 and modified the ‘ImpalaHttpHand
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/15288 ) Change subject: IMPALA-6360: Don't show full query statement on Impala webUI by default. Added the ‘query_stmt_size’ flag to impala-server.cc with default value of 250 and modified the ‘ImpalaHttpHandler::QueryStateToJson()’ to truncate the end of the statements if they .. Patch Set 4: (4 comments) Hi Tamas, thank you for the change. Just a few nits. http://gerrit.cloudera.org:8080/#/c/15288/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15288/4//COMMIT_MSG@7 PS4, Line 7: IMPALA-6360: Don't show full query statement on Impala webUI by default. : Added the ‘query_stmt_size’ flag to impala-server.cc with default value nit: missing blank line between subject and body nit: period is not needed at the end of subject http://gerrit.cloudera.org:8080/#/c/15288/4/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/15288/4/be/src/service/impala-http-handler.cc@374 PS4, Line 374: Value stmt((FLAGS_query_stmt_size) ? (tmp_stmt.length() > FLAGS_query_stmt_size) ? : tmp_stmt.substr(0, FLAGS_query_stmt_size).append("...").c_str() : tmp_stmt.c_str() : : tmp_stmt.c_str(), document->GetAllocator()); This is a bit hard to read at first, if we would change the first ternary operator to if then the code would better document itself and the comment could be removed, similar to line 401 in this file. http://gerrit.cloudera.org:8080/#/c/15288/4/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/15288/4/be/src/service/impala-server.cc@151 PS4, Line 151: has nit: have http://gerrit.cloudera.org:8080/#/c/15288/4/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/15288/4/tests/webserver/test_web_pages.py@425 PS4, Line 425: lacus at risus bibendum, id pulvinar ligula lobortis. 
Fusce lacinia nibh in : volutpat iaculis. Cras vite dignissim ligula. Fusce sollici.Proin bibendum erat : eu libero iaculis pharetra. Duis efficitur lacus at risus bibendum, id pulvinar : ligula lobortis. Fusce lacinia nibh in volutpat iaculis. Cras vite dignissim : ligula. Fusce sollici.\ nit: missing indentation -- To view, visit http://gerrit.cloudera.org:8080/15288 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib7109a0be5d1022b4f8d6e72441cf5dc1dc42605 Gerrit-Change-Number: 15288 Gerrit-PatchSet: 4 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Thu, 27 Feb 2020 08:41:03 + Gerrit-HasComments: Yes
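The suggestion in this review to replace the nested ternary with an if could be sketched as a small helper. This is only one possible shape, not the code in impala-http-handler.cc: TruncateStmt and max_size are hypothetical names, with max_size standing in for FLAGS_query_stmt_size, and a value of 0 is taken to mean "show the full statement", as the quoted ternary implies.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns `stmt` unchanged when truncation is disabled
// (max_size == 0) or the statement is short enough; otherwise keeps the first
// max_size characters and appends "...".
std::string TruncateStmt(const std::string& stmt, size_t max_size) {
  if (max_size == 0 || stmt.length() <= max_size) return stmt;
  return stmt.substr(0, max_size).append("...");
}
```

Returning a std::string also means the caller passes c_str() of a named value rather than of a temporary created inside a conditional expression.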