[Impala-ASF-CR] IMPALA-9438 Implement atomic operations for aarch64

2020-02-27 Thread Anonymous Coward (Code Review)
huangtianhua...@gmail.com has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15316 )


Change subject: IMPALA-9438 Implement atomic operations for aarch64
..

IMPALA-9438 Implement atomic operations for aarch64

Change-Id: I84e907c1cd9b09d3329e6c836d492dba5f49f5ae
---
A be/src/gutil/atomicops-internals-arm64.h
M be/src/gutil/atomicops.h
2 files changed, 370 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/15316/1
--
To view, visit http://gerrit.cloudera.org:8080/15316
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I84e907c1cd9b09d3329e6c836d492dba5f49f5ae
Gerrit-Change-Number: 15316
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward 


[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15132 )

Change subject: IMPALA-3343: Make impala-shell compatible with python 3.
..


Patch Set 17:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5361/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15132

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462
Gerrit-Change-Number: 15132
Gerrit-PatchSet: 17
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 28 Feb 2020 02:33:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15284 )

Change subject: IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http 
header
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/15284

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4153968551acd58b25c7923c2ebf75ee29a7e76b
Gerrit-Change-Number: 15284
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Fri, 28 Feb 2020 02:03:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6689: Speed up point lookup for Kudu primary key

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15250 )

Change subject: IMPALA-6689: Speed up point lookup for Kudu primary key
..


Patch Set 13: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/15250

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4631cd4d1a528a1152b5cdcb268426f2ba1a0c08
Gerrit-Change-Number: 15250
Gerrit-PatchSet: 13
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 28 Feb 2020 02:03:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6689: Speed up point lookup for Kudu primary key

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15250 )

Change subject: IMPALA-6689: Speed up point lookup for Kudu primary key
..

IMPALA-6689: Speed up point lookup for Kudu primary key

If all primary key columns of the Kudu table are in equivalence
predicates pushed down to Kudu, Kudu will return at most one row.
In this case, we can adjust the cardinality estimation to speed
up point lookup.
This patch sets the input and output cardinality to 1 if the
number of primary key columns in equivalence predicates pushed
down to Kudu equals the total number of primary key columns of
the Kudu table, hence enabling small query optimization.
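
For illustration only (this is not the code in KuduScanNode.java, which is Java): a minimal Python sketch of the rule described above. The function name, its parameters, and the fallback estimate are hypothetical.

```python
def estimate_scan_cardinality(primary_key_cols, equality_predicate_cols, default_estimate):
    """Return 1 when every primary key column has a pushed-down equality predicate.

    All names here are hypothetical; this only mirrors the rule in the commit message.
    """
    primary_key = set(primary_key_cols)
    matched = primary_key & set(equality_predicate_cols)
    # A full match on the primary key means Kudu can return at most one row.
    if primary_key and len(matched) == len(primary_key):
        return 1
    return default_estimate
```

With primary key (id, ts), predicates on both columns give an estimate of 1, while a predicate on id alone keeps the default estimate.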

Testing:
 - Added test cases in the following PlannerTest files: small-query-opt.test,
   disable-codegen.test and kudu.test.
 - Passed all FE tests, including new test cases.

Change-Id: I4631cd4d1a528a1152b5cdcb268426f2ba1a0c08
Reviewed-on: http://gerrit.cloudera.org:8080/15250
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/small-query-opt.test
4 files changed, 180 insertions(+), 4 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/15250

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I4631cd4d1a528a1152b5cdcb268426f2ba1a0c08
Gerrit-Change-Number: 15250
Gerrit-PatchSet: 14
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.

2020-02-27 Thread David Knupp (Code Review)
Hello Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15132

to look at the new patch set (#17).

Change subject: IMPALA-3343: Make impala-shell compatible with python 3.
..

IMPALA-3343: Make impala-shell compatible with python 3.

This patch makes the impala-shell code cross-compatible with python 2 and
python 3. The goal is to wind up with a version of the shell that will pass
python e2e tests irrespective of the version of python used to launch the shell,
under the assumption that the test framework itself will continue to run with
python 2.7.x.

There are a few isolated tests that weren't able to pass under both versions,
and the reasons have been documented in comments in the tests themselves.

Notable changes for reviewers to consider:

- With regard to validating the patch, my assumption is that simply passing
  the existing set of e2e shell tests is sufficient to confirm that the shell
  is functioning properly. No new tests were added.

- Many of the simpler changes derive from the fact that a few built-in functions
  and/or types have either been removed or changed in python 3.x. E.g.,
  xrange and basestring are both gone, dict.iteritems() has been removed,
  dict.items() behaves differently, the unicode() function and the method
  str.decode() have both been removed, etc.

  Also, catching exceptions using "Exception, e" no longer works, and (as most
  know), using print() as a function is required now.
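
  As a rough illustration of the idioms involved (a sketch, not code taken from the patch; both function names are made up), the following forms are accepted by python 2.7 and python 3 alike:

```python
def describe(settings):
    """Return 'key=value' strings using idioms valid on both python 2 and 3."""
    lines = []
    # dict.items() exists on both versions (a list on 2, a view on 3);
    # sorted() works on either, so the code does not care which one it got.
    for key, value in sorted(settings.items()):
        lines.append("{0}={1}".format(key, value))
    return lines


def safe_int(text, default=0):
    """Parse an int using the 'except ... as e' form that both versions accept."""
    try:
        return int(text)
    except ValueError as e:  # "except ValueError, e" is a syntax error on python 3
        # A single parenthesized argument prints the same way on both versions.
        print("could not parse {0!r}: {1}".format(text, e))
        return default
```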

- A new pytest command line option was added in conftest.py to enable a user
  to specify a path to an alternate impala-shell executable to test. It's
  possible to use this to point to an instance of the impala-shell that was
  installed as a standalone python package in a separate virtualenv.

  Example usage:
  USE_THRIFT11_GEN_PY=true impala-py.test --shell_executable=//bin/impala-shell -sv shell/test_shell_commandline.py

  The target virtualenv may be based on either python3 or python2. However,
  this has no effect on the version of python used to run the test framework,
  which remains tied to python 2.7.x for the foreseeable future.
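
  A command line option of this kind is typically registered through pytest's pytest_addoption hook. The sketch below shows the general shape only; it is not the code added to conftest.py, and the resolve_shell helper and its fallback argument are hypothetical.

```python
def pytest_addoption(parser):
    """pytest hook: register an option naming an alternate shell executable."""
    parser.addoption(
        "--shell_executable",
        default=None,
        help="Path to an alternate impala-shell executable to test.",
    )


def resolve_shell(config, fallback):
    """Hypothetical helper: return the user-supplied path, else the fallback."""
    chosen = config.getoption("--shell_executable")
    return chosen if chosen else fallback
```

  In a fixture, `request.config` would be passed as the `config` argument.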

- $IMPALA_HOME/bin/set-pythonpath.sh was updated to properly use the thrift-11
  gen-py files if USE_THRIFT11_GEN_PY is set to "true". This is required for
  testing a version of the impala-shell in a python3-based virtualenv.

- thrift_sasl.py was updated to match the current public alpha, 0.4a1

- The wording of the header changed a bit to include the python version
  used to run the shell.

Starting Impala Shell with no authentication using Python 3.7.5
Opened TCP connection to localhost:21000
...

OR

Starting Impala Shell with LDAP-based authentication using Python 2.7.12
Opened TCP connection to localhost:21000
...

- By far, the biggest hassle has been juggling str versus unicode versus
  bytes data types. Python 2.x was fairly loose and inconsistent in
  how it dealt with strings. As a quick demo of what I mean:

  Python 2.7.12 (default, Nov 12 2018, 14:36:49)
  [GCC 5.4.0 20160609] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> d = 'like a duck'
  >>> d == str(d) == bytes(d) == unicode(d) == d.encode('utf-8') == d.decode('utf-8')
  True

  ...and yet there are weird unexpected gotchas.

  >>> d.decode('utf-8') == d.encode('utf-8')
  True
  >>> d.encode('utf-8') == bytearray(d, 'utf-8')
  True
  >>> d.decode('utf-8') == bytearray(d, 'utf-8')   # fails the eq property?
  False

  As a result of this, the way we handled strings in the impala-shell code had
  become equally loose and inconsistent -- mainly in the form of frequent and
  liberal use of str.encode() and str.decode() -- but things still just worked.

  In python3, there's a much clearer distinction between strings and bytes, and
  as such, much tighter type consistency is expected by standard libs like
  subprocess, re, sqlparse, prettytable, etc., which are used throughout the
  shell. Even simple calls that worked in python 2.x:

  >>> import re
  >>> re.findall('foo', b'foobar')
  ['foo']

  ...can throw exceptions in python 3.x:

  >>> import re
  >>> re.findall('foo', b'foobar')
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/data0/systest/venvs/py3/lib/python3.7/re.py", line 223, in findall
      return _compile(pattern, flags).findall(string)
  TypeError: cannot use a string pattern on a bytes-like object

  Exceptions like this resulted in many, if not most, shell tests failing
  under python 3.
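
  One way to avoid this particular TypeError is to normalize both arguments to text before calling into re. This is a sketch under that assumption and is not necessarily the approach the patch takes; to_text and find_all are hypothetical names.

```python
import re


def to_text(value, encoding="utf-8"):
    """Decode bytes to str; pass anything else through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def find_all(pattern, data):
    """Run re.findall with pattern and data both normalized to text."""
    return re.findall(to_text(pattern), to_text(data))
```

  Under python 3, find_all('foo', b'foobar') then matches instead of raising.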

  At first, I tried to go one-by-one to the site of each failure, and correct
  by checking instance type and re-encoding as necessary, but this only led to
  even more str.encode() calls littering the code, which just seemed like a
  code-smell. (Wiki "code smell" if you don't know the term.)

  What ultimately seemed like a better approach w

[Impala-ASF-CR] IMPALA-9424: Add six to shell/ext-py

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15293 )

Change subject: IMPALA-9424: Add six to shell/ext-py
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/15293

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I003e0008c138ee1f2c290775553d4cfc66e9b7fe
Gerrit-Change-Number: 15293
Gerrit-PatchSet: 4
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 28 Feb 2020 01:37:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9424: Add six to shell/ext-py

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15293 )

Change subject: IMPALA-9424: Add six to shell/ext-py
..

IMPALA-9424: Add six to shell/ext-py

Change-Id: I003e0008c138ee1f2c290775553d4cfc66e9b7fe
Reviewed-on: http://gerrit.cloudera.org:8080/15293
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M LICENSE.txt
M bin/rat_exclude_files.txt
M infra/python/deps/requirements.txt
M shell/.gitignore
A shell/ext-py/six-1.14.0/CHANGES
A shell/ext-py/six-1.14.0/CONTRIBUTORS
A shell/ext-py/six-1.14.0/LICENSE
A shell/ext-py/six-1.14.0/MANIFEST.in
A shell/ext-py/six-1.14.0/README.rst
A shell/ext-py/six-1.14.0/setup.cfg
A shell/ext-py/six-1.14.0/setup.py
A shell/ext-py/six-1.14.0/six.py
A shell/ext-py/six-1.14.0/test_six.py
A shell/ext-py/six-1.14.0/tox.ini
M shell/packaging/requirements.txt
15 files changed, 2,563 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/15293

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I003e0008c138ee1f2c290775553d4cfc66e9b7fe
Gerrit-Change-Number: 15293
Gerrit-PatchSet: 5
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files

2020-02-27 Thread Xiaomeng Zhang (Code Review)
Xiaomeng Zhang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15304 )

Change subject: IMPALA-9389: [DOCS] Support reading zstd text files
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15304/4/docs/topics/impala_txtfile.xml
File docs/topics/impala_txtfile.xml:

http://gerrit.cloudera.org:8080/#/c/15304/4/docs/topics/impala_txtfile.xml@633
PS4, Line 633: Using bzip2, gzip, Snappy-Compressed, or zstd Text Files
I saw the other code review, https://gerrit.cloudera.org/c/15310/. Do we need to 
add deflate here as well?


http://gerrit.cloudera.org:8080/#/c/15304/4/docs/topics/impala_txtfile.xml@653
PS4, Line 653: or zstd-compressed text file is processed, the node doing the
             : work reads the entire file into memory and then decompresses it. Therefore, the node must
             : have enough memory to hold both the compressed and uncompressed data from the text file
For zstd-compressed text files, we use streaming decompression, which does not 
load the whole file at once; it decompresses as it reads.
Note that this is not true for Parquet: we still use block decompression for 
Parquet files.
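
To make the distinction concrete, here is a small Python sketch contrasting the two styles. It uses zlib from the standard library purely as a stand-in for zstd, and it is not Impala's implementation; the function names and chunk size are assumptions.

```python
import io
import zlib


def decompress_streaming(stream, chunk_size=4096):
    """Yield decompressed chunks while reading the compressed stream piecewise."""
    decompressor = zlib.decompressobj()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        data = decompressor.decompress(chunk)
        if data:
            yield data
    tail = decompressor.flush()
    if tail:
        yield tail


def decompress_block(stream):
    """Read the whole compressed payload into memory, then decompress it."""
    return zlib.decompress(stream.read())


# Usage: both paths produce the same bytes; only peak memory use differs.
compressed = zlib.compress(b"row\n" * 1000)
streamed = b"".join(decompress_streaming(io.BytesIO(compressed), chunk_size=64))
```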



--
To view, visit http://gerrit.cloudera.org:8080/15304

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce
Gerrit-Change-Number: 15304
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Xiaomeng Zhang 
Gerrit-Comment-Date: Fri, 28 Feb 2020 00:00:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15304 )

Change subject: IMPALA-9389: [DOCS] Support reading zstd text files
..


Patch Set 4: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/546/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/15304

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce
Gerrit-Change-Number: 15304
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Xiaomeng Zhang 
Gerrit-Comment-Date: Thu, 27 Feb 2020 22:53:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9431 [DOCS] Remove Deflate not supported for text files

2020-02-27 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15310 )

Change subject: IMPALA-9431 [DOCS] Remove Deflate not supported for text files
..


Patch Set 1:

(2 comments)

We should also update text specific documentation page: 
https://impala.apache.org/docs/build/html/topics/impala_txtfile.html

http://gerrit.cloudera.org:8080/#/c/15310/1/docs/topics/impala_file_formats.xml
File docs/topics/impala_file_formats.xml:

http://gerrit.cloudera.org:8080/#/c/15310/1/docs/topics/impala_file_formats.xml@a280
PS1, Line 280:
Instead of removing we could probably add:
Supported for Text, Avro, RC and Sequence files.


http://gerrit.cloudera.org:8080/#/c/15310/1/docs/topics/impala_file_formats.xml@151
PS1, Line 151:   LZO, gzip, bzip2, Snappy
I think we should add Deflate here.



--
To view, visit http://gerrit.cloudera.org:8080/15310

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e1205e4e408f2c20fd8642cccd6c74e7ba9eb40
Gerrit-Change-Number: 15310
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 27 Feb 2020 22:17:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.

2020-02-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15132 )

Change subject: IMPALA-3343: Make impala-shell compatible with python 3.
..


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15132/15/tests/shell/test_shell_commandline.py
File tests/shell/test_shell_commandline.py:

http://gerrit.cloudera.org:8080/#/c/15132/15/tests/shell/test_shell_commandline.py@485
PS15, Line 485:   if SHELL_IS_PYTHON_2:
> The primary assert happens above this line, regardless of python version.
I got confused by the scenario, but I checked this out and played around and 
understand now.

I think this regression is kinda bad, this would affect anyone using non-ascii 
characters with HS2. I think if we don't fix it it's important to check that 
the fallback logic also works for HS2, since that's the behaviour in this 
patchset (so the test shouldn't be skipped either way).

FWIW I played around to understand and it looks like the issue on python2 is 
fixed by this, at least when I'm running impala-shell.sh

+column_names = [cname.decode('utf8') for cname in self.imp_client.get_column_names(self.last_query_handle)]
-column_names = self.imp_client.get_column_names(self.last_query_handle)

   impala-shell.sh -q 'select "?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, 
?, ?,"'
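
A version of the same idea that tolerates names that are already text might look like the sketch below. This is a hypothetical helper for illustration, not something present in the patch set.

```python
def decode_column_names(raw_names, encoding="utf8"):
    """Return column names as text, decoding only those that arrive as bytes."""
    return [
        name.decode(encoding) if isinstance(name, bytes) else name
        for name in raw_names
    ]
```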



--
To view, visit http://gerrit.cloudera.org:8080/15132

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462
Gerrit-Change-Number: 15132
Gerrit-PatchSet: 15
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 27 Feb 2020 22:11:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files

2020-02-27 Thread Anonymous Coward (Code Review)
Hello Andrew Sherman, Abhishek Rawat, Xiaomeng Zhang, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15304

to look at the new patch set (#4).

Change subject: IMPALA-9389: [DOCS] Support reading zstd text files
..

IMPALA-9389: [DOCS] Support reading zstd text files

In impala_txtfile.xml:
- Line 650, changed to suggested "Impala can read compressed ..."
- Line 676, zstd to .zst
In impala_file_formats.xml:
- Line 315, removed misleading sentence leaving "For Parquet and text files 
only"

Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce
---
M docs/topics/impala_file_formats.xml
M docs/topics/impala_txtfile.xml
2 files changed, 26 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/15304/4
--
To view, visit http://gerrit.cloudera.org:8080/15304

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce
Gerrit-Change-Number: 15304
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Xiaomeng Zhang 


[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15304 )

Change subject: IMPALA-9389: [DOCS] Support reading zstd text files
..


Patch Set 4:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/546/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/15304

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce
Gerrit-Change-Number: 15304
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Xiaomeng Zhang 
Gerrit-Comment-Date: Thu, 27 Feb 2020 22:01:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15284 )

Change subject: IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http 
header
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5360/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15284

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4153968551acd58b25c7923c2ebf75ee29a7e76b
Gerrit-Change-Number: 15284
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Thu, 27 Feb 2020 21:45:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6689: Speed up point lookup for Kudu primary key

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15250 )

Change subject: IMPALA-6689: Speed up point lookup for Kudu primary key
..


Patch Set 13: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15250

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4631cd4d1a528a1152b5cdcb268426f2ba1a0c08
Gerrit-Change-Number: 15250
Gerrit-PatchSet: 13
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 27 Feb 2020 21:36:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6689: Speed up point lookup for Kudu primary key

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15250 )

Change subject: IMPALA-6689: Speed up point lookup for Kudu primary key
..


Patch Set 13:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5425/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15250

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4631cd4d1a528a1152b5cdcb268426f2ba1a0c08
Gerrit-Change-Number: 15250
Gerrit-PatchSet: 13
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 27 Feb 2020 21:37:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15284 )

Change subject: IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http 
header
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5424/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/15284

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4153968551acd58b25c7923c2ebf75ee29a7e76b
Gerrit-Change-Number: 15284
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Thu, 27 Feb 2020 21:35:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header

2020-02-27 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15284 )

Change subject: IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http 
header
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15284/4/tests/shell/test_shell_commandline.py
File tests/shell/test_shell_commandline.py:

http://gerrit.cloudera.org:8080/#/c/15284/4/tests/shell/test_shell_commandline.py@859
PS4, Line 859: 20
After fixing the above error so that this test is actually running again, it 
started failing in Jenkins runs.

As far as I can tell, the problem is really just that the shell is genuinely 
slower than this timeout for large query files.

For comparison, it takes about 7 seconds running locally for me. I dug into it, 
and of that about 4 seconds are spent in parse_query_text, which uses some 
sqlparse functions to split the query text into multiple queries. About 2 
seconds are spent in sqlparse.split() and another 2 seconds are spent in 
strip_comments().

That seems like an unreasonable overhead for what it's accomplishing. For the 
sake of getting this patch in, I would prefer to just extend the timeout for 
now, but we should probably think about how we can improve this, since 
otherwise impala-shell has pretty bad performance for even moderately large 
queries.
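
For anyone wanting to reproduce this kind of measurement, a minimal timing harness could look like the following. It is a sketch only: naive_split is a hypothetical stand-in, not sqlparse.split(), and it ignores quoting and comments.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def naive_split(query_text):
    """Hypothetical stand-in splitter: splits on ';' and nothing more."""
    return [part.strip() for part in query_text.split(";") if part.strip()]


# Usage: time the splitter on a large, repetitive query file body.
statements, elapsed = timed(naive_split, "select 1; select 2;" * 1000)
```

Swapping the real parsing function in for naive_split would show where the seconds go.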

I filed IMPALA-9436



--
To view, visit http://gerrit.cloudera.org:8080/15284

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4153968551acd58b25c7923c2ebf75ee29a7e76b
Gerrit-Change-Number: 15284
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Thu, 27 Feb 2020 21:34:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9424: Add six to shell/ext-py

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15293 )

Change subject: IMPALA-9424: Add six to shell/ext-py
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5423/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15293

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I003e0008c138ee1f2c290775553d4cfc66e9b7fe
Gerrit-Change-Number: 15293
Gerrit-PatchSet: 4
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 27 Feb 2020 21:10:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9424: Add six to shell/ext-py

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15293 )

Change subject: IMPALA-9424: Add six to shell/ext-py
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15293

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I003e0008c138ee1f2c290775553d4cfc66e9b7fe
Gerrit-Change-Number: 15293
Gerrit-PatchSet: 4
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 27 Feb 2020 21:10:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header

2020-02-27 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15284 )

Change subject: IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http 
header
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/15284/1/shell/make_shell_tarball.sh
File shell/make_shell_tarball.sh:

http://gerrit.cloudera.org:8080/#/c/15284/1/shell/make_shell_tarball.sh@123
PS1, Line 123: cp ${SHELL_HOME}/shell_exceptions.py ${TARBALL_ROOT}/lib
> "${SHELL_HOME}/exceptions.py"
Done


http://gerrit.cloudera.org:8080/#/c/15284/1/shell/packaging/make_python_package.sh
File shell/packaging/make_python_package.sh:

http://gerrit.cloudera.org:8080/#/c/15284/1/shell/packaging/make_python_package.sh@59
PS1, Line 59:   cp "${SHELL_HOME}/shell_exceptions.py" "${MODULE_LIB_DIR}"
> "${SHELL_HOME}/exceptions.py"
Done


http://gerrit.cloudera.org:8080/#/c/15284/1/shell/util.py
File shell/util.py:

http://gerrit.cloudera.org:8080/#/c/15284/1/shell/util.py@1
PS1, Line 1:
> Well, now that you've started this, I think this file should actually be ca
So I ran into a problem with this because 'exceptions' is a built in python 
module already, so you can't do something like "from exceptions import 
RPCException". I solved it by naming the file "shell_exceptions" instead
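
A quick way to check for this kind of collision before picking a file name is to look at the interpreter's list of built-in modules. The helper below is a hypothetical sketch; on python 2 it would report 'exceptions' as a collision, while python 3 no longer ships that module.

```python
import sys


def collides_with_builtin(module_name):
    """True if module_name is compiled into this interpreter as a built-in module."""
    return module_name in sys.builtin_module_names
```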



--
To view, visit http://gerrit.cloudera.org:8080/15284

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4153968551acd58b25c7923c2ebf75ee29a7e76b
Gerrit-Change-Number: 15284
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Thu, 27 Feb 2020 20:59:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header

2020-02-27 Thread Thomas Tauber-Marshall (Code Review)
Hello David Knupp, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15284

to look at the new patch set (#4).

Change subject: IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http 
header
..

IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header

The 'Expect: 100-continue' http header allows http clients to send
only the headers for their request, get a confirmation back from the
server that the headers are valid, and only then send the body of the
request, avoiding the overhead of sending large requests that will
ultimately fail.

This patch adds support for this in the HS2 HTTP server by having
THttpServer look for the header and, if it is present and the request
is validated, return a '100 Continue' response before reading the
body of the request.

It also adds support for using this header on large requests sent by
impala-shell.
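
The server-side decision described above can be sketched roughly as
follows. This is an illustration only, not the THttpServer code: the
function name interim_response, the is_valid callback and the 401
rejection status are all assumptions made for the example.

```python
def interim_response(headers, is_valid):
    """Return the bytes to send before reading the body, or None.

    headers: dict mapping header names to values.
    is_valid: callable that validates the (lower-cased) headers.
    """
    # Header names are case-insensitive in HTTP, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    expect = lowered.get("expect", "").strip().lower()
    if expect != "100-continue":
        # No handshake requested: the client sends the body unprompted.
        return None
    if not is_valid(lowered):
        # Illustrative final status; the body is never requested.
        return b"HTTP/1.1 401 Unauthorized\r\n\r\n"
    # Headers look fine: tell the client to go ahead and send the body.
    return b"HTTP/1.1 100 Continue\r\n\r\n"


print(interim_response({"Expect": "100-continue"}, lambda h: True))
```

The point of the handshake is visible in the rejection branch: a client
with a large request learns that it will fail before paying the cost of
transmitting the body.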

Testing:
- This case is covered by the existing test_large_sql; however, that
  test was previously broken and passing spuriously. This patch fixes
  the test.

Change-Id: I4153968551acd58b25c7923c2ebf75ee29a7e76b
---
M be/src/transport/THttpServer.cpp
M be/src/transport/THttpTransport.cpp
M be/src/transport/THttpTransport.h
M shell/THttpClient.py
M shell/impala_client.py
M shell/impala_shell.py
M shell/make_shell_tarball.sh
M shell/packaging/make_python_package.sh
A shell/shell_exceptions.py
M tests/shell/test_shell_commandline.py
10 files changed, 99 insertions(+), 52 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/15284/4
--
To view, visit http://gerrit.cloudera.org:8080/15284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4153968551acd58b25c7923c2ebf75ee29a7e76b
Gerrit-Change-Number: 15284
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-9431 [DOCS] Remove Deflate not supported for text files

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15310 )

Change subject: IMPALA-9431 [DOCS] Remove Deflate not supported for text files
..


Patch Set 1: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/545/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/15310
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e1205e4e408f2c20fd8642cccd6c74e7ba9eb40
Gerrit-Change-Number: 15310
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 27 Feb 2020 20:03:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9228: ORC scanner reads rows into scratch batch

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15104 )

Change subject: IMPALA-9228: ORC scanner reads rows into scratch batch
..


Patch Set 4:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/5359/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/15104
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56db0325dee283d73742ebbae412d19693fac0ca
Gerrit-Change-Number: 15104
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 27 Feb 2020 19:58:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9431 [DOCS] Remove Deflate not supported for text files

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15310 )

Change subject: IMPALA-9431 [DOCS] Remove Deflate not supported for text files
..


Patch Set 1:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/545/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/15310
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e1205e4e408f2c20fd8642cccd6c74e7ba9eb40
Gerrit-Change-Number: 15310
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 27 Feb 2020 19:55:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9431 [DOCS] Remove Deflate not supported for text files

2020-02-27 Thread Anonymous Coward (Code Review)
kh...@cloudera.com has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15310


Change subject: IMPALA-9431 [DOCS] Remove Deflate not supported for text files
..

IMPALA-9431 [DOCS] Remove Deflate not supported for text files

Removed "Not supported for text files"

Change-Id: I9e1205e4e408f2c20fd8642cccd6c74e7ba9eb40
---
M docs/topics/impala_file_formats.xml
1 file changed, 0 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/15310/1
--
To view, visit http://gerrit.cloudera.org:8080/15310
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9e1205e4e408f2c20fd8642cccd6c74e7ba9eb40
Gerrit-Change-Number: 15310
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward 


[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files

2020-02-27 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15304 )

Change subject: IMPALA-9389: [DOCS] Support reading zstd text files
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/15304/3/docs/topics/impala_file_formats.xml
File docs/topics/impala_file_formats.xml:

http://gerrit.cloudera.org:8080/#/c/15304/3/docs/topics/impala_file_formats.xml@315
PS3, Line 315: For Parquet and text files only. Impala can read 
zstd-encoded text files written by Hive
"Impala can read zstd-encoded text files written by Hive
  (streaming) or compressed by the zStandard library (block)."

I think this is slightly misleading since the above is entirely true for zstd 
compressed text files. Also streaming/block are internal details which we don't 
necessarily have to put in the documentation.

For zstd compressed Parquet files, we support both reading and writing. This 
statement would be misleading since it seems we only support reading. Also, 
compressing parquet files requires page level compression and so if someone 
uses the zstd lib to compress a parquet file (and not doing page level 
compression) Impala cannot read/uncompress it. IMPALA-9201 is a related JIRA.

I think I am happy with just having the following here:
"For Parquet and text files only"

In other parts of documentation we anyways cover the fact that Impala can only 
read text compressed files and this is no different for the new zstd support. 
And it can read/write parquet compressed files.


http://gerrit.cloudera.org:8080/#/c/15304/3/docs/topics/impala_txtfile.xml
File docs/topics/impala_txtfile.xml:

http://gerrit.cloudera.org:8080/#/c/15304/3/docs/topics/impala_txtfile.xml@650
PS3, Line 650: capability. Impala can read zstd-encoded text files 
written by Hive (streaming) or compressed
I don't think it is necessary to document the details such as streaming/block. 
It doesn't help the documentation but only raises more questions.

Also, I am not sure this is only true for zstd. I would think this is true for 
other "text" compression formats also. And if that is the case we probably 
should just add a generic statement something like this:

"Impala can read compressed text files written by Hive or compressed by the 
standard library implementation"

@Xiaomeng could you please confirm this? I think this is true for all supported 
text compression codecs.


http://gerrit.cloudera.org:8080/#/c/15304/3/docs/topics/impala_txtfile.xml@676
PS3, Line 676:   .gz, .snappy, or 
zstd. The extensions
I think the extension is '.zst'



--
To view, visit http://gerrit.cloudera.org:8080/15304
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce
Gerrit-Change-Number: 15304
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Xiaomeng Zhang 
Gerrit-Comment-Date: Thu, 27 Feb 2020 19:43:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9228: ORC scanner reads rows into scratch batch

2020-02-27 Thread Gabor Kaszab (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15104

to look at the new patch set (#4).

Change subject: IMPALA-9228: ORC scanner reads rows into scratch batch
..

IMPALA-9228: ORC scanner reads rows into scratch batch

For performance reasons, this change enhances the ORC scanner to
populate a scratch batch in a column-by-column manner using data from
the column readers. Once this is done, the Parquet code is reused to
apply runtime filters and conjuncts and to populate the outgoing row
batch.

This approach reduces the number of virtual function calls and takes
advantage of the columnar orientation of the data to enhance scan
performance. Additionally, introducing the scratch batch concept also
opens the door for codegen runtime filtering and applying conjuncts.
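
A toy model of the two phases described above, as a sketch only. The
function names fill_scratch_batch and transfer_rows and the dict-based
row layout are assumptions for illustration; they are not the scanner's
actual classes.

```python
def fill_scratch_batch(columns, num_rows):
    """Materialize rows column by column into a scratch batch.

    columns: dict of column name -> list of decoded values.
    Each column is copied in one tight loop, which is where the
    columnar layout pays off.
    """
    scratch = [dict() for _ in range(num_rows)]
    for name, values in columns.items():  # one pass per column
        for row_idx in range(num_rows):
            scratch[row_idx][name] = values[row_idx]
    return scratch


def transfer_rows(scratch, conjuncts):
    """Apply the filters to the scratch batch and return surviving rows."""
    return [row for row in scratch if all(pred(row) for pred in conjuncts)]


columns = {"id": [1, 2, 3, 4], "price": [5.0, 12.5, 7.25, 30.0]}
scratch = fill_scratch_batch(columns, 4)
out = transfer_rows(
    scratch, [lambda r: r["price"] > 6.0, lambda r: r["id"] != 4])
print(out)
```

Because filtering only ever sees the scratch batch, the second phase does
not need to know which file format produced the rows.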

Testing:
  - Re-ran the full test suite to verify that no regression is
    introduced.
  - Checked the performance impact by running the TPCH workload on a
    scale 25 database using single_node_perf_run.py. The total query
    runtime decreased by 0-20% depending on how scan-heavy the
    particular query was. The more scan-heavy the query, the larger
    the performance gain I observed.

Change-Id: I56db0325dee283d73742ebbae412d19693fac0ca
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/CMakeLists.txt
R be/src/exec/hdfs-columnar-scanner-ir.cc
A be/src/exec/hdfs-columnar-scanner.cc
A be/src/exec/hdfs-columnar-scanner.h
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/hdfs-orc-scanner.h
M be/src/exec/hdfs-scanner.h
M be/src/exec/orc-column-readers.cc
M be/src/exec/orc-column-readers.h
M be/src/exec/parquet/CMakeLists.txt
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
R be/src/exec/scratch-tuple-batch.h
15 files changed, 464 insertions(+), 140 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/15104/4
--
To view, visit http://gerrit.cloudera.org:8080/15104
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I56db0325dee283d73742ebbae412d19693fac0ca
Gerrit-Change-Number: 15104
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15304 )

Change subject: IMPALA-9389: [DOCS] Support reading zstd text files
..


Patch Set 3: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/544/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/15304
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce
Gerrit-Change-Number: 15304
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Xiaomeng Zhang 
Gerrit-Comment-Date: Thu, 27 Feb 2020 18:56:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15304 )

Change subject: IMPALA-9389: [DOCS] Support reading zstd text files
..


Patch Set 3:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/544/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/15304
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce
Gerrit-Change-Number: 15304
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Xiaomeng Zhang 
Gerrit-Comment-Date: Thu, 27 Feb 2020 18:47:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files

2020-02-27 Thread Anonymous Coward (Code Review)
Hello Andrew Sherman, Abhishek Rawat, Xiaomeng Zhang, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15304

to look at the new patch set (#3).

Change subject: IMPALA-9389: [DOCS] Support reading zstd text files
..

IMPALA-9389: [DOCS] Support reading zstd text files

In impala_txtfile.xml:
- Line 633 (was 644), added zstd.
- Line 650, described zstd file that can be read.
- In table Text Format Support in Impala, added zstd to column Compression 
Codecs
In impala_file_formats.xml:
- In the table, added zstd to column Compression Codecs for file type Text
- Updated definition list for Zstd term to cover text files.

Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce
---
M docs/topics/impala_file_formats.xml
M docs/topics/impala_txtfile.xml
2 files changed, 27 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/15304/3
--
To view, visit http://gerrit.cloudera.org:8080/15304
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce
Gerrit-Change-Number: 15304
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Xiaomeng Zhang 


[Impala-ASF-CR] PoC: IMPALA-9228: ORC scanner reads rows into scratch batch

2020-02-27 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has restored this change. ( http://gerrit.cloudera.org:8080/15104 )

Change subject: PoC: IMPALA-9228: ORC scanner reads rows into scratch batch
..


Restored
--
To view, visit http://gerrit.cloudera.org:8080/15104
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: restore
Gerrit-Change-Id: I56db0325dee283d73742ebbae412d19693fac0ca
Gerrit-Change-Number: 15104
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-6663: Expose current DDL metrics on WebUI

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13806 )

Change subject: IMPALA-6663: Expose current DDL metrics on WebUI
..


Patch Set 15: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5421/


--
To view, visit http://gerrit.cloudera.org:8080/13806
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0ed76f134bad6d3b3d4dce132365a53a01e9512a
Gerrit-Change-Number: 13806
Gerrit-PatchSet: 15
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 27 Feb 2020 18:15:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files

2020-02-27 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15304 )

Change subject: IMPALA-9389: [DOCS] Support reading zstd text files
..


Patch Set 2:

(1 comment)

I think (Abhishek and Xiaomeng to confirm) that zstd is most easily explained 
as being another form of text compression, like gzip, bzip2, or Snappy.

http://gerrit.cloudera.org:8080/#/c/15304/2/docs/topics/impala_txtfile.xml
File docs/topics/impala_txtfile.xml:

http://gerrit.cloudera.org:8080/#/c/15304/2/docs/topics/impala_txtfile.xml@644
PS2, Line 644: Using gzip, bzip2, or Snappy-Compressed Text 
Files
Maybe this is the right place to add doc for zstd?
I think zstd is like these other formats.



--
To view, visit http://gerrit.cloudera.org:8080/15304
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce
Gerrit-Change-Number: 15304
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Xiaomeng Zhang 
Gerrit-Comment-Date: Thu, 27 Feb 2020 17:32:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8690 (prep 3): Factor out common code for cache implementations

2020-02-27 Thread Joe McDonnell (Code Review)
Joe McDonnell has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15179 )

Change subject: IMPALA-8690 (prep 3): Factor out common code for cache 
implementations
..

IMPALA-8690 (prep 3): Factor out common code for cache implementations

The existing cache implementation handles LRU/FIFO eviction algorithms,
but it also implements several components that are useful for other
cache implementations. Specifically, there is a simple hash table
(HandleTable) and a simple sharding implementation (ShardedCache).
These can be reused across other eviction algorithms by making them
generic.

This pulls them out of be/src/util/cache/cache.cc and into a new
be/src/util/cache/cache-internal.h file. To make the HandleTable
generic, this introduces a HandleBase class that contains common
code between the implementations (such as the key, the value,
the hash, etc). The HandleTable works on this base class, and the
RLHandle now derives from HandleBase.

To support this, Allocate/Free needs to treat the handle as an object
(calling constructors/destructors) rather than treating it like
a chunk of memory (or simple struct).

ShardedCache is made generic by having cache implementations derive
from a base CacheShard class that defines the appropriate methods
needed by the sharding class. This is purely an interface, and the
base class defines no functions. The existing CacheShard is renamed
to RLCachedShard and derives from this class.
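
The split between the generic sharding layer and the per-algorithm shard
can be sketched as below. This is a simplified model, not the code in
cache-internal.h: FifoShard, the method names and the hash-modulo routing
are assumptions chosen to keep the example short.

```python
class CacheShard:
    """Interface only: each eviction algorithm supplies its own shard."""

    def insert(self, key, value):
        raise NotImplementedError

    def lookup(self, key):
        raise NotImplementedError


class FifoShard(CacheShard):
    """Toy FIFO shard, standing in for a real eviction policy."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}  # dicts keep insertion order (Python 3.7+)

    def insert(self, key, value):
        if key not in self.entries and len(self.entries) >= self.capacity:
            oldest = next(iter(self.entries))  # first inserted key
            del self.entries[oldest]
        self.entries[key] = value

    def lookup(self, key):
        return self.entries.get(key)


class ShardedCache:
    """Routes each key to one shard by hash; knows nothing about eviction."""

    def __init__(self, make_shard, num_shards):
        self.shards = [make_shard() for _ in range(num_shards)]

    def _shard_for(self, key):
        return self.shards[hash(key) % len(self.shards)]

    def insert(self, key, value):
        self._shard_for(key).insert(key, value)

    def lookup(self, key):
        return self._shard_for(key).lookup(key)


cache = ShardedCache(lambda: FifoShard(capacity=2), num_shards=2)
for k in (0, 2, 4, 1):
    cache.insert(k, "value-%d" % k)
print(cache.lookup(0), cache.lookup(4), cache.lookup(1))
```

Swapping in a different eviction algorithm then means passing a different
shard factory; ShardedCache itself stays unchanged.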

Testing:
 - Release core run with a data cache enabled
 - ASAN core
 - The cache-test backend test continues to pass

Change-Id: I67294244a3e8a2812f1482fe786bf7f8e6ce054e
Reviewed-on: http://gerrit.cloudera.org:8080/15179
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
---
A be/src/util/cache/cache-internal.h
M be/src/util/cache/cache.cc
M bin/rat_exclude_files.txt
3 files changed, 426 insertions(+), 286 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/15179
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I67294244a3e8a2812f1482fe786bf7f8e6ce054e
Gerrit-Change-Number: 15179
Gerrit-PatchSet: 11
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-8674: fix bug where REMOTE runtime filter always marked disabled

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15308 )

Change subject: IMPALA-8674: fix bug where REMOTE runtime filter always marked 
disabled
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5358/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15308
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I82a5a776103abd0a6d73336bebc65e22b4e13fef
Gerrit-Change-Number: 15308
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 27 Feb 2020 16:43:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8674: fix bug where REMOTE runtime filter always marked disabled

2020-02-27 Thread Riza Suminto (Code Review)
Riza Suminto has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15308


Change subject: IMPALA-8674: fix bug where REMOTE runtime filter always marked 
disabled
..

IMPALA-8674: fix bug where REMOTE runtime filter always marked disabled

When a runtime filter has a remote target, the coordinator will Disable
the FilterState upon arrival of the last filter update to prevent
another update towards that filter. As a consequence, such a runtime
filter will always be displayed as disabled in the runtime profile
(the Enabled column is equal to false in the Final filter table), when
in reality the runtime filter is complete and successfully published
to all remote targets. The Enabled column should correctly distinguish
between a failed runtime filter and a complete runtime filter. To do
so, we add an all_publish_complete_ flag in the FilterState class and
set it to true upon completion of the final runtime filter publish. If
all_publish_complete_ is true, then mark that runtime filter as
enabled.
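
A toy model of the state handling described above, as a sketch under
stated assumptions: the pending_updates counter and the method names are
hypothetical, and the exact rule used by shown_as_enabled is a guess at
how the two flags combine, not the coordinator's code.

```python
class FilterState:
    """Toy coordinator-side state for one runtime filter."""

    def __init__(self, pending_updates):
        self.pending_updates = pending_updates
        self.disabled = False
        self.all_publish_complete = False

    def disable(self):
        # Stops further updates; used on failure and after the last update.
        self.disabled = True

    def apply_update(self):
        if self.disabled:
            return
        self.pending_updates -= 1
        if self.pending_updates == 0:
            self.disable()                    # no more updates expected
            self.all_publish_complete = True  # but the filter succeeded

    def shown_as_enabled(self):
        # Value for the Enabled column of the final filter table.
        return self.all_publish_complete or not self.disabled


complete = FilterState(pending_updates=2)
complete.apply_update()
complete.apply_update()

failed = FilterState(pending_updates=2)
failed.apply_update()
failed.disable()

print(complete.shown_as_enabled(), failed.shown_as_enabled())
```

Both filters end up disabled internally; only the extra flag lets the
profile tell the completed one apart from the failed one.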

Testing:
- Add row regex in runtime_filters.test, query 6, to verify REMOTE
  runtime filter is marked as enabled in final filter table
- Run and pass test_runtime_filters.py
- Run and pass core tests

Change-Id: I82a5a776103abd0a6d73336bebc65e22b4e13fef
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test
4 files changed, 16 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/15308/1
--
To view, visit http://gerrit.cloudera.org:8080/15308
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I82a5a776103abd0a6d73336bebc65e22b4e13fef
Gerrit-Change-Number: 15308
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 


[Impala-ASF-CR] IMPALA-6689: Speed up point lookup for Kudu primary key

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15250 )

Change subject: IMPALA-6689: Speed up point lookup for Kudu primary key
..


Patch Set 12:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5420/


--
To view, visit http://gerrit.cloudera.org:8080/15250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4631cd4d1a528a1152b5cdcb268426f2ba1a0c08
Gerrit-Change-Number: 15250
Gerrit-PatchSet: 12
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 27 Feb 2020 15:58:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: Asynchronous code generation

2020-02-27 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15105 )

Change subject: WIP: Asynchronous code generation
..


Patch Set 16:

(3 comments)

The code LGTM, only found a few nits.

http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/codegen/codegen-fn-ptr.h
File be/src/codegen/codegen-fn-ptr.h:

http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/codegen/codegen-fn-ptr.h@38
PS16, Line 38: Decide memory order. I think mem_order_relaxed should be enough 
as we
 :   /// only need atomicity and the pointers can be set 
independently of each other.
nit: I think you can delete this part


http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/exec/hdfs-scanner.h@465
PS16, Line 465: /
nit: we usually use /// for doc comments.


http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/runtime/fragment-instance-state.cc@358
PS16, Line 358: /
nit: too many /



--
To view, visit http://gerrit.cloudera.org:8080/15105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b
Gerrit-Change-Number: 15105
Gerrit-PatchSet: 16
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 27 Feb 2020 14:02:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6663: Expose current DDL metrics on WebUI

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13806 )

Change subject: IMPALA-6663: Expose current DDL metrics on WebUI
..


Patch Set 15:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5421/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/13806
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0ed76f134bad6d3b3d4dce132365a53a01e9512a
Gerrit-Change-Number: 13806
Gerrit-PatchSet: 15
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 27 Feb 2020 13:47:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6663: Expose current DDL metrics on WebUI

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13806 )

Change subject: IMPALA-6663: Expose current DDL metrics on WebUI
..


Patch Set 15: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13806
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0ed76f134bad6d3b3d4dce132365a53a01e9512a
Gerrit-Change-Number: 13806
Gerrit-PatchSet: 15
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 27 Feb 2020 13:47:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6663: Expose current DDL metrics on WebUI

2020-02-27 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13806 )

Change subject: IMPALA-6663: Expose current DDL metrics on WebUI
..


Patch Set 14: Code-Review+2

Thanks for applying the changes!


--
To view, visit http://gerrit.cloudera.org:8080/13806
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0ed76f134bad6d3b3d4dce132365a53a01e9512a
Gerrit-Change-Number: 13806
Gerrit-PatchSet: 14
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 27 Feb 2020 13:47:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner

2020-02-27 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15051 )

Change subject: IMPALA-9226: Improve string allocations of the ORC scanner
..


Patch Set 14: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15051
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f
Gerrit-Change-Number: 15051
Gerrit-PatchSet: 14
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 27 Feb 2020 13:20:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15051 )

Change subject: IMPALA-9226: Improve string allocations of the ORC scanner
..


Patch Set 14:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5357/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15051
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f
Gerrit-Change-Number: 15051
Gerrit-PatchSet: 14
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 27 Feb 2020 11:48:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6689: Speed up point lookup for Kudu primary key

2020-02-27 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15250 )

Change subject: IMPALA-6689: Speed up point lookup for Kudu primary key
..


Patch Set 12:

Restarted the build as the failure was unrelated.


--
To view, visit http://gerrit.cloudera.org:8080/15250

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4631cd4d1a528a1152b5cdcb268426f2ba1a0c08
Gerrit-Change-Number: 15250
Gerrit-PatchSet: 12
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 27 Feb 2020 11:31:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6689: Speed up point lookup for Kudu primary key

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15250 )

Change subject: IMPALA-6689: Speed up point lookup for Kudu primary key
..


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5420/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15250

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4631cd4d1a528a1152b5cdcb268426f2ba1a0c08
Gerrit-Change-Number: 15250
Gerrit-PatchSet: 12
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 27 Feb 2020 11:30:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9426 Download Python dependencies even skipping bootstrap toolchain

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15297 )

Change subject: IMPALA-9426 Download Python dependencies even skipping 
bootstrap toolchain
..


Patch Set 1:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5419/


--
To view, visit http://gerrit.cloudera.org:8080/15297

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I012314793ffb521001951ab7ec3d7a3ba737c405
Gerrit-Change-Number: 15297
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 27 Feb 2020 11:23:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner

2020-02-27 Thread Norbert Luksa (Code Review)
Norbert Luksa has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15051 )

Change subject: IMPALA-9226: Improve string allocations of the ORC scanner
..


Patch Set 14:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15051/12/be/src/exec/orc-column-readers.cc
File be/src/exec/orc-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/15051/12/be/src/exec/orc-column-readers.cc@180
PS12, Line 180: >(
> I think that >= offsets.size() - 1 is needed, because of the offsets[index
Done


http://gerrit.cloudera.org:8080/#/c/15051/12/be/src/exec/orc-column-readers.cc@184
PS12, Line 184: src_ptr = blob_ + offsets[index];
  : src_len = offsets[index + 1] - offsets[index];
> I think that we cannot trust completely in the length values at the moment
Thanks, Csaba, for investigating. I uploaded a PS with the above check and will
open ORC JIRAs for both of your comments.
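
For readers following the two comments above, here is a minimal sketch of the
kind of check being discussed. It is not the code from the patch: the function
name, the StringSlice struct and the parameters are made up for illustration,
and only the offsets[index] / offsets[index + 1] arithmetic comes from the
quoted lines.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical view of one string value inside a contiguous blob.
struct StringSlice {
  const char* ptr;
  int64_t len;
};

// Returns false instead of reading out of bounds. 'blob', 'blob_len' and
// 'offsets' are illustrative stand-ins, not the real reader members.
bool GetSlice(const char* blob, int64_t blob_len,
    const std::vector<int64_t>& offsets, size_t index, StringSlice* out) {
  // offsets[index + 1] is read below, so the last valid index is
  // offsets.size() - 2; hence the '>= offsets.size() - 1' comparison.
  if (offsets.size() < 2 || index >= offsets.size() - 1) return false;
  int64_t begin = offsets[index];
  int64_t end = offsets[index + 1];
  // Do not trust the stored values blindly: they must be ordered and must
  // stay inside the blob.
  if (begin < 0 || end < begin || end > blob_len) return false;
  out->ptr = blob + begin;
  out->len = end - begin;
  return true;
}
```

A caller would treat a false return as a corrupt or unexpected batch rather
than dereferencing the pointer.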



--
To view, visit http://gerrit.cloudera.org:8080/15051

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f
Gerrit-Change-Number: 15051
Gerrit-PatchSet: 14
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 27 Feb 2020 11:02:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner

2020-02-27 Thread Norbert Luksa (Code Review)
Norbert Luksa has uploaded a new patch set (#14). ( 
http://gerrit.cloudera.org:8080/15051 )

Change subject: IMPALA-9226: Improve string allocations of the ORC scanner
..

IMPALA-9226: Improve string allocations of the ORC scanner

Currently the OrcColumnReader copies values from the
orc::StringVectorBatch one by one. Since ORC 1.6, the blob that
contains the pointed-to values is moved to the StringVectorBatch,
so we can copy the blob as a whole.

Besides the above improvement, this commit also enables the
LazyEncoding option for the ORC reader. This way, for stripes
with DICTIONARY_ENCODING[_V2], EncodedStringVectorBatch contains
the data in a dictionaryBlob from which the data can be acquired
with the given indices and lengths.

Tests:
 * Ran ORC scanner tests (query_tests/test_scanners.py::TestOrc)
   and tpch query tests.
 * Tested performance on tpch.lineitem table with scale=25,
   running queries that select the min of string columns.
   Some results:
   col_name       | encoding   | before | after  | speedup
   ========================================================
   l_comment      | DIRECT     | 16.42s | 14.38s | 14%
   l_shipinstruct | DICTIONARY | 5.26s  | 3.80s  | 32%
   l_commitdate   | DICTIONARY | 5.46s  | 5.19s  | 5%
   all string col | BOTH       | 39.06s | 32.18s | 21%

   The queries were run on a desktop PC with MT_DOP and NUM_NODES
   set to 1.
 * Also ran TPC-H queries on the TPC-H benchmark, where some
   queries' runtime improved by around 10-15%, while there was
   no regression for the others.

Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f
---
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/hdfs-orc-scanner.h
M be/src/exec/orc-column-readers.cc
M be/src/exec/orc-column-readers.h
4 files changed, 135 insertions(+), 42 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/15051/14
--
To view, visit http://gerrit.cloudera.org:8080/15051

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f
Gerrit-Change-Number: 15051
Gerrit-PatchSet: 14
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] WIP: Asynchronous code generation

2020-02-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15105 )

Change subject: WIP: Asynchronous code generation
..


Patch Set 16:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5356/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15105

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b
Gerrit-Change-Number: 15105
Gerrit-PatchSet: 16
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 27 Feb 2020 09:54:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: Asynchronous code generation

2020-02-27 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15105 )

Change subject: WIP: Asynchronous code generation
..


Patch Set 16:

Rebased and conflicts resolved.


--
To view, visit http://gerrit.cloudera.org:8080/15105

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b
Gerrit-Change-Number: 15105
Gerrit-PatchSet: 16
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 27 Feb 2020 09:08:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: Asynchronous code generation

2020-02-27 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#16). ( 
http://gerrit.cloudera.org:8080/15105 )

Change subject: WIP: Asynchronous code generation
..

WIP: Asynchronous code generation

This commit introduces optional asynchronous code generation.

Asynchronous code generation means that instead of waiting for codegen
to finish, the query starts in interpreted mode while codegen is done on
another thread.

All the function pointers that point to codegen'd functions are changed
to be atomic, wrapped in a CodegenFnPtr. These are initialised to
nullptr and as long as they are nullptr, the corresponding interpreted
functions are used (as before). When code generation is ready, the
function pointers are set by the codegen thread. No synchronisation is
needed as the function pointers are atomic and it is not a problem if,
at a given moment, only a subset of the codegen'd function pointers are
set and the rest are interpreted.
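
The fallback scheme described above can be sketched roughly as follows. This
is an illustration only: AtomicFnPtr, AddFn and the two Add functions are
hypothetical stand-ins, and the real CodegenFnPtr in the patch may look
different.

```cpp
#include <atomic>

// Hypothetical wrapper in the spirit of CodegenFnPtr: an atomic function
// pointer that starts out as nullptr.
template <typename FnType>
class AtomicFnPtr {
 public:
  FnType load() const { return ptr_.load(std::memory_order_acquire); }
  void store(FnType fn) { ptr_.store(fn, std::memory_order_release); }

 private:
  std::atomic<FnType> ptr_{nullptr};
};

using AddFn = int (*)(int, int);

// Stand-ins for the interpreted and the codegen'd version of one function.
int InterpretedAdd(int a, int b) { return a + b; }
int CodegendAdd(int a, int b) { return a + b; }

// The query thread reads the pointer once per call. While it is still
// nullptr the interpreted path is used; afterwards the codegen'd one.
int Evaluate(const AtomicFnPtr<AddFn>& codegend_fn, int a, int b) {
  AddFn fn = codegend_fn.load();
  return fn != nullptr ? fn(a, b) : InterpretedAdd(a, b);
}
```

In this sketch the codegen thread would call store(&CodegendAdd) once
compilation finishes. Because each pointer is checked independently, a mix of
set and unset pointers at any given moment is harmless.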

Asynchronous code generation can be turned on using the ASYNC_CODEGEN
boolean query option.

TODO: The default should be synchronous codegen for now.
TODO: Testing.
TODO: Benchmarks.

Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b
---
M be/src/benchmarks/hash-benchmark.cc
A be/src/codegen/codegen-fn-ptr.h
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/grouping-aggregator.cc
M be/src/exec/grouping-aggregator.h
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/non-grouping-aggregator.cc
M be/src/exec/non-grouping-aggregator.h
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/select-node.cc
M be/src/exec/select-node.h
M be/src/exec/topn-node.cc
M be/src/exec/topn-node.h
M be/src/exec/union-node.cc
M be/src/exec/union-node.h
M be/src/exprs/expr-codegen-test.cc
M be/src/exprs/scalar-expr.cc
M be/src/exprs/scalar-expr.h
M be/src/exprs/scalar-expr.inline.h
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/scalar-fn-call.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/runtime-state.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/tuple-row-compare.cc
M be/src/util/tuple-row-compare.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M tests/query_test/test_queries.py
46 files changed, 521 insertions(+), 229 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/16
--
To view, visit http://gerrit.cloudera.org:8080/15105

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b
Gerrit-Change-Number: 15105
Gerrit-PatchSet: 16
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6360: Don't show full query statement on Impala webUI by default. Added the ‘query_stmt_size’ flag to impala-server.cc with default value of 250 and modified the ‘ImpalaHttpHand

2020-02-27 Thread Tamas Mate (Code Review)
Tamas Mate has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15288 )

Change subject: IMPALA-6360: Don't show full query statement on Impala webUI by 
default. Added the ‘query_stmt_size’ flag to impala-server.cc with default 
value of 250 and modified the ‘ImpalaHttpHandler::QueryStateToJson()’ to 
truncate the end of the statements if they
..


Patch Set 4:

(4 comments)

Hi Tamas, thank you for the change.
Just a few nits.

http://gerrit.cloudera.org:8080/#/c/15288/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15288/4//COMMIT_MSG@7
PS4, Line 7: IMPALA-6360: Don't show full query statement on Impala webUI by default.
           : Added the ‘query_stmt_size’ flag to impala-server.cc with default value
nit: missing blank line between subject and body
nit: period is not needed at the end of subject


http://gerrit.cloudera.org:8080/#/c/15288/4/be/src/service/impala-http-handler.cc
File be/src/service/impala-http-handler.cc:

http://gerrit.cloudera.org:8080/#/c/15288/4/be/src/service/impala-http-handler.cc@374
PS4, Line 374:   Value stmt((FLAGS_query_stmt_size) ? (tmp_stmt.length() > FLAGS_query_stmt_size) ?
             :   tmp_stmt.substr(0, FLAGS_query_stmt_size).append("...").c_str() : tmp_stmt.c_str() :
             :   tmp_stmt.c_str(), document->GetAllocator());
This is a bit hard to read at first. If we changed the first ternary operator
to an if statement, the code would document itself better and the comment could
be removed, similar to line 401 in this file.
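
One possible shape for this, as a sketch only: TruncateStmt is a hypothetical
helper rather than code from the patch, and max_len stands in for
FLAGS_query_stmt_size, with 0 meaning "do not truncate" as in the quoted
expression.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper. 'max_len' plays the role of the query_stmt_size flag;
// a value of 0 disables truncation, mirroring the outer ternary above.
std::string TruncateStmt(const std::string& stmt, size_t max_len) {
  if (max_len == 0) return stmt;
  if (stmt.length() <= max_len) return stmt;
  // Keep the first 'max_len' characters and mark the cut with "...".
  return stmt.substr(0, max_len) + "...";
}
```

The result could then be passed to the Value constructor in a single call,
which keeps the branching out of the argument list.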


http://gerrit.cloudera.org:8080/#/c/15288/4/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/15288/4/be/src/service/impala-server.cc@151
PS4, Line 151: has
nit: have


http://gerrit.cloudera.org:8080/#/c/15288/4/tests/webserver/test_web_pages.py
File tests/webserver/test_web_pages.py:

http://gerrit.cloudera.org:8080/#/c/15288/4/tests/webserver/test_web_pages.py@425
PS4, Line 425: lacus at risus bibendum, id pulvinar ligula lobortis. Fusce lacinia nibh in
             : volutpat iaculis. Cras vite dignissim ligula. Fusce sollici.Proin bibendum erat
             : eu libero iaculis pharetra. Duis efficitur lacus at risus bibendum, id pulvinar
             : ligula lobortis. Fusce lacinia nibh in volutpat iaculis. Cras vite dignissim
             : ligula. Fusce sollici.\
nit: missing indentation



--
To view, visit http://gerrit.cloudera.org:8080/15288

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib7109a0be5d1022b4f8d6e72441cf5dc1dc42605
Gerrit-Change-Number: 15288
Gerrit-PatchSet: 4
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Thu, 27 Feb 2020 08:41:03 +
Gerrit-HasComments: Yes