[Impala-ASF-CR] IMPALA-6966: sort table memory by size in catalogd web UI

2018-05-03 Thread Quanlong Huang (Code Review)
Quanlong Huang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10292


Change subject: IMPALA-6966: sort table memory by size in catalogd web UI
..

IMPALA-6966: sort table memory by size in catalogd web UI

This patch fix the sorting order in "Top-K Tables with Highest
Memory Requirements" in which "Estimated memory" column is sorted
as strings.

Values got from the catalog-server are changed from pretty-printed
strings to bytes numbers. So the web UI is able to sort and render
them correctly.

Change-Id: I60dc253f862f5fde6fa96147f114d8765bb31a85
---
M be/src/catalog/catalog-server.cc
M www/catalog.tmpl
2 files changed, 17 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/10292/1
--
To view, visit http://gerrit.cloudera.org:8080/10292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I60dc253f862f5fde6fa96147f114d8765bb31a85
Gerrit-Change-Number: 10292
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 


[Impala-ASF-CR] IMPALA-4123 (prep): Parquet column reader cleanup

2018-05-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9799 )

Change subject: IMPALA-4123 (prep): Parquet column reader cleanup
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2403/


--
To view, visit http://gerrit.cloudera.org:8080/9799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc00352df3a0b2d605f872ae7e43db2dc90faab1
Gerrit-Change-Number: 9799
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 03 May 2018 15:22:48 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-6560: fix regression test for IMPALA-2376

2018-05-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10272


Change subject: IMPALA-6560: fix regression test for IMPALA-2376
..

IMPALA-6560: fix regression test for IMPALA-2376

The test is modified to increase the size of collections allocated.
num_nodes and mt_dop query options are set to make execution as
deterministic as possible.

I looped the test overnight to try to flush out flakiness.

Adds support for row_regex lines in CATCH sections so that we can
match a larger part of the error message.

Change-Id: I024cb6b57647902b1735defb885cd095fd99738c
Reviewed-on: http://gerrit.cloudera.org:8080/9681
Reviewed-by: Tim Armstrong 
Tested-by: Tim Armstrong 
---
M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test
M tests/common/impala_test_suite.py
2 files changed, 23 insertions(+), 11 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/10272/2
--
To view, visit http://gerrit.cloudera.org:8080/10272
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I024cb6b57647902b1735defb885cd095fd99738c
Gerrit-Change-Number: 10272
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR](2.x) IMPALA-6587: free buffers before ScanRange::Cancel() returns

2018-05-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10271


Change subject: IMPALA-6587: free buffers before ScanRange::Cancel() returns
..

IMPALA-6587: free buffers before ScanRange::Cancel() returns

ScanRange::Cancel() now waits until an in-flight read finishes so
that the disk I/O buffer being processed by the disk thread is
freed when Cancel() returns.

The fix is to set a 'read_in_flight_' flag on the scan range
while the disk thread is doing the read. Cancel() blocks until
read_in_flight_ == false.

The code is refactored to move more logic into ScanRange and
to avoid holding RequestContext::lock_ for longer than necessary.

Testing:
Added query test that reproduces the issue.

Added a unit test and a stress option that reproduces the problem in a
targeted way.

Ran disk-io-mgr-stress test for a few hours. Ran it under TSAN and
inspected output to make sure there were no non-benign data races.

Change-Id: I87182b6bd51b5fb0b923e7e4c8d08a44e7617db2
Reviewed-on: http://gerrit.cloudera.org:8080/9680
Reviewed-by: Tim Armstrong 
Tested-by: Tim Armstrong 
---
M be/src/common/global-flags.cc
M be/src/runtime/io/disk-io-mgr-internal.h
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M testdata/workloads/functional-query/queries/QueryTest/scanners.test
11 files changed, 288 insertions(+), 174 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/10271/2
--
To view, visit http://gerrit.cloudera.org:8080/10271
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I87182b6bd51b5fb0b923e7e4c8d08a44e7617db2
Gerrit-Change-Number: 10271
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR](2.x) IMPALA-6679,IMPALA-6678: reduce scan reservation

2018-05-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10273 )

Change subject: IMPALA-6679,IMPALA-6678: reduce scan reservation
..

IMPALA-6679,IMPALA-6678: reduce scan reservation

This has two related changes.

IMPALA-6679: defer scanner reservation increases

When starting each scan range, check to see how big the initial scan
range is (the full thing for row-based formats, the footer for
Parquet) and determine whether more reservation would be useful.

For Parquet, base the ideal reservation on the actual column layout
of each file. This avoids reserving memory that we won't use for
the actual files that we're scanning. This also avoid the need to
estimate ideal reservation in the planner.

We also release scanner thread reservations above the minimum as
soon as threads complete, so that resources can be released slightly
earlier.

IMPALA-6678: estimate Parquet column size for reservation
-
This change also reduces reservation computed by the planner in certain
cases by estimating the on-disk size of column data based on stats. It
also reduces the default per-column reservation to 4MB since it appears
that < 8MB columns are generally common in practice and the method for
estimating column size is biased towards over-estimating. There are two
main cases to consider for the performance implications:
* Memory is available to improve query perf - if we underestimate, we
  can increase the reservation so we can do "efficient" 8MB I/Os for
  large columns.
* The ideal reservation is not available - query performance is affected
  because we can't overlap I/O and compute as much and may do smaller
  (probably 4MB I/Os). However, we should avoid pathological behaviour
  like tiny I/Os.

When stats are not available, we just default to reserving 4MB per
column, which typically is more memory than required. When stats are
available, the memory required can be reduced below when some heuristic
tell us with high confidence that the column data for most or all files
is smaller than 4MB.

The stats-based heuristic could reduce scan performance if both the
conservative heuristics significantly underestimate the column size
and memory is constrained such that we can't increase the scan
reservation at runtime (in which case the memory might be used by
a different operator or scanner thread).

Observability:
Added counters to track when threads were not spawned due to reservation
and to track when reservation increases are requested and denied. These
allow determining if performance may have been affected by memory
availability.

Testing:
Updated test_mem_usage_scaling.py memory requirements and added steps
to regenerate the requirements. Loops test for a while to flush out
flakiness.

Added targeted planner and query tests for reservation calculations and
increases.

Change-Id: Ifc80e05118a9eef72cac8e2308418122e3ee0842
Reviewed-on: http://gerrit.cloudera.org:8080/9757
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
Reviewed-on: http://gerrit.cloudera.org:8080/10273
---
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/hdfs-parquet-scanner-test.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
M testdata/bin/compute-table-stats.sh
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest

[Impala-ASF-CR](2.x) IMPALA-6587: free buffers before ScanRange::Cancel() returns

2018-05-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10271 )

Change subject: IMPALA-6587: free buffers before ScanRange::Cancel() returns
..

IMPALA-6587: free buffers before ScanRange::Cancel() returns

ScanRange::Cancel() now waits until an in-flight read finishes so
that the disk I/O buffer being processed by the disk thread is
freed when Cancel() returns.

The fix is to set a 'read_in_flight_' flag on the scan range
while the disk thread is doing the read. Cancel() blocks until
read_in_flight_ == false.

The code is refactored to move more logic into ScanRange and
to avoid holding RequestContext::lock_ for longer than necessary.

Testing:
Added query test that reproduces the issue.

Added a unit test and a stress option that reproduces the problem in a
targeted way.

Ran disk-io-mgr-stress test for a few hours. Ran it under TSAN and
inspected output to make sure there were no non-benign data races.

Change-Id: I87182b6bd51b5fb0b923e7e4c8d08a44e7617db2
Reviewed-on: http://gerrit.cloudera.org:8080/9680
Reviewed-by: Tim Armstrong 
Tested-by: Tim Armstrong 
Reviewed-on: http://gerrit.cloudera.org:8080/10271
---
M be/src/common/global-flags.cc
M be/src/runtime/io/disk-io-mgr-internal.h
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M testdata/workloads/functional-query/queries/QueryTest/scanners.test
11 files changed, 288 insertions(+), 174 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/10271
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: merged
Gerrit-Change-Id: I87182b6bd51b5fb0b923e7e4c8d08a44e7617db2
Gerrit-Change-Number: 10271
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR](2.x) IMPALA-4835: switch I/O buffers to buffer pool

2018-05-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10263


Change subject: IMPALA-4835: switch I/O buffers to buffer pool
..

IMPALA-4835: switch I/O buffers to buffer pool

This is the following squashed patches that were reverted.

I will fix the known issues with some follow-on patches.

==
IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

In preparation for switching the I/O mgr to the buffer pool, this
removes and cleans up a lot of code so that the switchover patch starts
from a cleaner slate.

* Remove the free buffer cache (which will be replaced by buffer pool's
  own caching).
* Make memory limit exceeded error checking synchronous (in anticipation
  of having to propagate buffer pool errors synchronously).
* Simplify error propagation - remove the (ineffectual) code that
  enqueued BufferDescriptors containing error statuses.
* Document locking scheme better in a few places, make it part of the
  function signature when it seemed reasonable.
* Move ReturnBuffer() to ScanRange, because it is intrinsically
  connected with the lifecycle of a scan range.
* Separate external ReturnBuffer() and internal CleanUpBuffer()
  interfaces - previously callers of ReturnBuffer() were fudging
  the num_buffers_in_reader accounting to make the external interface work.
* Eliminate redundant state in ScanRange: 'eosr_returned_' and
  'is_cancelled_'.
* Clarify the logic around calling Close() for the last
  BufferDescriptor.
  -> There appeared to be an implicit assumption that buffers would be
 freed in the order they were returned from the scan range, so that
 the "eos" buffer was returned last. Instead just count the number
 of outstanding buffers to detect the last one.
  -> Touching the is_cancelled_ field without holding a lock was hard to
 reason about - violated locking rules and it was unclear that it
 was race-free.
* Remove DiskIoMgr::Read() to simplify the interface. It is trivial to
  inline at the callsites.

This will probably regress performance somewhat because of the cache
removal, so my plan is to merge it around the same time as switching
the I/O mgr to allocate from the buffer pool. I'm keeping the patches
separate to make reviewing easier.

Testing:
* Ran exhaustive tests
* Ran the disk-io-mgr-stress-test overnight

==
IMPALA-4835: Part 2: Allocate scan range buffers upfront

This change is a step towards reserving memory for buffers from the
buffer pool and constraining per-scanner memory requirements. This
change restructures the DiskIoMgr code so that each ScanRange operates
with a fixed set of buffers that are allocated upfront and recycled as
the I/O mgr works through the ScanRange.

One major change is that ScanRanges get blocked when a buffer is not
available and get unblocked when a client returns a buffer via
ReturnBuffer(). I was able to remove the logic to maintain the
blocked_ranges_ list by instead adding a separate set with all ranges
that are active.

There is also some miscellaneous cleanup included - e.g. reducing the
amount of code devoted to maintaining counters and metrics.

One tricky part of the existing code was the it called
IssueInitialRanges() with empty lists of files and depended on
DiskIoMgr::AddScanRanges() to not check for cancellation in that case.
See IMPALA-6564/IMPALA-6588. I changed the logic to not try to issue
ranges for empty lists of files.

I plan to merge this along with the actual buffer pool switch, but
separated it out to allow review of the DiskIoMgr changes separate from
other aspects of the buffer pool switchover.

Testing:
* Ran core and exhaustive tests.

==
IMPALA-4835: Part 3: switch I/O buffers to buffer pool

This is the final patch to switch the Disk I/O manager to allocate all
buffer from the buffer pool and to reserve the buffers required for
a query upfront.

* The planner reserves enough memory to run a single scanner per
  scan node.
* The multi-threaded scan node must increase reservation before
  spinning up more threads.
* The scanner implementations must be careful to stay within their
  assigned reservation.

The row-oriented scanners were most straightforward, since they only
have a single scan range active at a time. A single I/O buffer is
sufficient to scan the whole file but more I/O buffers can improve I/O
throughput.

Parquet is more complex because it issues a scan range per column and
the sizes of the columns on disk are not known during planning. To
deal with this, the reservation in the frontend is based on a
heuristic involving the file size and # columns. The Parquet scanner
can then divvy up reservation to columns based on the size of column
data on disk.

I adjusted how the 'mem

[Impala-ASF-CR](2.x) IMPALA-4835: switch I/O buffers to buffer pool

2018-05-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10263 )

Change subject: IMPALA-4835: switch I/O buffers to buffer pool
..

IMPALA-4835: switch I/O buffers to buffer pool

This is the following squashed patches that were reverted.

I will fix the known issues with some follow-on patches.

==
IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

In preparation for switching the I/O mgr to the buffer pool, this
removes and cleans up a lot of code so that the switchover patch starts
from a cleaner slate.

* Remove the free buffer cache (which will be replaced by buffer pool's
  own caching).
* Make memory limit exceeded error checking synchronous (in anticipation
  of having to propagate buffer pool errors synchronously).
* Simplify error propagation - remove the (ineffectual) code that
  enqueued BufferDescriptors containing error statuses.
* Document locking scheme better in a few places, make it part of the
  function signature when it seemed reasonable.
* Move ReturnBuffer() to ScanRange, because it is intrinsically
  connected with the lifecycle of a scan range.
* Separate external ReturnBuffer() and internal CleanUpBuffer()
  interfaces - previously callers of ReturnBuffer() were fudging
  the num_buffers_in_reader accounting to make the external interface work.
* Eliminate redundant state in ScanRange: 'eosr_returned_' and
  'is_cancelled_'.
* Clarify the logic around calling Close() for the last
  BufferDescriptor.
  -> There appeared to be an implicit assumption that buffers would be
 freed in the order they were returned from the scan range, so that
 the "eos" buffer was returned last. Instead just count the number
 of outstanding buffers to detect the last one.
  -> Touching the is_cancelled_ field without holding a lock was hard to
 reason about - violated locking rules and it was unclear that it
 was race-free.
* Remove DiskIoMgr::Read() to simplify the interface. It is trivial to
  inline at the callsites.

This will probably regress performance somewhat because of the cache
removal, so my plan is to merge it around the same time as switching
the I/O mgr to allocate from the buffer pool. I'm keeping the patches
separate to make reviewing easier.

Testing:
* Ran exhaustive tests
* Ran the disk-io-mgr-stress-test overnight

==
IMPALA-4835: Part 2: Allocate scan range buffers upfront

This change is a step towards reserving memory for buffers from the
buffer pool and constraining per-scanner memory requirements. This
change restructures the DiskIoMgr code so that each ScanRange operates
with a fixed set of buffers that are allocated upfront and recycled as
the I/O mgr works through the ScanRange.

One major change is that ScanRanges get blocked when a buffer is not
available and get unblocked when a client returns a buffer via
ReturnBuffer(). I was able to remove the logic to maintain the
blocked_ranges_ list by instead adding a separate set with all ranges
that are active.

There is also some miscellaneous cleanup included - e.g. reducing the
amount of code devoted to maintaining counters and metrics.

One tricky part of the existing code was the it called
IssueInitialRanges() with empty lists of files and depended on
DiskIoMgr::AddScanRanges() to not check for cancellation in that case.
See IMPALA-6564/IMPALA-6588. I changed the logic to not try to issue
ranges for empty lists of files.

I plan to merge this along with the actual buffer pool switch, but
separated it out to allow review of the DiskIoMgr changes separate from
other aspects of the buffer pool switchover.

Testing:
* Ran core and exhaustive tests.

==
IMPALA-4835: Part 3: switch I/O buffers to buffer pool

This is the final patch to switch the Disk I/O manager to allocate all
buffer from the buffer pool and to reserve the buffers required for
a query upfront.

* The planner reserves enough memory to run a single scanner per
  scan node.
* The multi-threaded scan node must increase reservation before
  spinning up more threads.
* The scanner implementations must be careful to stay within their
  assigned reservation.

The row-oriented scanners were most straightforward, since they only
have a single scan range active at a time. A single I/O buffer is
sufficient to scan the whole file but more I/O buffers can improve I/O
throughput.

Parquet is more complex because it issues a scan range per column and
the sizes of the columns on disk are not known during planning. To
deal with this, the reservation in the frontend is based on a
heuristic involving the file size and # columns. The Parquet scanner
can then divvy up reservation to columns based on the size of column
data on disk.

I adjusted how

[Impala-ASF-CR](2.x) IMPALA-6560: fix regression test for IMPALA-2376

2018-05-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10272 )

Change subject: IMPALA-6560: fix regression test for IMPALA-2376
..

IMPALA-6560: fix regression test for IMPALA-2376

The test is modified to increase the size of collections allocated.
num_nodes and mt_dop query options are set to make execution as
deterministic as possible.

I looped the test overnight to try to flush out flakiness.

Adds support for row_regex lines in CATCH sections so that we can
match a larger part of the error message.

Change-Id: I024cb6b57647902b1735defb885cd095fd99738c
Reviewed-on: http://gerrit.cloudera.org:8080/9681
Reviewed-by: Tim Armstrong 
Tested-by: Tim Armstrong 
Reviewed-on: http://gerrit.cloudera.org:8080/10272
---
M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test
M tests/common/impala_test_suite.py
2 files changed, 23 insertions(+), 11 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/10272
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: merged
Gerrit-Change-Id: I024cb6b57647902b1735defb885cd095fd99738c
Gerrit-Change-Number: 10272
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6948: Delete catalog update topic entries upon catalog restart

2018-05-03 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10289 )

Change subject: IMPALA-6948: Delete catalog update topic entries upon catalog 
restart
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10289/1/common/thrift/StatestoreService.thrift
File common/thrift/StatestoreService.thrift:

http://gerrit.cloudera.org:8080/#/c/10289/1/common/thrift/StatestoreService.thrift@128
PS1, Line 128:   7: optional bool clear_topic_entries
Isn't this what 'is_delta=false' should do?



--
To view, visit http://gerrit.cloudera.org:8080/10289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74a8ade8e498ac35cb56d3775d2c67a86988d9b6
Gerrit-Change-Number: 10289
Gerrit-PatchSet: 1
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Comment-Date: Thu, 03 May 2018 16:15:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-03 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10165 )

Change subject: IMPALA-6941: load more text scanner compression plugins
..


Patch Set 7:

(1 comment)

Thanks for applying the changes!

http://gerrit.cloudera.org:8080/#/c/10165/6/be/src/exec/hdfs-plugin-text-scanner.cc
File be/src/exec/hdfs-plugin-text-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/10165/6/be/src/exec/hdfs-plugin-text-scanner.cc@42
PS6, Line 42:
:
> I briefly summarized what the plugin does (versus the builtin text parsing)
I was only thinking about the format of the option string.

When I see a list parameter, I'm always wondering how to specify it, e.g. what 
separator character should I use. That's why I said that maybe we could mention 
in the help text, that it is comma-separated.

If you think it's not necessary because we always use comma-separated lists, 
then that's OK.



--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 03 May 2018 16:49:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6507: remove --disable mem pools debug feature

2018-05-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10294


Change subject: IMPALA-6507: remove --disable_mem_pools debug feature
..

IMPALA-6507: remove --disable_mem_pools debug feature

Save some maintenance overhead by simplifying memory
allocation code paths. ASAN poisoning provides the same general
functionality and is on by default.

Change-Id: I78062569448fed0d076ec506eb8e097ce44d9fea
---
M be/src/common/global-flags.cc
M be/src/runtime/bufferpool/buffer-allocator.cc
M be/src/runtime/free-pool.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/mem-pool.cc
5 files changed, 4 insertions(+), 41 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/10294/1
--
To view, visit http://gerrit.cloudera.org:8080/10294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I78062569448fed0d076ec506eb8e097ce44d9fea
Gerrit-Change-Number: 10294
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR](2.x) ignore decimal v2 doc commit

2018-05-03 Thread Michael Brown (Code Review)
Michael Brown has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10295


Change subject: ignore decimal_v2 doc commit
..

ignore decimal_v2 doc commit

167ed627febe5a10d8a4a7474a34efde1ac6f1c2 should have had the "not for
2.x" header. Ignore the commit.

Testing:

python -m json.tool

Change-Id: Ia5be1f9bdf3440b7a7baafbc048fa871040feafd
---
M bin/ignored_commits.json
1 file changed, 3 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/10295/1
--
To view, visit http://gerrit.cloudera.org:8080/10295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia5be1f9bdf3440b7a7baafbc048fa871040feafd
Gerrit-Change-Number: 10295
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Brown 


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-03 Thread Tim Armstrong (Code Review)
Hello Zoltan Borok-Nagy, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10165

to look at the new patch set (#8).

Change subject: IMPALA-6941: load more text scanner compression plugins
..

IMPALA-6941: load more text scanner compression plugins

Add extensions for LZ4 and ZSTD (which are supported by Hadoop).
Even without a plugin this results in better behaviour because
we don't try to treat the files with unknown extensions as
uncompressed text.

Also allow loading tables containing files with unsupported
compression types. There was weird behaviour before we knew
of the file extension but didn't support querying the table -
the catalog would load the table but the impalad would fail
processing the catalog update. The simplest way to fix it
is to just allow loading the tables.

Switch to always checking plugin version - running mismatched plugin
is inherently unsafe.

Testing:
Positive case where LZO is loaded is exercised. Added
coverage for negative case where LZO is disabled.

Fixed test gaps:
* Querying LZO table with LZO plugin not available.
* Interacting with tables with known but unsupported text
  compressions.
* Querying files with unknown compression suffixes (which are
  treated as uncompressed text).

Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
---
M be/src/common/global-flags.cc
M be/src/exec/CMakeLists.txt
D be/src/exec/hdfs-lzo-text-scanner.cc
D be/src/exec/hdfs-lzo-text-scanner.h
A be/src/exec/hdfs-plugin-text-scanner.cc
A be/src/exec/hdfs-plugin-text-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
M be/src/util/backend-gflag-util.cc
M common/fbs/CatalogObjects.fbs
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/HdfsCompression.java
M fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A testdata/workloads/functional-query/queries/QueryTest/disable-lzo-plugin.test
A 
testdata/workloads/functional-query/queries/QueryTest/unsupported-compression-partitions.test
A tests/custom_cluster/test_scanner_plugin.py
M tests/metadata/test_partition_metadata.py
20 files changed, 494 insertions(+), 239 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/10165/8
--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10165 )

Change subject: IMPALA-6941: load more text scanner compression plugins
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10165/6/be/src/exec/hdfs-plugin-text-scanner.cc
File be/src/exec/hdfs-plugin-text-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/10165/6/be/src/exec/hdfs-plugin-text-scanner.cc@42
PS6, Line 42: "(Advanced) whitelist of HDFS "
: "text scanner plugins that Impala will try to dynamically 
load."
> I was only thinking about the format of the option string.
Oh i totally overcomplicated that. Documented the expected format.



--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 03 May 2018 17:24:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR](2.x) ignore decimal v2 doc commit

2018-05-03 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10295 )

Change subject: ignore decimal_v2 doc commit
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5be1f9bdf3440b7a7baafbc048fa871040feafd
Gerrit-Change-Number: 10295
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Comment-Date: Thu, 03 May 2018 17:34:33 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-6908: IsConnResetTException() should include ECONNRESET

2018-05-03 Thread Sailesh Mukil (Code Review)
Hello Michael Ho, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10265

to look at the new patch set (#3).

Change subject: IMPALA-6908: IsConnResetTException() should include ECONNRESET
..

IMPALA-6908: IsConnResetTException() should include ECONNRESET

The utility function IsConnResetTException() attempted to match error
strings from RPCs that fail due to the remote end resetting the connection
for any reason. However, it did not take care of the situation where
ECONNRESET was sent on a cluster not using TLS.

This patch adds this case as well and adds a custom cluster fault
injection test to test the same.

Change-Id: I1bb997a833917e5166c9ca192da219f50f4439e2
---
M be/src/rpc/thrift-util.cc
M be/src/testutil/fault-injection-util.cc
M be/src/testutil/fault-injection-util.h
M tests/custom_cluster/test_rpc_exception.py
4 files changed, 31 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/10265/3
--
To view, visit http://gerrit.cloudera.org:8080/10265
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1bb997a833917e5166c9ca192da219f50f4439e2
Gerrit-Change-Number: 10265
Gerrit-PatchSet: 3
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR](2.x) IMPALA-6908: IsConnResetTException() should include ECONNRESET

2018-05-03 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10265 )

Change subject: IMPALA-6908: IsConnResetTException() should include ECONNRESET
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10265/1/be/src/rpc/thrift-util.cc
File be/src/rpc/thrift-util.cc:

http://gerrit.cloudera.org:8080/#/c/10265/1/be/src/rpc/thrift-util.cc@209
PS1, Line 209: strncmp(PACKAGE_VERSION, "0.9.0", 5) =
> Oh missed that earlier comment. We had some static_assert() from line 63-68
Done


http://gerrit.cloudera.org:8080/#/c/10265/1/be/src/rpc/thrift-util.cc@209
PS1, Line 209: strncmp(PACKAGE_VERSION, "0.9.0", 5) =
> How did the test pass to begin with ? Also, not sure if it warrants a strnc
I forgot to run it when I added the first strncmp() patchset, but it worked on 
the patchset before and after that one.



--
To view, visit http://gerrit.cloudera.org:8080/10265
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I1bb997a833917e5166c9ca192da219f50f4439e2
Gerrit-Change-Number: 10265
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 03 May 2018 17:41:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6946: handle negative counts in RLE decoder

2018-05-03 Thread Tim Armstrong (Code Review)
Hello Csaba Ringhofer,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10233

to look at the new patch set (#4).

Change subject: IMPALA-6946: handle negative counts in RLE decoder
..

IMPALA-6946: handle negative counts in RLE decoder

This improves the handling of out-of-range values to avoid hitting various
DCHECKs, including the one in the JIRA. repeat_count_ and literal_count_
are int32_ts. Avoid setting them to a negative value directly or by
integer overflow.

Switch to using uint32_t for "VLQ" encoding, which should be ULEB-128
encoding according to the Parquet standard. This fixes an infinite loop
in PutVlqInt() for negative values - the bug was that shifting right
sign-extended the signed value.

Testing:
Added backend test to exercise handling of large literal and repeat
counts that don't fit in an int32_t. Before these fixes it could trigger
several DCHECKs.

Change-Id: If75ef3fb12494209918c100f26407cd93b17addb
---
M be/src/util/bit-stream-utils.h
M be/src/util/bit-stream-utils.inline.h
M be/src/util/dict-encoding.h
M be/src/util/rle-encoding.h
M be/src/util/rle-test.cc
5 files changed, 83 insertions(+), 38 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/10233/4
--
To view, visit http://gerrit.cloudera.org:8080/10233
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If75ef3fb12494209918c100f26407cd93b17addb
Gerrit-Change-Number: 10233
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6946: handle negative counts in RLE decoder

2018-05-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10233 )

Change subject: IMPALA-6946: handle negative counts in RLE decoder
..


Patch Set 3:

My clarification to the parquet standard looks like it will be approved so 
updating the comment.


--
To view, visit http://gerrit.cloudera.org:8080/10233
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If75ef3fb12494209918c100f26407cd93b17addb
Gerrit-Change-Number: 10233
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 03 May 2018 17:49:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6946: handle negative counts in RLE decoder

2018-05-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10233 )

Change subject: IMPALA-6946: handle negative counts in RLE decoder
..


Patch Set 4: Code-Review+1

Carry +1


--
To view, visit http://gerrit.cloudera.org:8080/10233
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If75ef3fb12494209918c100f26407cd93b17addb
Gerrit-Change-Number: 10233
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 03 May 2018 17:49:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4123 (prep): Parquet column reader cleanup

2018-05-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9799 )

Change subject: IMPALA-4123 (prep): Parquet column reader cleanup
..


Patch Set 7: Code-Review+2

carry


--
To view, visit http://gerrit.cloudera.org:8080/9799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc00352df3a0b2d605f872ae7e43db2dc90faab1
Gerrit-Change-Number: 9799
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 03 May 2018 17:50:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db

2018-05-03 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9986 )

Change subject: IMPALA-3307: Add support for IANA time-zone db
..


Patch Set 2:

(55 comments)

http://gerrit.cloudera.org:8080/#/c/9986/2/be/src/benchmarks/convert-timestamp-benchmark.cc
File be/src/benchmarks/convert-timestamp-benchmark.cc:

http://gerrit.cloudera.org:8080/#/c/9986/2/be/src/benchmarks/convert-timestamp-benchmark.cc@150
PS2, Line 150: void AddTestDataDateTimes(vector& data, int n, 
const string& startstr) {
> Since it's in a benchmark it doesn't really matter, anyway, some nit commen
Done


http://gerrit.cloudera.org:8080/#/c/9986/2/be/src/benchmarks/convert-timestamp-benchmark.cc@172
PS2, Line 172: TestData
> This looks like a fairly general class to me that could move to util/benchm
'measure_multithreaded_elapsed_time' function is not that general, it is used 
here as a quick and dirty way to verify that glibc calls are executed in a 
serial fashion even in a multithreaded environment.  Because of that I would 
like to keep this class here.


http://gerrit.cloudera.org:8080/#/c/9986/2/be/src/benchmarks/convert-timestamp-benchmark.cc@174
PS2, Line 174: > >
> Nit: since C++11 you don't need to put spaces between right angle brackets.
Done


http://gerrit.cloudera.org:8080/#/c/9986/2/be/src/benchmarks/convert-timestamp-benchmark.cc@228
PS2, Line 228: const shared_ptr > data_
> Nit: I think using 'const vector&' would be simpler. I don't really s
True, I rewrote this portion of the code so many times that I lost track of 
where I was going with it :)


http://gerrit.cloudera.org:8080/#/c/9986/2/be/src/benchmarks/convert-timestamp-benchmark.cc@319
PS2, Line 319: boost_throw_if_date_out_of_range(local_time.date());
> Is this function call really needed here? Don't we trust boost that it vali
This is how the function was implemented originally. Since the point of this 
benchmark program is to compare the old implementation with the new one, I 
figured I shouldn't change the old code.

I think the call is necessary, because boost might not validate the date range 
until you call the gregorian::date accessors. It is confusing.


http://gerrit.cloudera.org:8080/#/c/9986/2/be/src/benchmarks/convert-timestamp-benchmark.cc@444
PS2, Line 444: time_t utc =
> We could replace this with boost_utc_to_unix_time. This conversion should b
This is how UtcToLocal was implemented originally. The goal of this benchmark 
program is to compare the original implementation with the new one (including 
the glue code).


http://gerrit.cloudera.org:8080/#/c/9986/2/be/src/benchmarks/convert-timestamp-benchmark.cc@530
PS2, Line 530: //
 : // Test UnixTimeToUtcPtime (boost is expected to be faster than 
CCTZ)
 : //
 :
 : // boost
 : boost::pos
> I think that this a bit misleading, as boost_unix_time_to_utc_ptime never u
Done


http://gerrit.cloudera.org:8080/#/c/9986/2/be/src/benchmarks/convert-timestamp-benchmark.cc@608
PS2, Line 608:   if (cctz_utc_to_unix_data.get_result() != 
glibc_utc_to_unix_data.get_result()) {
 : cerr << "cctz/glibc utc_to_unix results do not match!" << 
endl;
 : return 1;
 :   }
 :   if (boost_utc_to_unix_data.get_result() != 
glibc_utc_to_unix_data.get_result()) {
 : cerr << "boost/glibc utc_to_unix results do not match!" << 
endl;
 : return 1;
 :   }
> The other benchmarks don't need this validity check?
Done (although passing a vector of TestData to the helper function was not 
feasible as the different TestData classes are instantiated with a different 
converter functions, therefore they are not "compatible")


http://gerrit.cloudera.org:8080/#/c/9986/1/be/src/exec/data-source-scan-node.h
File be/src/exec/data-source-scan-node.h:

http://gerrit.cloudera.org:8080/#/c/9986/1/be/src/exec/data-source-scan-node.h@100
PS1, Line 100:   Status MaterializeNextRow(RuntimeState* state, MemPool* 
mem_pool, Tuple* tuple);
> What do you think about passing cctz::time_zone* instead of RuntimeState*?
Done


http://gerrit.cloudera.org:8080/#/c/9986/2/be/src/exprs/decimal-operators.h
File be/src/exprs/decimal-operators.h:

http://gerrit.cloudera.org:8080/#/c/9986/2/be/src/exprs/decimal-operators.h@168
PS2, Line 168:   /// local time in 'local_tz' time-zone). Rounds instead of 
truncating if 'round' is true.
> nit: long line
Done


http://gerrit.cloudera.org:8080/#/c/9986/1/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/9986/1/be/src/exprs/expr-test.cc@6398
PS1, Line 6398: const char* local_tz_name = "PST8PDT";
  : ScopedTimeZoneOverride time_zone(local_tz_name);
  : const cctz::time_zone* local_tz = 
TimezoneDatabase::FindTimezone(local_tz_name);
  : DCHECK(local_tz != nullptr);
> Have you c

[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db

2018-05-03 Thread Attila Jeges (Code Review)
Hello Gabor Kaszab, Zoltan Borok-Nagy, Csaba Ringhofer,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9986

to look at the new patch set (#3).

Change subject: IMPALA-3307: Add support for IANA time-zone db
..

IMPALA-3307: Add support for IANA time-zone db

Impala currently uses two different libraries for timestamp
manipulations: boost and glibc.

Issues with boost:
- Time-zone database is currently hard coded in timezone_db.cc.
  Impala admins cannot update it without upgrading Impala.
- Time-zone database is flat, therefore can’t track year-to-year
  changes.
- Time-zone database is not updated on a regular basis.

Issues with glibc:
- Uses /usr/share/zoneinfo/ database which could be out of sync on
  some of the nodes in the Impala cluster.
- Uses the host system’s local time-zone. Different nodes in the
  Impala cluster might use a different local time-zone.
- Conversion functions take a global lock, which causes severe
  performance degradation.

In addition to the issues above, the fact that /usr/share/zoneinfo/
and the hard-coded boost time-zone database are both in use is a
source of inconsistency in itself.

This patch makes the following changes:
- Instead of boost and glibc, impalad uses Google's CCTZ to implement
  time-zone conversions.
- Introduces a new startup flag (--hdfs_zone_info_dir) to impalad to
  specify an HDFS/S3/ADLS location that contains the shared compiled
  IANA time-zone database. If the startup flag is set, impalad will
  use the specified time-zone database. Otherwise, impalad will use
  the default /usr/share/zoneinfo time-zone database.
- impalad reads the entire time-zone database into an in-memory
  map on startup for fast lookups.
- The name of the coordinator node’s local time-zone is saved to the
  query context when preparing query execution. This time-zone is used
  whenever the current time-zone is referred afterwards in an
  execution node.
- Introduces a new startup flag (--hdfs_zone_abbrev_conf) to impalad
  to specify an HDFS/S3/ADLS path to a shared config file that
  contains definitions for non-standard time-zone abbreviations.

Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77
---
M CMakeLists.txt
M be/CMakeLists.txt
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/common/global-types.h
M be/src/exec/data-source-scan-node.cc
M be/src/exec/data-source-scan-node.h
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exprs/CMakeLists.txt
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/decimal-operators-ir.cc
M be/src/exprs/decimal-operators.h
M be/src/exprs/expr-test.cc
M be/src/exprs/literal.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
A be/src/exprs/timezone_db-test.cc
M be/src/exprs/timezone_db.cc
M be/src/exprs/timezone_db.h
M be/src/runtime/raw-value-test.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M be/src/runtime/timestamp-value.inline.h
M be/src/service/impala-server.cc
M be/src/service/impalad-main.cc
M be/src/util/filesystem-util-test.cc
M be/src/util/filesystem-util.cc
M be/src/util/filesystem-util.h
M be/src/util/hdfs-util-test.cc
M be/src/util/hdfs-util.cc
M be/src/util/hdfs-util.h
M be/src/util/time-test.cc
M be/src/util/time.cc
M be/src/util/time.h
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
A cmake_modules/FindCctz.cmake
M common/thrift/ImpalaInternalService.thrift
M common/thrift/metrics.json
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
M testdata/bin/create-load-data.sh
M testdata/data/timezoneverification.csv
A testdata/tzdb/abbrev.conf
A testdata/tzdb/zoneinfo/AmerICA/ArgeNTINA/MendOZA
A testdata/tzdb/zoneinfo/AmerICA/CancUN
A testdata/tzdb/zoneinfo/UTC
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
A tests/custom_cluster/custom_tzdb.py
53 files changed, 2,542 insertions(+), 1,093 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/9986/3
--
To view, visit http://gerrit.cloudera.org:8080/9986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77
Gerrit-Change-Number: 9986
Gerrit-PatchSet: 3
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6961: [DOCS] Doc --enable minidump flag to disable minidumps

2018-05-03 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10285 )

Change subject: IMPALA-6961: [DOCS] Doc --enable_minidump flag to disable 
minidumps
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10285/3/docs/topics/impala_breakpad.xml
File docs/topics/impala_breakpad.xml:

http://gerrit.cloudera.org:8080/#/c/10285/3/docs/topics/impala_breakpad.xml@79
PS3, Line 79: of
: one or more Impala-related daemons to an empty 
string, and restart
: the corresponding services or daemons.
Can you make the two ways more consistent, e.g. make both "Set the ..." and 
then below mention that both require a restart of the relevant daemons.



--
To view, visit http://gerrit.cloudera.org:8080/10285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3412e36272cda0c1502d4643afcdbad01e9548a5
Gerrit-Change-Number: 10285
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 03 May 2018 18:17:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6959: [DOCS] Update to HAProxy configuration sample code

2018-05-03 Thread Alan Choi (Code Review)
Alan Choi has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10284 )

Change subject: IMPALA-6959: [DOCS] Update to HAProxy configuration sample code
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/10284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idff3aa9bbb58c1953cb7e9394ade01c7833c3a34
Gerrit-Change-Number: 10284
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alan Choi 
Gerrit-Comment-Date: Thu, 03 May 2018 18:18:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6961: [DOCS] Doc --enable minidump flag to disable minidumps

2018-05-03 Thread Alex Rodoni (Code Review)
Hello Lars Volker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10285

to look at the new patch set (#4).

Change subject: IMPALA-6961: [DOCS] Doc --enable_minidump flag to disable 
minidumps
..

IMPALA-6961: [DOCS] Doc --enable_minidump flag to disable minidumps

Change-Id: I3412e36272cda0c1502d4643afcdbad01e9548a5
---
M docs/topics/impala_breakpad.xml
1 file changed, 20 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/10285/4
--
To view, visit http://gerrit.cloudera.org:8080/10285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3412e36272cda0c1502d4643afcdbad01e9548a5
Gerrit-Change-Number: 10285
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-6961: [DOCS] Doc --enable minidump flag to disable minidumps

2018-05-03 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10285 )

Change subject: IMPALA-6961: [DOCS] Doc --enable_minidump flag to disable 
minidumps
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10285/3/docs/topics/impala_breakpad.xml
File docs/topics/impala_breakpad.xml:

http://gerrit.cloudera.org:8080/#/c/10285/3/docs/topics/impala_breakpad.xml@79
PS3, Line 79: of
: one or more Impala-related daemons to an empty 
string, and restart
: the corresponding services or daemons.
> Can you make the two ways more consistent, e.g. make both "Set the ..." and
Done



--
To view, visit http://gerrit.cloudera.org:8080/10285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3412e36272cda0c1502d4643afcdbad01e9548a5
Gerrit-Change-Number: 10285
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 03 May 2018 18:22:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6959: [DOCS] Update to HAProxy configuration sample code

2018-05-03 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10284 )

Change subject: IMPALA-6959: [DOCS] Update to HAProxy configuration sample code
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idff3aa9bbb58c1953cb7e9394ade01c7833c3a34
Gerrit-Change-Number: 10284
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alan Choi 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Comment-Date: Thu, 03 May 2018 18:22:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6959: [DOCS] Update to HAProxy configuration sample code

2018-05-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10284 )

Change subject: IMPALA-6959: [DOCS] Update to HAProxy configuration sample code
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-docs-submit/278/


--
To view, visit http://gerrit.cloudera.org:8080/10284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idff3aa9bbb58c1953cb7e9394ade01c7833c3a34
Gerrit-Change-Number: 10284
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alan Choi 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 03 May 2018 18:23:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6959: [DOCS] Update to HAProxy configuration sample code

2018-05-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10284 )

Change subject: IMPALA-6959: [DOCS] Update to HAProxy configuration sample code
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idff3aa9bbb58c1953cb7e9394ade01c7833c3a34
Gerrit-Change-Number: 10284
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alan Choi 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 03 May 2018 18:33:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6959: [DOCS] Update to HAProxy configuration sample code

2018-05-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10284 )

Change subject: IMPALA-6959: [DOCS] Update to HAProxy configuration sample code
..

IMPALA-6959: [DOCS] Update to HAProxy configuration sample code

- Changed to deprecated timeouts: contimeout, clitimeout, srvtimeout
- Changed the sample timeout values to more realistic values
- Added a note that actual timeout values should depend on
  the user cluster

Change-Id: Idff3aa9bbb58c1953cb7e9394ade01c7833c3a34
Reviewed-on: http://gerrit.cloudera.org:8080/10284
Reviewed-by: Alan Choi 
Reviewed-by: Alex Rodoni 
Tested-by: Impala Public Jenkins 
---
M docs/topics/impala_proxy.xml
1 file changed, 6 insertions(+), 3 deletions(-)

Approvals:
  Alan Choi: Looks good to me, but someone else must approve
  Alex Rodoni: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Idff3aa9bbb58c1953cb7e9394ade01c7833c3a34
Gerrit-Change-Number: 10284
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alan Choi 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR](2.x) ignore decimal v2 doc commit

2018-05-03 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10295 )

Change subject: ignore decimal_v2 doc commit
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5be1f9bdf3440b7a7baafbc048fa871040feafd
Gerrit-Change-Number: 10295
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Thu, 03 May 2018 18:39:20 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) ignore decimal v2 doc commit

2018-05-03 Thread Michael Brown (Code Review)
Michael Brown has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10295 )

Change subject: ignore decimal_v2 doc commit
..

ignore decimal_v2 doc commit

167ed627febe5a10d8a4a7474a34efde1ac6f1c2 should have had the "not for
2.x" header. Ignore the commit.

Testing:

python -m json.tool

Change-Id: Ia5be1f9bdf3440b7a7baafbc048fa871040feafd
Reviewed-on: http://gerrit.cloudera.org:8080/10295
Reviewed-by: Alex Rodoni 
Tested-by: Michael Brown 
---
M bin/ignored_commits.json
1 file changed, 3 insertions(+), 1 deletion(-)

Approvals:
  Alex Rodoni: Looks good to me, approved
  Michael Brown: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia5be1f9bdf3440b7a7baafbc048fa871040feafd
Gerrit-Change-Number: 10295
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-3833: Fix invalid data handling in Sequence and RCFile scanners

2018-05-03 Thread Pranay Singh (Code Review)
Pranay Singh has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8936 )

Change subject: IMPALA-3833: Fix invalid data handling in Sequence and RCFile 
scanners
..


Patch Set 15:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8936/15/be/src/exec/hdfs-rcfile-scanner.cc
File be/src/exec/hdfs-rcfile-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8936/15/be/src/exec/hdfs-rcfile-scanner.cc@91
PS15, Line 91: excceeded
> typo: exceeded
Done


http://gerrit.cloudera.org:8080/#/c/8936/15/be/src/exec/hdfs-rcfile-scanner.cc@545
PS15, Line 545: if (max_tuples < 0) {
> It seems like this is catching the problem too late. How does it get into t
Done


http://gerrit.cloudera.org:8080/#/c/8936/15/be/src/runtime/timestamp-parse-util.cc
File be/src/runtime/timestamp-parse-util.cc:

http://gerrit.cloudera.org:8080/#/c/8936/15/be/src/runtime/timestamp-parse-util.cc@370
PS15, Line 370:   if (UNLIKELY(str == NULL || len <= 0)) {
> We're already checking len here but the problem is that we're trimming whit
Introduced a DCHECK_GT(len, 0) and a check in the caller.


http://gerrit.cloudera.org:8080/#/c/8936/15/be/src/runtime/timestamp-parse-util.cc@477
PS15, Line 477: if (lazy_len > 0 && ParseFormatTokensByStr(&lazy_ctx)) {
> Looks like you hit upon a different bug here - I can trigger that DCHECK wi
I have removed the check



--
To view, visit http://gerrit.cloudera.org:8080/8936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9cfc38af3f30c65ada9734eb471dbfa6ecdd74a
Gerrit-Change-Number: 8936
Gerrit-PatchSet: 15
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Thu, 03 May 2018 18:44:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3833: Fix invalid data handling in Sequence and RCFile scanners

2018-05-03 Thread Pranay Singh (Code Review)
Hello anujphadke, Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8936

to look at the new patch set (#16).

Change subject: IMPALA-3833: Fix invalid data handling in Sequence and RCFile 
scanners
..

IMPALA-3833: Fix invalid data handling in Sequence and RCFile scanners

Introduced new error message when scanning a corrupt Sequence or RCFile.
Added new checks to detect buffer overrun while handling Sequence or RCFile.

Testing:
  a) Made changes to fuzz test for RCFile/Sequence file, ran fuzz test in a loop
  with 200 iteration without failure.

  b) Ran exhaustive test on the changes without failure.

Change-Id: Ic9cfc38af3f30c65ada9734eb471dbfa6ecdd74a
---
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-sequence-scanner.h
M be/src/exec/read-write-util-test.cc
M be/src/exec/read-write-util.h
M be/src/exec/scanner-context.inline.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/util/decompress.cc
M tests/query_test/test_scanners_fuzz.py
10 files changed, 259 insertions(+), 73 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/8936/16
--
To view, visit http://gerrit.cloudera.org:8080/8936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic9cfc38af3f30c65ada9734eb471dbfa6ecdd74a
Gerrit-Change-Number: 8936
Gerrit-PatchSet: 16
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-3833: Fix invalid data handling in Sequence and RCFile scanners

2018-05-03 Thread Pranay Singh (Code Review)
Pranay Singh has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8936 )

Change subject: IMPALA-3833: Fix invalid data handling in Sequence and RCFile 
scanners
..


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8936/15/be/src/exec/hdfs-rcfile-scanner.cc
File be/src/exec/hdfs-rcfile-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8936/15/be/src/exec/hdfs-rcfile-scanner.cc@588
PS15, Line 588: }
> I think we should catch the invalid column metadata when we read it in GetC
Since the end of column is calculated here I am checking it here.



--
To view, visit http://gerrit.cloudera.org:8080/8936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9cfc38af3f30c65ada9734eb471dbfa6ecdd74a
Gerrit-Change-Number: 8936
Gerrit-PatchSet: 16
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Thu, 03 May 2018 18:55:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6866: Rework timeouts for test exchange delays.py

2018-05-03 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10208 )

Change subject: IMPALA-6866: Rework timeouts for test_exchange_delays.py
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10208/1/tests/custom_cluster/test_exchange_delays.py
File tests/custom_cluster/test_exchange_delays.py:

http://gerrit.cloudera.org:8080/#/c/10208/1/tests/custom_cluster/test_exchange_delays.py@a29
PS1, Line 29:
> Removing these two means test_exchange_small_delay and test_exchange_large_
This is intentional. The symptom that caused us to fix IMPALA-6811 had to do 
with the zero rows case. The other tests were working fine with their existing 
timeouts. Splitting this out allows the zero rows case to get the timeout it 
needs and these can go back to what they were.



--
To view, visit http://gerrit.cloudera.org:8080/10208
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e919a4e502b1e6a4156aafbbe4b5ddfe679ed89
Gerrit-Change-Number: 10208
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Thu, 03 May 2018 18:58:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3833: Fix invalid data handling in Sequence and RCFile scanners

2018-05-03 Thread Pranay Singh (Code Review)
Hello anujphadke, Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8936

to look at the new patch set (#17).

Change subject: IMPALA-3833: Fix invalid data handling in Sequence and RCFile 
scanners
..

IMPALA-3833: Fix invalid data handling in Sequence and RCFile scanners

Introduced new error message when scanning a corrupt Sequence or RCFile.
Added new checks to detect buffer overrun while handling Sequence or RCFile.

Testing:
  a) Made changes to fuzz test for RCFile/Sequence file, ran fuzz test in a loop
  with 200 iteration without failure.

  b) Ran exhaustive test on the changes without failure.

Change-Id: Ic9cfc38af3f30c65ada9734eb471dbfa6ecdd74a
---
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-sequence-scanner.h
M be/src/exec/read-write-util-test.cc
M be/src/exec/read-write-util.h
M be/src/exec/scanner-context.inline.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/util/decompress.cc
M tests/query_test/test_scanners_fuzz.py
10 files changed, 258 insertions(+), 72 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/8936/17
--
To view, visit http://gerrit.cloudera.org:8080/8936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic9cfc38af3f30c65ada9734eb471dbfa6ecdd74a
Gerrit-Change-Number: 8936
Gerrit-PatchSet: 17
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-6866: Rework timeouts for test exchange delays.py

2018-05-03 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10208 )

Change subject: IMPALA-6866: Rework timeouts for test_exchange_delays.py
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10208
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e919a4e502b1e6a4156aafbbe4b5ddfe679ed89
Gerrit-Change-Number: 10208
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Thu, 03 May 2018 19:02:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6948: Delete catalog update topic entries upon catalog restart

2018-05-03 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10289 )

Change subject: IMPALA-6948: Delete catalog update topic entries upon catalog 
restart
..


Patch Set 1:

(1 comment)

still looking ...

http://gerrit.cloudera.org:8080/#/c/10289/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10289/1//COMMIT_MSG@22
PS1, Line 22: full topic update
Prior to 5058, if coordinators would receive a delta instead of full update for 
the catalog restart case, then this case will add additional load (network, 
etc.). Correct?



--
To view, visit http://gerrit.cloudera.org:8080/10289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74a8ade8e498ac35cb56d3775d2c67a86988d9b6
Gerrit-Change-Number: 10289
Gerrit-PatchSet: 1
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 03 May 2018 19:17:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4123 (prep): Parquet column reader cleanup

2018-05-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9799 )

Change subject: IMPALA-4123 (prep): Parquet column reader cleanup
..

IMPALA-4123 (prep): Parquet column reader cleanup

Some miscellaneous cleanup to make it easier to understand and
make future changes to the Parquet scanner.

A lot of the refactoring is about more cleanly separating functions
so that they have clearer purpose, e.g.:
* Functions that strictly do decoding, i.e. materialize values, convert
  and validate them. These are changed to operate on single values, not tuples.
* Functions that are used for the non-batched decoding path (i.e. driven
  by CollectionColumnReader or BoolColumnReader).
* Functions that dispatch to a templated implementation based on one or
  more runtime values.

Other misc changes:
* Move large functions out of class bodies.
* Use parquet::Encoding instead of bool to indicate encoding.
* Add some additional DCHECKs.

Testing:
* Ran exhaustive tests
* Ran fuzz test in a loop

Change-Id: Ibc00352df3a0b2d605f872ae7e43db2dc90faab1
Reviewed-on: http://gerrit.cloudera.org:8080/9799
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/runtime/tuple.h
M be/src/util/bit-stream-utils.h
M be/src/util/rle-encoding.h
6 files changed, 482 insertions(+), 398 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ibc00352df3a0b2d605f872ae7e43db2dc90faab1
Gerrit-Change-Number: 9799
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4123 (prep): Parquet column reader cleanup

2018-05-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9799 )

Change subject: IMPALA-4123 (prep): Parquet column reader cleanup
..


Patch Set 7: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc00352df3a0b2d605f872ae7e43db2dc90faab1
Gerrit-Change-Number: 9799
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 03 May 2018 19:18:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6948: Delete catalog update topic entries upon catalog restart

2018-05-03 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10289 )

Change subject: IMPALA-6948: Delete catalog update topic entries upon catalog 
restart
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10289/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10289/1//COMMIT_MSG@22
PS1, Line 22: full topic update
> Prior to 5058, if coordinators would receive a delta instead of full update
I don't think so because the "full topic update" after a catalogd restart only 
contains table skeletons and no partition/block information.



--
To view, visit http://gerrit.cloudera.org:8080/10289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74a8ade8e498ac35cb56d3775d2c67a86988d9b6
Gerrit-Change-Number: 10289
Gerrit-PatchSet: 1
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 03 May 2018 19:33:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6949: Add the option to start the minicluster with EC enabled

2018-05-03 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10275 )

Change subject: IMPALA-6949: Add the option to start the minicluster with EC 
enabled
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10275/3/testdata/bin/setup-hdfs-env.sh
File testdata/bin/setup-hdfs-env.sh:

http://gerrit.cloudera.org:8080/#/c/10275/3/testdata/bin/setup-hdfs-env.sh@80
PS3, Line 80: HDFS_EC_PATH
This should be HDFS_ERASURECODE_PATH



--
To view, visit http://gerrit.cloudera.org:8080/10275
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I397aed491354be21b0a8441ca671232dca25146c
Gerrit-Change-Number: 10275
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Thu, 03 May 2018 20:35:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6961: [DOCS] Doc --enable minidump flag to disable minidumps

2018-05-03 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10285 )

Change subject: IMPALA-6961: [DOCS] Doc --enable_minidump flag to disable 
minidumps
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10285/4/docs/topics/impala_breakpad.xml
File docs/topics/impala_breakpad.xml:

http://gerrit.cloudera.org:8080/#/c/10285/4/docs/topics/impala_breakpad.xml@80
PS4, Line 80: o
typo



--
To view, visit http://gerrit.cloudera.org:8080/10285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3412e36272cda0c1502d4643afcdbad01e9548a5
Gerrit-Change-Number: 10285
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 03 May 2018 20:36:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6961: [DOCS] Doc --enable minidump flag to disable minidumps

2018-05-03 Thread Alex Rodoni (Code Review)
Hello Lars Volker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10285

to look at the new patch set (#5).

Change subject: IMPALA-6961: [DOCS] Doc --enable_minidump flag to disable 
minidumps
..

IMPALA-6961: [DOCS] Doc --enable_minidump flag to disable minidumps

Change-Id: I3412e36272cda0c1502d4643afcdbad01e9548a5
---
M docs/topics/impala_breakpad.xml
1 file changed, 20 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/10285/5
--
To view, visit http://gerrit.cloudera.org:8080/10285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3412e36272cda0c1502d4643afcdbad01e9548a5
Gerrit-Change-Number: 10285
Gerrit-PatchSet: 5
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-6961: [DOCS] Doc --enable minidump flag to disable minidumps

2018-05-03 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10285 )

Change subject: IMPALA-6961: [DOCS] Doc --enable_minidump flag to disable 
minidumps
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10285/4/docs/topics/impala_breakpad.xml
File docs/topics/impala_breakpad.xml:

http://gerrit.cloudera.org:8080/#/c/10285/4/docs/topics/impala_breakpad.xml@80
PS4, Line 80: o
> typo
Done



--
To view, visit http://gerrit.cloudera.org:8080/10285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3412e36272cda0c1502d4643afcdbad01e9548a5
Gerrit-Change-Number: 10285
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 03 May 2018 20:39:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6961: [DOCS] Doc --enable minidump flag to disable minidumps

2018-05-03 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10285 )

Change subject: IMPALA-6961: [DOCS] Doc --enable_minidump flag to disable 
minidumps
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3412e36272cda0c1502d4643afcdbad01e9548a5
Gerrit-Change-Number: 10285
Gerrit-PatchSet: 5
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 03 May 2018 20:39:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6961: [DOCS] Doc --enable minidump flag to disable minidumps

2018-05-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10285 )

Change subject: IMPALA-6961: [DOCS] Doc --enable_minidump flag to disable 
minidumps
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-docs-submit/279/


--
To view, visit http://gerrit.cloudera.org:8080/10285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3412e36272cda0c1502d4643afcdbad01e9548a5
Gerrit-Change-Number: 10285
Gerrit-PatchSet: 5
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 03 May 2018 20:43:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6961: [DOCS] Doc --enable minidump flag to disable minidumps

2018-05-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10285 )

Change subject: IMPALA-6961: [DOCS] Doc --enable_minidump flag to disable 
minidumps
..

IMPALA-6961: [DOCS] Doc --enable_minidump flag to disable minidumps

Change-Id: I3412e36272cda0c1502d4643afcdbad01e9548a5
Reviewed-on: http://gerrit.cloudera.org:8080/10285
Reviewed-by: Lars Volker 
Tested-by: Impala Public Jenkins 
---
M docs/topics/impala_breakpad.xml
1 file changed, 20 insertions(+), 4 deletions(-)

Approvals:
  Lars Volker: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3412e36272cda0c1502d4643afcdbad01e9548a5
Gerrit-Change-Number: 10285
Gerrit-PatchSet: 6
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-6961: [DOCS] Doc --enable minidump flag to disable minidumps

2018-05-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10285 )

Change subject: IMPALA-6961: [DOCS] Doc --enable_minidump flag to disable 
minidumps
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3412e36272cda0c1502d4643afcdbad01e9548a5
Gerrit-Change-Number: 10285
Gerrit-PatchSet: 5
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 03 May 2018 20:45:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6949: Add the option to start the minicluster with EC enabled

2018-05-03 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10275 )

Change subject: IMPALA-6949: Add the option to start the minicluster with EC 
enabled
..


Patch Set 3: Code-Review+2

(5 comments)

For what you're doing right now, I think this is very close. A few nits below, 
but feel free to carry the +2 once you add the comments and one fail-fast that 
I asked for.

Do any of the 3-node EC policies make any sense to test always? i.e., we're 
creating a new test axis here, but it's nice to test at least a little bit in 
the "regular" test axis.

http://gerrit.cloudera.org:8080/#/c/10275/2/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/10275/2/bin/impala-config.sh@451
PS2, Line 451:   if [[ "${ERASURE_CODING}" = true ]]; then
I think we should error out (or at least warn) if


http://gerrit.cloudera.org:8080/#/c/10275/3/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/10275/3/bin/impala-config.sh@450
PS3, Line 450: elif [ "${TARGET_FILESYSTEM}" == "hdfs" ]; then
nit: we seem to use single = here (line below as well as line 434, etc.), so 
may as well keep doing it?


http://gerrit.cloudera.org:8080/#/c/10275/3/bin/impala-config.sh@451
PS3, Line 451:   if [[ "${ERASURE_CODING}" = true ]]; then
I think we should error out or warn here if MINICLUSTER_PROFILE < 3. I.e., we 
know this doesn't make sense for Hadoop 2, and we should error out early.


http://gerrit.cloudera.org:8080/#/c/10275/3/testdata/bin/setup-hdfs-env.sh
File testdata/bin/setup-hdfs-env.sh:

http://gerrit.cloudera.org:8080/#/c/10275/3/testdata/bin/setup-hdfs-env.sh@80
PS3, Line 80: HDFS_EC_PATH
> This should be HDFS_ERASURECODE_PATH
Please add:

# This does not convert the existing files but affects new files created in 
this directory.

That said--what's the point? Does flipping back and forth make any sense or are 
users just going to get the wrong thing, because they really need to redo data 
load, and the only normal way to do that is -format?


http://gerrit.cloudera.org:8080/#/c/10275/2/testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
File testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl:

http://gerrit.cloudera.org:8080/#/c/10275/2/testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl@26
PS2, Line 26: cloudera.erasure_coding.enabled
> This is needed because running the following command
Add:





--
To view, visit http://gerrit.cloudera.org:8080/10275
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I397aed491354be21b0a8441ca671232dca25146c
Gerrit-Change-Number: 10275
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Thu, 03 May 2018 21:27:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6802 (part 2): Clean up authorization tests

2018-05-03 Thread Adam Holley (Code Review)
Adam Holley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10269 )

Change subject: IMPALA-6802 (part 2): Clean up authorization tests
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10269/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java
File fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java:

http://gerrit.cloudera.org:8080/#/c/10269/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@134
PS2, Line 134: expectedPrivileges
nit: per conversation, can we rename to expectedAuthorizables? or 
expectedPrivilegeRequestNames?


http://gerrit.cloudera.org:8080/#/c/10269/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@549
PS2, Line 549: authorize("use functional")
There should be .ok() tests for onDatabase, onTable, and onColumn.



--
To view, visit http://gerrit.cloudera.org:8080/10269
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ica5bfd2f12bf0cc8ebe78464a523a7805366f67b
Gerrit-Change-Number: 10269
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Comment-Date: Thu, 03 May 2018 21:30:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6337: Fix infinite loop in Impala shell

2018-05-03 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9195 )

Change subject: IMPALA-6337: Fix infinite loop in Impala shell
..


Patch Set 11:

If there's a bug in sqlparse, have we filed it with sqlparse and tried to fix 
it upstream?

I think David's concern is with having patches rather than upgrading. I think 
we can upgrade if we need to. Maintaining a "vendored"/shaded copy would be 
annoying but also possible if we absolutely need the pathes.


--
To view, visit http://gerrit.cloudera.org:8080/9195
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b
Gerrit-Change-Number: 9195
Gerrit-PatchSet: 11
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 03 May 2018 21:36:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6802 (part 2): Clean up authorization tests

2018-05-03 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10269 )

Change subject: IMPALA-6802 (part 2): Clean up authorization tests
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10269/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java
File fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java:

http://gerrit.cloudera.org:8080/#/c/10269/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@134
PS2, Line 134: expectedAuthorizab
> nit: per conversation, can we rename to expectedAuthorizables? or expectedP
Done


http://gerrit.cloudera.org:8080/#/c/10269/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@549
PS2, Line 549: AuthzTest test = authorize(
> There should be .ok() tests for onDatabase, onTable, and onColumn.
Done



--
To view, visit http://gerrit.cloudera.org:8080/10269
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ica5bfd2f12bf0cc8ebe78464a523a7805366f67b
Gerrit-Change-Number: 10269
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Comment-Date: Thu, 03 May 2018 21:44:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6802 (part 2): Clean up authorization tests

2018-05-03 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/10269 )

Change subject: IMPALA-6802 (part 2): Clean up authorization tests
..

IMPALA-6802 (part 2): Clean up authorization tests

The second part of this patch is to rewrite the following
authorization tests:
- insert
- truncate
- load
- use

Testing:
- Added new authorization tests
- Ran all front-end tests

Cherry-picks: not for 2.x

Change-Id: Ica5bfd2f12bf0cc8ebe78464a523a7805366f67b
---
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java
1 file changed, 322 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/10269/3
--
To view, visit http://gerrit.cloudera.org:8080/10269
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ica5bfd2f12bf0cc8ebe78464a523a7805366f67b
Gerrit-Change-Number: 10269
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 


[Impala-ASF-CR] IMPALA-6337: Fix infinite loop in Impala shell

2018-05-03 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9195 )

Change subject: IMPALA-6337: Fix infinite loop in Impala shell
..


Patch Set 11:

> Patch Set 11:
>
> If there's a bug in sqlparse, have we filed it with sqlparse and tried to fix 
> it upstream?
>
> I think David's concern is with having patches rather than upgrading. I think 
> we can upgrade if we need to. Maintaining a "vendored"/shaded copy would be 
> annoying but also possible if we absolutely need the pathes.

The fix is already upstream, but it's not released yet and it may require at 
minimum Python 2.7 for us to upgrade, which I don't know if it's an option for 
us. The sqlparse version that we use is pretty old.


--
To view, visit http://gerrit.cloudera.org:8080/9195
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b
Gerrit-Change-Number: 9195
Gerrit-PatchSet: 11
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 03 May 2018 21:45:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6948,6962: add end-to-end tests

2018-05-03 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10291 )

Change subject: IMPALA-6948,6962: add end-to-end tests
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10291/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10291/1//COMMIT_MSG@7
PS1, Line 7: IMPALA-6948,6962: add end-to-end tests
best practice is to spell out IMPALA-6962 so various tools can find it


http://gerrit.cloudera.org:8080/#/c/10291/1/tests/common/impala_service.py
File tests/common/impala_service.py:

http://gerrit.cloudera.org:8080/#/c/10291/1/tests/common/impala_service.py@132
PS1, Line 132:   def get_catalog_objects(self, excludes=['_impala_builtins']):
We also had issues with coordinator/impalad metadata coherency because of the 2 
code paths. Do we want to extend the comparison to cover that case? Fine to do 
that later.


http://gerrit.cloudera.org:8080/#/c/10291/1/tests/custom_cluster/test_metadata_replicas.py
File tests/custom_cluster/test_metadata_replicas.py:

http://gerrit.cloudera.org:8080/#/c/10291/1/tests/custom_cluster/test_metadata_replicas.py@42
PS1, Line 42: for obj in c_objects:
This is very expensive, do we really need all tables? How long does this test 
run?



--
To view, visit http://gerrit.cloudera.org:8080/10291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Gerrit-Change-Number: 10291
Gerrit-PatchSet: 1
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Comment-Date: Thu, 03 May 2018 22:17:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6507: remove --disable mem pools debug feature

2018-05-03 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10294 )

Change subject: IMPALA-6507: remove --disable_mem_pools debug feature
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I78062569448fed0d076ec506eb8e097ce44d9fea
Gerrit-Change-Number: 10294
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Comment-Date: Thu, 03 May 2018 22:50:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6507: remove --disable mem pools debug feature

2018-05-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10294 )

Change subject: IMPALA-6507: remove --disable_mem_pools debug feature
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2409/


--
To view, visit http://gerrit.cloudera.org:8080/10294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I78062569448fed0d076ec506eb8e097ce44d9fea
Gerrit-Change-Number: 10294
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 03 May 2018 23:08:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6968: Fix TestBlockVerification flakiness

2018-05-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10301


Change subject: IMPALA-6968: Fix TestBlockVerification flakiness
..

IMPALA-6968: Fix TestBlockVerification flakiness

The bug is that the byte in the encrypted data is '?' around 1/256 runs
of the test. Instead, flip a bit in the original data so that it's
always different from the input.

Change-Id: Ibdf063ff32848035af667c7cd2a1268f5b785cfe
---
M be/src/runtime/tmp-file-mgr-test.cc
1 file changed, 14 insertions(+), 7 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/10301/1
--
To view, visit http://gerrit.cloudera.org:8080/10301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibdf063ff32848035af667c7cd2a1268f5b785cfe
Gerrit-Change-Number: 10301
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6968: Fix TestBlockVerification flakiness

2018-05-03 Thread Tim Armstrong (Code Review)
Hello Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10301

to look at the new patch set (#2).

Change subject: IMPALA-6968: Fix TestBlockVerification flakiness
..

IMPALA-6968: Fix TestBlockVerification flakiness

The bug is that the byte in the encrypted data is '?' around 1/256 runs
of the test. Instead, flip a bit in the original data so that it's
always different from the input.

Change-Id: Ibdf063ff32848035af667c7cd2a1268f5b785cfe
---
M be/src/runtime/tmp-file-mgr-test.cc
1 file changed, 7 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/10301/2
--
To view, visit http://gerrit.cloudera.org:8080/10301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibdf063ff32848035af667c7cd2a1268f5b785cfe
Gerrit-Change-Number: 10301
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6968: Fix TestBlockVerification flakiness

2018-05-03 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10301 )

Change subject: IMPALA-6968: Fix TestBlockVerification flakiness
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10301/2/be/src/runtime/tmp-file-mgr-test.cc
File be/src/runtime/tmp-file-mgr-test.cc:

http://gerrit.cloudera.org:8080/#/c/10301/2/be/src/runtime/tmp-file-mgr-test.cc@545
PS2, Line 545: data[0]
Just to clarify, we expect the in memory 'data' string to also be encrypted at 
this point, right?

Otherwise, we might run into the same problem as before.



--
To view, visit http://gerrit.cloudera.org:8080/10301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibdf063ff32848035af667c7cd2a1268f5b785cfe
Gerrit-Change-Number: 10301
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 03 May 2018 23:20:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6968: Fix TestBlockVerification flakiness

2018-05-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10301 )

Change subject: IMPALA-6968: Fix TestBlockVerification flakiness
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10301/2/be/src/runtime/tmp-file-mgr-test.cc
File be/src/runtime/tmp-file-mgr-test.cc:

http://gerrit.cloudera.org:8080/#/c/10301/2/be/src/runtime/tmp-file-mgr-test.cc@545
PS2, Line 545: data[0]
> Just to clarify, we expect the in memory 'data' string to also be encrypted
Yeah that's correct, we encrypt in-place in-memory then start the write to 
disk. RestoreData() decrypts in place.



--
To view, visit http://gerrit.cloudera.org:8080/10301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibdf063ff32848035af667c7cd2a1268f5b785cfe
Gerrit-Change-Number: 10301
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 03 May 2018 23:23:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6968: Fix TestBlockVerification flakiness

2018-05-03 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10301 )

Change subject: IMPALA-6968: Fix TestBlockVerification flakiness
..


Patch Set 2: Code-Review+2

(1 comment)

Great find! Thanks for fixing it so quickly.

http://gerrit.cloudera.org:8080/#/c/10301/2/be/src/runtime/tmp-file-mgr-test.cc
File be/src/runtime/tmp-file-mgr-test.cc:

http://gerrit.cloudera.org:8080/#/c/10301/2/be/src/runtime/tmp-file-mgr-test.cc@545
PS2, Line 545: data[0]
> Yeah that's correct, we encrypt in-place in-memory then start the write to
Thanks for the clarification.



--
To view, visit http://gerrit.cloudera.org:8080/10301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibdf063ff32848035af667c7cd2a1268f5b785cfe
Gerrit-Change-Number: 10301
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 03 May 2018 23:23:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6337: Fix infinite loop in Impala shell

2018-05-03 Thread David Knupp (Code Review)
David Knupp has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9195 )

Change subject: IMPALA-6337: Fix infinite loop in Impala shell
..


Patch Set 11:

> Patch Set 11:
>
> > Patch Set 11:
> >
> > If there's a bug in sqlparse, have we filed it with sqlparse and tried to 
> > fix it upstream?
> >
> > I think David's concern is with having patches rather than upgrading. I 
> > think we can upgrade if we need to. Maintaining a "vendored"/shaded copy 
> > would be annoying but also possible if we absolutely need the pathes.
>
> The fix is already upstream, but it's not released yet and it may require at 
> minimum Python 2.7 for us to upgrade, which I don't know if it's an option 
> for us. The sqlparse version that we use is pretty old.

Phil's right about my concerns. I further questions would be:

- how much longer will we realistically commit ourselves to being compatible 
with python 2.6?
- if we in fact are going to bundle a "vendored" copy of sqlparse, should we 
base it on the version we assume elsewhere -- 0.1.19 (the last version to 
support python 2.6)?


--
To view, visit http://gerrit.cloudera.org:8080/9195
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b
Gerrit-Change-Number: 9195
Gerrit-PatchSet: 11
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 03 May 2018 23:28:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6968: Fix TestBlockVerification flakiness

2018-05-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10301 )

Change subject: IMPALA-6968: Fix TestBlockVerification flakiness
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2410/


--
To view, visit http://gerrit.cloudera.org:8080/10301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibdf063ff32848035af667c7cd2a1268f5b785cfe
Gerrit-Change-Number: 10301
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 03 May 2018 23:30:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6946: handle negative counts in RLE decoder

2018-05-03 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10233 )

Change subject: IMPALA-6946: handle negative counts in RLE decoder
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10233/4/be/src/util/rle-encoding.h
File be/src/util/rle-encoding.h:

http://gerrit.cloudera.org:8080/#/c/10233/4/be/src/util/rle-encoding.h@606
PS4, Line 606: vlq
nit: uleb


http://gerrit.cloudera.org:8080/#/c/10233/4/be/src/util/rle-encoding.h@625
PS4, Line 625: DCHECK_LE(run_len, std::numeric_limits::max())
Can this go to L615, right after reading run_len? If not, can you add a comment 
to explain why?


http://gerrit.cloudera.org:8080/#/c/10233/4/be/src/util/rle-test.cc
File be/src/util/rle-test.cc:

http://gerrit.cloudera.org:8080/#/c/10233/4/be/src/util/rle-test.cc@514
PS4, Line 514: // IMPALA-6946 hit DCHECK because it treated -1 as a signed 
integer.
I don't understand how the comment relates to the next two lines. Can you 
clarify?



--
To view, visit http://gerrit.cloudera.org:8080/10233
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If75ef3fb12494209918c100f26407cd93b17addb
Gerrit-Change-Number: 10233
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 03 May 2018 23:45:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6337: Fix infinite loop in Impala shell

2018-05-03 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9195 )

Change subject: IMPALA-6337: Fix infinite loop in Impala shell
..


Patch Set 11:

> how much longer will we realistically commit ourselves to being compatible 
> with python 2.6?

Fore the foreseeable future. I'm aware of many folks using Impala on RH6 and 
friends and planning to stay there.


--
To view, visit http://gerrit.cloudera.org:8080/9195
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b
Gerrit-Change-Number: 9195
Gerrit-PatchSet: 11
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 03 May 2018 23:59:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6337: Fix infinite loop in Impala shell

2018-05-03 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9195 )

Change subject: IMPALA-6337: Fix infinite loop in Impala shell
..


Patch Set 11:

Fredy,

Can you leave pointers in the patch code to indicate that it's fixed upstream 
(and where-ish)?


--
To view, visit http://gerrit.cloudera.org:8080/9195
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b
Gerrit-Change-Number: 9195
Gerrit-PatchSet: 11
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 04 May 2018 00:00:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4025: Part 1: Add percentile disc aggregation function

2018-05-03 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/9777 )

Change subject: IMPALA-4025: Part 1: Add percentile_disc aggregation function
..

IMPALA-4025: Part 1: Add percentile_disc aggregation function

This patch adds aggregation function percentile_disc. The implementation
is rewriting it into an inline view. The inline view computes the row
number on the ordering expr using analytic functions. The parent query
then picks the desired row using aggregation.
An Example of such rewrite is in StmtRewriter.java.

The behavior of this function is mostly the same as in PostgreSQL. The
handling of percentile expr not in [0, 1] is different: PostgreSQL
throws an error and impala returns NULL.

Some FE and EE tests are added. EE tests not related to the above
difference are verified against PostgreSQL.

Change-Id: Iacef7b3fcd74c4c73d88400ce27307c3baa0121e
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
A fe/src/main/java/org/apache/impala/analysis/PercentileAggExpr.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
23 files changed, 1,261 insertions(+), 105 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/9777/6
--
To view, visit http://gerrit.cloudera.org:8080/9777
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iacef7b3fcd74c4c73d88400ce27307c3baa0121e
Gerrit-Change-Number: 9777
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-4025: Part 2: Add percentile cont & median aggregation functions

2018-05-03 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/9778 )

Change subject: IMPALA-4025: Part 2: Add percentile_cont & median aggregation 
functions
..

IMPALA-4025: Part 2: Add percentile_cont & median aggregation functions

percentile_cont is implemented in the similar way as percentile_disc,
except for using a BE custom aggregation function for interpolating the
final result. median is rewritten into percentile_cont(0.5).

Some EE tests are added. Tests not related to error handling are
verified against PostgreSQL.

Change-Id: I2cc184682bb1bf4a5011b69a89e9ae253f3fd88d
---
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/aggregate-functions.h
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/PercentileAggExpr.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
11 files changed, 243 insertions(+), 42 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/9778/5
--
To view, visit http://gerrit.cloudera.org:8080/9778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2cc184682bb1bf4a5011b69a89e9ae253f3fd88d
Gerrit-Change-Number: 9778
Gerrit-PatchSet: 5
Gerrit-Owner: Tianyi Wang 


[Impala-ASF-CR] IMPALA-6972: Disable parallel dataload on

2018-05-03 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10306


Change subject: IMPALA-6972: Disable parallel dataload on
..

IMPALA-6972: Disable parallel dataload on

There is a Hive bug in Hive 1.1.0 that can result
in a NullPointerException when doing parallel Hive
operations (see IMPALA-6532). Since dataload goes
parallel on Hive loads starting with IMPALA-6372,
dataload can hit this error on Hive 1.1.0 (i.e.
IMPALA_MINICLUSTER_PROFILE=2). This is impacting
builds on the 2.x branch.

This disables parallel dataload for IMPALA_MINICLUSTER_PROFILE=2.

IMPALA_MINICLUSTER_PROFILE=3 uses a newer version
of Hive that has a fix for this, so this continues
to use parallel dataload for that case.

Parallelism can be reenabled when Hive 1.1.0 gets the
fix from Hive 2.1.1.

Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
---
M bin/impala-config.sh
M bin/load-data.py
M testdata/bin/create-load-data.sh
3 files changed, 10 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/10306/1
--
To view, visit http://gerrit.cloudera.org:8080/10306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
Gerrit-Change-Number: 10306
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-6949: Add the option to start the minicluster with EC enabled

2018-05-03 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10275 )

Change subject: IMPALA-6949: Add the option to start the minicluster with EC 
enabled
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10275/3/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/10275/3/bin/impala-config.sh@450
PS3, Line 450: elif [ "${TARGET_FILESYSTEM}" == "hdfs" ]; then
> nit: we seem to use single = here (line below as well as line 434, etc.), s
I'm trying this out and found that == don't work with zsh.



--
To view, visit http://gerrit.cloudera.org:8080/10275
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I397aed491354be21b0a8441ca671232dca25146c
Gerrit-Change-Number: 10275
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Fri, 04 May 2018 00:38:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6948,6962: add end-to-end tests

2018-05-03 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10291 )

Change subject: IMPALA-6948,6962: add end-to-end tests
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10291/1/tests/common/impala_service.py
File tests/common/impala_service.py:

http://gerrit.cloudera.org:8080/#/c/10291/1/tests/common/impala_service.py@132
PS1, Line 132:   def get_catalog_objects(self, excludes=['_impala_builtins']):
> We also had issues with coordinator/impalad metadata coherency because of t
currently, I just grab the version so it wouldn't cover the case you're 
referring to.
the objects come out as thrift->html or as json with the thrift as a debug 
string so I'd want to make a better api. but yes, would be good to cover those 
cases.


http://gerrit.cloudera.org:8080/#/c/10291/1/tests/custom_cluster/test_metadata_replicas.py
File tests/custom_cluster/test_metadata_replicas.py:

http://gerrit.cloudera.org:8080/#/c/10291/1/tests/custom_cluster/test_metadata_replicas.py@42
PS1, Line 42: for obj in c_objects:
> This is very expensive, do we really need all tables? How long does this te
L42-47 takes ~7-8 s. validation on L49 takes ~2-3 s.
total test takes ~19 s (starting/stopping cluster, etc).

For simplicity, I look at all objects. I can whitelist both steps if this is 
too expensive.



--
To view, visit http://gerrit.cloudera.org:8080/10291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Gerrit-Change-Number: 10291
Gerrit-PatchSet: 1
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 04 May 2018 00:59:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6802 (part 2): Clean up authorization tests

2018-05-03 Thread Adam Holley (Code Review)
Adam Holley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10269 )

Change subject: IMPALA-6802 (part 2): Clean up authorization tests
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10269/3/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java
File fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java:

http://gerrit.cloudera.org:8080/#/c/10269/3/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@563
PS3, Line 563: .ok(onDatabase("functional", privilege))
 :   .ok(onTable("functional", "alltypes", privilege))
 :   .ok(onColumn("functional", "alltypes", "id", 
privilege));
I think the additional levels are unnecessary for default, but if they are then 
the privileges should be on default not functional.  Or is the thinking that 
the default privileges might not work if there are any other privilege granted?


http://gerrit.cloudera.org:8080/#/c/10269/3/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@573
PS3, Line 573:   .ok(onDatabase("functional", privilege))
 :   .ok(onTable("functional", "alltypes", privilege))
 :   .ok(onColumn("functional", "alltypes", "id", 
privilege));
Same as above.



--
To view, visit http://gerrit.cloudera.org:8080/10269
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ica5bfd2f12bf0cc8ebe78464a523a7805366f67b
Gerrit-Change-Number: 10269
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Comment-Date: Fri, 04 May 2018 01:12:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6949: Add the option to start the minicluster with EC enabled

2018-05-03 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10275 )

Change subject: IMPALA-6949: Add the option to start the minicluster with EC 
enabled
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/10275/3/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/10275/3/bin/impala-config.sh@450
PS3, Line 450: elif [ "${TARGET_FILESYSTEM}" == "hdfs" ]; then
> I'm trying this out and found that == don't work with zsh.
Done


http://gerrit.cloudera.org:8080/#/c/10275/3/bin/impala-config.sh@451
PS3, Line 451:   if [[ "${ERASURE_CODING}" = true ]]; then
> I think we should error out or warn here if MINICLUSTER_PROFILE < 3. I.e.,
Done


http://gerrit.cloudera.org:8080/#/c/10275/3/testdata/bin/setup-hdfs-env.sh
File testdata/bin/setup-hdfs-env.sh:

http://gerrit.cloudera.org:8080/#/c/10275/3/testdata/bin/setup-hdfs-env.sh@80
PS3, Line 80: HDFS_EC_PATH
> Please add:
I agree. I don't think it makes sense to unset it here, because it's not going 
to do anything. Removed.


http://gerrit.cloudera.org:8080/#/c/10275/2/testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
File testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl:

http://gerrit.cloudera.org:8080/#/c/10275/2/testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl@26
PS2, Line 26: cloudera.erasure_coding.enabled
> Add:
Done



--
To view, visit http://gerrit.cloudera.org:8080/10275
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I397aed491354be21b0a8441ca671232dca25146c
Gerrit-Change-Number: 10275
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Fri, 04 May 2018 01:40:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6949: Add the option to start the minicluster with EC enabled

2018-05-03 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/10275 )

Change subject: IMPALA-6949: Add the option to start the minicluster with EC 
enabled
..

IMPALA-6949: Add the option to start the minicluster with EC enabled

In this patch we add the "ERASURE_CODING" enviornment variable. If we
enable it, a cluster with 5 data nodes will be created during data
loading and HDFS will be started with erasure coding enabled.

Testing:
I ran the core build, and verified that erasure coding gets enabled in
HDFS. Many of our EE tests failed however.

Change-Id: I397aed491354be21b0a8441ca671232dca25146c
---
M bin/impala-config.sh
M bin/run-all-tests.sh
M testdata/bin/create-load-data.sh
M testdata/bin/setup-hdfs-env.sh
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
6 files changed, 41 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/10275/4
--
To view, visit http://gerrit.cloudera.org:8080/10275
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I397aed491354be21b0a8441ca671232dca25146c
Gerrit-Change-Number: 10275
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-6802 (part 2): Clean up authorization tests

2018-05-03 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10269 )

Change subject: IMPALA-6802 (part 2): Clean up authorization tests
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10269/3/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java
File fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java:

http://gerrit.cloudera.org:8080/#/c/10269/3/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@563
PS3, Line 563:
 : // Use a non-existent database.
 : authorize("use nodb").error(accessError("nodb.*.*"));
> I think the additional levels are unnecessary for default, but if they are
Done


http://gerrit.cloudera.org:8080/#/c/10269/3/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@573
PS3, Line 573: .ok(onServer(TPrivilegeLevel.INSERT))
 : .ok(onDatabase("functional", TPrivilegeLevel.ALL))
 : .ok(onDatabase("functional", TPrivilegeLevel.INSERT))
> Same as above.
Done



--
To view, visit http://gerrit.cloudera.org:8080/10269
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ica5bfd2f12bf0cc8ebe78464a523a7805366f67b
Gerrit-Change-Number: 10269
Gerrit-PatchSet: 4
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Comment-Date: Fri, 04 May 2018 02:41:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6802 (part 2): Clean up authorization tests

2018-05-03 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/10269 )

Change subject: IMPALA-6802 (part 2): Clean up authorization tests
..

IMPALA-6802 (part 2): Clean up authorization tests

The second part of this patch is to rewrite the following
authorization tests:
- insert
- truncate
- load
- use

Testing:
- Added new authorization tests
- Ran all front-end tests

Cherry-picks: not for 2.x

Change-Id: Ica5bfd2f12bf0cc8ebe78464a523a7805366f67b
---
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java
1 file changed, 308 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/10269/4
--
To view, visit http://gerrit.cloudera.org:8080/10269
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ica5bfd2f12bf0cc8ebe78464a523a7805366f67b
Gerrit-Change-Number: 10269
Gerrit-PatchSet: 4
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 


[Impala-ASF-CR] IMPALA-6337: Fix infinite loop in Impala shell

2018-05-03 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#12). ( 
http://gerrit.cloudera.org:8080/9195 )

Change subject: IMPALA-6337: Fix infinite loop in Impala shell
..

IMPALA-6337: Fix infinite loop in Impala shell

This patch fixes a bug in sqlparse where sqlparse incorrectly splits a
statement that has a new line inside double quotes. The bug in sqlparse
causes Impala shell to go to infinite loop when a statement contains a
new line inside double quotes.

The patch in sqlparse is based on the upstream fix at
https://github.com/andialbrecht/sqlparse/pull/396

Testing:
- Added new end-to-end shell tests
- Ran end-to-end shell tests

Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b
---
M shell/ext-py/sqlparse-0.1.14/sqlparse/lexer.py
M shell/ext-py/sqlparse-0.1.14/tests/test_split.py
M tests/shell/test_shell_interactive.py
3 files changed, 25 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/9195/12
-- 
To view, visit http://gerrit.cloudera.org:8080/9195
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b
Gerrit-Change-Number: 9195
Gerrit-PatchSet: 12
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-6337: Fix infinite loop in Impala shell

2018-05-03 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9195 )

Change subject: IMPALA-6337: Fix infinite loop in Impala shell
..


Patch Set 12:

> Patch Set 11:
>
> Fredy,
>
> Can you leave pointers in the patch code to indicate that it's fixed upstream 
> (and where-ish)?

Created a new patchseet with the indication that the issue is fixed upstream 
and a link to the fix.


--
To view, visit http://gerrit.cloudera.org:8080/9195
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b
Gerrit-Change-Number: 9195
Gerrit-PatchSet: 12
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 04 May 2018 02:48:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6507: remove --disable mem pools debug feature

2018-05-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10294 )

Change subject: IMPALA-6507: remove --disable_mem_pools debug feature
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I78062569448fed0d076ec506eb8e097ce44d9fea
Gerrit-Change-Number: 10294
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 04 May 2018 02:53:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6507: remove --disable mem pools debug feature

2018-05-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10294 )

Change subject: IMPALA-6507: remove --disable_mem_pools debug feature
..

IMPALA-6507: remove --disable_mem_pools debug feature

Save some maintenance overhead by simplifying memory
allocation code paths. ASAN poisoning provides the same general
functionality and is on by default.

Change-Id: I78062569448fed0d076ec506eb8e097ce44d9fea
Reviewed-on: http://gerrit.cloudera.org:8080/10294
Reviewed-by: Dan Hecht 
Tested-by: Impala Public Jenkins 
---
M be/src/common/global-flags.cc
M be/src/runtime/bufferpool/buffer-allocator.cc
M be/src/runtime/free-pool.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/mem-pool.cc
5 files changed, 4 insertions(+), 41 deletions(-)

Approvals:
  Dan Hecht: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I78062569448fed0d076ec506eb8e097ce44d9fea
Gerrit-Change-Number: 10294
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-6968: Fix TestBlockVerification flakiness

2018-05-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10301 )

Change subject: IMPALA-6968: Fix TestBlockVerification flakiness
..

IMPALA-6968: Fix TestBlockVerification flakiness

The bug is that the byte in the encrypted data is '?' around 1/256 runs
of the test. Instead, flip a bit in the original data so that it's
always different from the input.

Change-Id: Ibdf063ff32848035af667c7cd2a1268f5b785cfe
Reviewed-on: http://gerrit.cloudera.org:8080/10301
Reviewed-by: Sailesh Mukil 
Tested-by: Impala Public Jenkins 
---
M be/src/runtime/tmp-file-mgr-test.cc
1 file changed, 7 insertions(+), 3 deletions(-)

Approvals:
  Sailesh Mukil: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ibdf063ff32848035af667c7cd2a1268f5b785cfe
Gerrit-Change-Number: 10301
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6968: Fix TestBlockVerification flakiness

2018-05-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10301 )

Change subject: IMPALA-6968: Fix TestBlockVerification flakiness
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibdf063ff32848035af667c7cd2a1268f5b785cfe
Gerrit-Change-Number: 10301
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 04 May 2018 03:10:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests

2018-05-03 Thread Vuk Ercegovac (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10291

to look at the new patch set (#2).

Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests
..

IMPALA-6948,IMPALA-6962: add end-to-end tests

Adds end-to-end tests to validate that following
various metadata operations, the catalog state
in catalogd and impalads is the same.

For IMPALA-6962, catalogd process restart for tests
is fixed.

Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
---
M tests/common/impala_cluster.py
M tests/common/impala_service.py
A tests/custom_cluster/test_metadata_replicas.py
M tests/metadata/test_hms_integration.py
A tests/util/hive_utils.py
5 files changed, 254 insertions(+), 68 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/10291/2
--
To view, visit http://gerrit.cloudera.org:8080/10291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Gerrit-Change-Number: 10291
Gerrit-PatchSet: 2
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests

2018-05-03 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10291 )

Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests
..


Patch Set 2:

(1 comment)

several changes:
- downgraded several lines to python 2.6
- ran into trouble with unique_database fixture + custom cluster when
  explicitly killing catalogd so I factored out a wrapper from another test
  that handles setup/cleanup.

http://gerrit.cloudera.org:8080/#/c/10291/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10291/1//COMMIT_MSG@7
PS1, Line 7: IMPALA-6948,IMPALA-6962: add end-to-end tests
> best practice is to spell out IMPALA-6962 so various tools can find it
Done



--
To view, visit http://gerrit.cloudera.org:8080/10291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Gerrit-Change-Number: 10291
Gerrit-PatchSet: 2
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 04 May 2018 05:15:36 +
Gerrit-HasComments: Yes