[Impala-ASF-CR] IMPALA-7051: Serialize Maven invocations.

2018-05-18 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10460


Change subject: IMPALA-7051: Serialize Maven invocations.
..

IMPALA-7051: Serialize Maven invocations.

I've observed some rare cases where Impala fails to build. I believe
it's because two Maven targets (yarn-extras and ext-data-source) are
being executed simultaneously. Maven's handling of ~/.m2/repository,
for example, is known to be not safe.

This patch serializes the Maven builds with the following
dependency graph:
  fe -> yarn-extras -> ext-data-source -> impala-parent
The ordering of yarn-extras -> ext-data-source is arbitrary.

I decided that this artificial dependency was the clearest
way to prevent parallel executions. Having mvn-quiet.sh
take a lock seemed considerably more complex.

Change-Id: Ie24f34f421bc7dcf9140938464d43400da95275e
---
M common/yarn-extras/CMakeLists.txt
1 file changed, 4 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/10460/1
--
To view, visit http://gerrit.cloudera.org:8080/10460
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie24f34f421bc7dcf9140938464d43400da95275e
Gerrit-Change-Number: 10460
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls

2018-05-18 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8523 )

Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls
..


Patch Set 14: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8523
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
Gerrit-Change-Number: 8523
Gerrit-PatchSet: 14
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Sat, 19 May 2018 03:02:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10165 )

Change subject: IMPALA-6941: load more text scanner compression plugins
..


Patch Set 13:

This change did not cherrypick successfully into branch 2.x. To resolve this, 
please do the cherry-pick manually and submit it to Gerrit at refs/for/2.x or 
add an exception to the branch 2.x copy of bin/ignored_commits.json. Thanks, 
your friendly bot at https://jenkins.impala.io/job/cherrypick-2.x-and-test/518/ 
.


--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Sat, 19 May 2018 01:57:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6317: Add -cmake only option to buildall.sh

2018-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10455 )

Change subject: IMPALA-6317: Add -cmake_only option to buildall.sh
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2511/


--
To view, visit http://gerrit.cloudera.org:8080/10455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If31a4e29425a6a20059cba2f43b72e4fb908018f
Gerrit-Change-Number: 10455
Gerrit-PatchSet: 3
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 19 May 2018 01:50:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls

2018-05-18 Thread Vuk Ercegovac (Code Review)
Hello Lars Volker, Dimitris Tsirogiannis, Alex Behm, Mostafa Mokhtar, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8523

to look at the new patch set (#14).

Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls
..

IMPALA-5931: Generates scan ranges in planner for s3/adls

Currently, for filesystems that do not include physical
block information (e.g., block replica locations, caching),
synthetic blocks are generated and stored in the catalog
when metadata is loaded. Example file systems for which this is done
includes S3, ADLS, and local fs.

This change avoids generating these blocks when metadata is loaded.
Instead, scan ranges are directly generated from such files by the
backend coordinator. Previously, all scan ranges were produced by
the planner in HDFSScanNode in the frontend. Now, those files without
block information are sent to the coordinator represented by a split
specification that determines how the coordinator will create scan ranges
to send to executors.

This change reduces the space needed in the catalog and reduces the
scan range data structures that are passed from the frontend to the
backend when planning and coordinating a query.
In addition a bug is avoided where non-splittable files were being
split anyways to support the query parameter that places a limit on
scan ranges.

Testing:
- added backend scheduler tests
- mixed-filesystems test covers tables/queries with multiple fs's.
- local fs tests cover the code paths in this change
- all core tests pass when configured with s3
- manually tried larger local filesystem tables (tpch) with multiple
  partitions and observed the same scan ranges.
  - TODO: adls testing

Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
---
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler-test-util.h
M be/src/scheduling/scheduler-test.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/util/CMakeLists.txt
A be/src/util/flat_buffer.cc
A be/src/util/flat_buffer.h
M common/thrift/Frontend.thrift
M common/thrift/PlanNodes.thrift
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
21 files changed, 676 insertions(+), 241 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8523/14
--
To view, visit http://gerrit.cloudera.org:8080/8523
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
Gerrit-Change-Number: 8523
Gerrit-PatchSet: 14
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls

2018-05-18 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8523 )

Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls
..


Patch Set 13:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8523/13/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/8523/13/be/src/scheduling/scheduler.cc@302
PS13, Line 302: for (const TScanRangeLocationList& range : 
entry.second.concrete_range) {
  :   expanded_locations.push_back(range);
  : }
> that could be just expanded_locations.insert(concrete_range.begin(), concre
Done


http://gerrit.cloudera.org:8080/#/c/8523/13/common/thrift/Planner.thrift
File common/thrift/Planner.thrift:

http://gerrit.cloudera.org:8080/#/c/8523/13/common/thrift/Planner.thrift@108
PS13, Line 108: concrete_range
> plural since the field is a list: concrete_ranges, split_specs
Done


http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
File fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java:

http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@224
PS13, Line 224: scanRangeSpecs_.getSplit_specSize()
> this seems wrong - a single spec may result in multiple hosts, no?
removed. this class never adds specs, but added a precondition check in init in 
the event that's changed for some reason.


http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1126
PS13, Line 1126: scanRangeSpecs_.getSplit_specSize();
> shouldn't that do some calculation based on file size and blocks size and i
good catch. needed to change other places that directly use the size of the 
spec list.



--
To view, visit http://gerrit.cloudera.org:8080/8523
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
Gerrit-Change-Number: 8523
Gerrit-PatchSet: 13
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Sat, 19 May 2018 01:39:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7039: Ignore the port in HBase planner tests

2018-05-18 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10459


Change subject: IMPALA-7039: Ignore the port in HBase planner tests
..

IMPALA-7039: Ignore the port in HBase planner tests

Before this patch, we used to check the HBase port in the HBase planner
tests. This caused a failure when HBase was running on a different port
than expected. We fix the problem in this patch by not checking the
HBase port.

Testing: ran the FE tests and they passed.

Change-Id: I8eb7628061b2ebaf84323b37424925e9a64f70a0
---
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/hbase.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
3 files changed, 20 insertions(+), 23 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/10459/1
--
To view, visit http://gerrit.cloudera.org:8080/10459
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8eb7628061b2ebaf84323b37424925e9a64f70a0
Gerrit-Change-Number: 10459
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-7050: [DOCS] Document the max serialized incremental stat size setting

2018-05-18 Thread Alex Rodoni (Code Review)
Hello Balazs Jeszenszky, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10457

to look at the new patch set (#2).

Change subject: IMPALA-7050: [DOCS] Document the max serialized incremental 
stat size setting
..

IMPALA-7050: [DOCS] Document the max serialized incremental stat size setting

Change-Id: Ifa80325f0008d42a9cc8178e7c144fc2b49d7d4e
---
M docs/topics/impala_perf_stats.xml
M docs/topics/impala_shell_options.xml
2 files changed, 31 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/10457/2
--
To view, visit http://gerrit.cloudera.org:8080/10457
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifa80325f0008d42a9cc8178e7c144fc2b49d7d4e
Gerrit-Change-Number: 10457
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-7050: [DOCS] Document the max serialized incremental stat size setting

2018-05-18 Thread Alex Rodoni (Code Review)
Alex Rodoni has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10457


Change subject: IMPALA-7050: [DOCS] Document the max serialized incremental 
stat size setting
..

IMPALA-7050: [DOCS] Document the max serialized incremental stat size setting

Change-Id: Ifa80325f0008d42a9cc8178e7c144fc2b49d7d4e
---
M docs/topics/impala_perf_stats.xml
1 file changed, 23 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/10457/1
--
To view, visit http://gerrit.cloudera.org:8080/10457
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ifa80325f0008d42a9cc8178e7c144fc2b49d7d4e
Gerrit-Change-Number: 10457
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 


[Impala-ASF-CR] IMPALA-6070: Adding ASAN, --tail to test-with-docker.

2018-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10319 )

Change subject: IMPALA-6070: Adding ASAN, --tail to test-with-docker.
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I51451cdf1352fc0f9516d729b9a77700488d993f
Gerrit-Change-Number: 10319
Gerrit-PatchSet: 4
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Sat, 19 May 2018 00:37:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6070: Adding ASAN, --tail to test-with-docker.

2018-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10319 )

Change subject: IMPALA-6070: Adding ASAN, --tail to test-with-docker.
..

IMPALA-6070: Adding ASAN, --tail to test-with-docker.

* Adds -ASAN suites to test-with-docker.
* Adds --tail flag, which starts a tail subprocess. This
  isn't pretty (there's potential for overlap), but it's a dead simple
  way to keep an eye on what's going on.
* Fixes a bug wherein I could call "docker rm " twice
  simultaneously, which would make Docker fail the second call,
  and then fail the related "docker rmi". It's better to serialize,
  and I did that with a simple lock.

Change-Id: I51451cdf1352fc0f9516d729b9a77700488d993f
Reviewed-on: http://gerrit.cloudera.org:8080/10319
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
---
M docker/entrypoint.sh
M docker/test-with-docker.py
2 files changed, 48 insertions(+), 7 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I51451cdf1352fc0f9516d729b9a77700488d993f
Gerrit-Change-Number: 10319
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-6035: Add query options to limit thread reservation

2018-05-18 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10365 )

Change subject: IMPALA-6035: Add query options to limit thread reservation
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10365/6/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/10365/6/be/src/scheduling/admission-controller.cc@419
PS6, Line 419: DCHECK_GT(e.second.instance_params.size(), 0);
after switch to using "e.first" instead of "e.second.instance_params[0]->host" 
I think we can get rid of this DCHECK. Or was this used for another reason?


http://gerrit.cloudera.org:8080/#/c/10365/4/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/10365/4/be/src/service/query-options.cc@672
PS4, Line 672: Only -1
> Ideally no, but I wanted to keep it consistent with other options that beha
makes sense. Lets document the -1 option in the thrift declaration too then



--
To view, visit http://gerrit.cloudera.org:8080/10365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007
Gerrit-Change-Number: 10365
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 19 May 2018 00:34:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.

2018-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10404 )

Change subject: Moving default sanitizer options into init.cc from shell 
scripts.
..

Moving default sanitizer options into init.cc from shell scripts.

When running tests with ASAN, you need to set ASAN_OPTIONS explicitly,
to avoid various failures. In particular, backend tests fail complaining
about memory leaks and tests that use the parquet-reader binary complain
similarly. It turns out that we can shove the default options into our
code base directly, avoiding the need for users to set it explicitly.

I've done the same thing for TSAN and UBSAN.

I've manually checked that these are being read. In the UBSAN case, I
checked both with gdb and with inotifywatch on the suppressions file.

Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Reviewed-on: http://gerrit.cloudera.org:8080/10404
Reviewed-by: Jim Apple 
Tested-by: Impala Public Jenkins 
---
M be/CMakeLists.txt
M be/src/common/init.cc
M bin/run-backend-tests.sh
M bin/start-catalogd.sh
M bin/start-impalad.sh
M bin/start-statestored.sh
6 files changed, 28 insertions(+), 16 deletions(-)

Approvals:
  Jim Apple: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 7
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.

2018-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10404 )

Change subject: Moving default sanitizer options into init.cc from shell 
scripts.
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 6
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 19 May 2018 00:26:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6035: Add query options to limit thread reservation

2018-05-18 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10365 )

Change subject: IMPALA-6035: Add query options to limit thread reservation
..


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/10365/4/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/10365/4/be/src/scheduling/admission-controller.cc@414
PS4, Line 414:   int64_t max_min_mem_reservation_bytes = -1;
> (yet another rename nit for this variable. Dont feel too strongly about it
Done


http://gerrit.cloudera.org:8080/#/c/10365/4/be/src/scheduling/admission-controller.cc@414
PS4, Line 414: int64_t max_min_mem_reservation_bytes = -1;
 :   const TNetworkAddress* max_min_mem_reservation_bytes_addr = 
nullptr;
 :   int64_t cluster_min_mem_reservation_bytes = 0;
 :   int64_t max_thread_reservation = 0;
 :   const TNetworkAddress* max_thread_reservation_addr = nullptr;
 :   int64_t cluster_thread_reservation = 0;
> it it dosent hurt readability then we can use a pair for b
Done


http://gerrit.cloudera.org:8080/#/c/10365/4/be/src/scheduling/admission-controller.cc@424
PS4, Line 424: e.second.instance_params[0]
> can use e.first instead of this since
Done


http://gerrit.cloudera.org:8080/#/c/10365/4/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/10365/4/be/src/service/query-options.cc@672
PS4, Line 672: Only -1
> do we need both 0 and -1 to represent "no limit"?
Ideally no, but I wanted to keep it consistent with other options that behave 
like this.

I guess one alternative is to make "0" mean literally zero and have it reject 
all queries :). But I think that might cause more confusion.



--
To view, visit http://gerrit.cloudera.org:8080/10365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007
Gerrit-Change-Number: 10365
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 19 May 2018 00:03:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6035: Add query options to limit thread reservation

2018-05-18 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10365

to look at the new patch set (#5).

Change subject: IMPALA-6035: Add query options to limit thread reservation
..

IMPALA-6035: Add query options to limit thread reservation

Adds two options: THREAD_RESERVATION_LIMIT and
THREAD_RESERVATION_AGGREGATE_LIMIT, which are both enforced by admission
control based on planner resource requirements and the schedule. The
mechanism used is the same as the minimum reservation checks.

THREAD_RESERVATION_LIMIT limits the total number of reserved threads in
fragments scheduled on a single backend.
THREAD_RESERVATION_AGGREGATE_LIMIT limits the sum of reserved threads
across all fragments.

This also slightly improves the minimum reservation error message to
include the host name.

Testing:
Added end-to-end tests that exercise the code paths.

Ran core tests.

Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007
---
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler.cc
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M 
testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test
A testdata/workloads/functional-query/queries/QueryTest/resource-limits.test
M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test
A tests/query_test/test_resource_limits.py
12 files changed, 248 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/10365/5
--
To view, visit http://gerrit.cloudera.org:8080/10365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007
Gerrit-Change-Number: 10365
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3134: Support different proc mem limits among impalads for admission control checks

2018-05-18 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10396 )

Change subject: IMPALA-3134: Support different proc mem limits among impalads 
for admission control checks
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10396/3/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/10396/3/be/src/scheduling/admission-controller.cc@408
PS3, Line 408:  =
 :   make_pair(nullptr, std::numeric_limits::max())
I think you can just initialise this directly:


  pair min_proc_mem_limit(nullptr, 
std::numeric_limits::max());



--
To view, visit http://gerrit.cloudera.org:8080/10396
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e
Gerrit-Change-Number: 10396
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 23:54:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async

2018-05-18 Thread Bikramjeet Vig (Code Review)
Hello Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10060

to look at the new patch set (#11).

Change subject: IMPALA-5216: Make admission control queuing async
..

IMPALA-5216: Make admission control queuing async

Implement asynchronous admission control queuing. This is achieved by
running the admission control code-path in a separate thread. Major
changes include: propagating cancellation to the admission control
thread and dequeuing thread, and adding a new Query Operation State
called "PENDING" that represents the state between completion of
planning and starting of query execution.

Testing:
- Added a deterministic end to end test
- Ran multiple stress tests successfully with a cancellation probability
of 60% and with different values for the following parameters:
max_requests, queue_wait_timeout_ms. Ensured that the impalad was in a
valid state afterwards (no orphan fragments or wrong metrics).
- Ran all exhaustive tests and ASAN core tests successfully.

TODO: change terminology of "in_flight_query" to "submitted_queries".
  Need to identify all references of this terminology, eg. in
  comments, tests, variable names, etc.

Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
---
M be/src/common/atomic.h
M be/src/common/logging.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/debug-util.cc
M be/src/util/debug-util.h
M be/src/util/promise-test.cc
M be/src/util/promise.h
M common/thrift/ImpalaService.thrift
M tests/authorization/test_authorization.py
M tests/beeswax/impala_beeswax.py
M tests/common/impala_connection.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_krpc_mem_usage.py
M tests/hs2/hs2_test_suite.py
M tests/hs2/test_hs2.py
M tests/query_test/test_observability.py
M www/query_backends.tmpl
29 files changed, 636 insertions(+), 227 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/10060/11
--
To view, visit http://gerrit.cloudera.org:8080/10060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
Gerrit-Change-Number: 10060
Gerrit-PatchSet: 11
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async

2018-05-18 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10060 )

Change subject: IMPALA-5216: Make admission control queuing async
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@1147
PS7, Line 1147:   return beeswax::QueryState::EXCEPTION;
  : }
  :   }
  : }
  :
  : void ClientRequestState::FinishExecQueryOrDmlRequest() {
  : #ifndef NDEBUG
  :   SleepIfSetInDebugOptions(schedule_->query_options(), 
SLEEP_BEFORE_ADMISSION_MS);
  : #endif
  :   DCHECK(exec_env_->admission_controller() != nullptr);
  :   Status admit_status = 
ExecEnv::GetInstance()->admission_controller()->AdmitQuery(
  :   s
> Yes, it is equivalent to cancellation happening right after this check, whi
Done.



--
To view, visit http://gerrit.cloudera.org:8080/10060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
Gerrit-Change-Number: 10060
Gerrit-PatchSet: 10
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 23:24:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls

2018-05-18 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8523 )

Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls
..


Patch Set 13:

(4 comments)

Thanks, that looks simpler and clearer now. Just some minor things.

http://gerrit.cloudera.org:8080/#/c/8523/13/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/8523/13/be/src/scheduling/scheduler.cc@302
PS13, Line 302: for (const TScanRangeLocationList& range : 
entry.second.concrete_range) {
  :   expanded_locations.push_back(range);
  : }
that could be just expanded_locations.insert(concrete_range.begin(), 
concrete_range.end())?


http://gerrit.cloudera.org:8080/#/c/8523/13/common/thrift/Planner.thrift
File common/thrift/Planner.thrift:

http://gerrit.cloudera.org:8080/#/c/8523/13/common/thrift/Planner.thrift@108
PS13, Line 108: concrete_range
plural since the field is a list: concrete_ranges, split_specs


http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
File fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java:

http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@224
PS13, Line 224: scanRangeSpecs_.getSplit_specSize()
this seems wrong - a single spec may result in multiple hosts, no?
Though I guess for hbase this won't be set so in practice doesn't matter. But 
should we instead just assert that the split_spec list size is 0?


http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1126
PS13, Line 1126: scanRangeSpecs_.getSplit_specSize();
shouldn't that do some calculation based on file size and blocks size and is 
splittable? i.e. after your change we'll get a different number for 
numRemoteRanges when running on S3, right?



--
To view, visit http://gerrit.cloudera.org:8080/8523
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
Gerrit-Change-Number: 8523
Gerrit-PatchSet: 13
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 18 May 2018 23:09:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-18 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10413/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/10413/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@129
PS4, Line 129: static FileDescriptor createEc(FileStatus fileStatus, 
BlockLocation[] blockLocations,
The logic here seems almost identical to the create() function. Maybe modify 
the create() function to accept the isEC argument instead of creating a new 
function?



--
To view, visit http://gerrit.cloudera.org:8080/10413
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Fri, 18 May 2018 23:05:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-18 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..

IMPALA-7019: Schedule EC as remote & disable failed tests

This patch schedules HDFS EC files without considering locality. Failed
tests are disabled and a jenkins build should succeed with export
ERASURE_COINDG=true.

Testing: It passes core tests.

Cherry-picks: not for 2.x.

Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
---
M common/fbs/CatalogObjects.fbs
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M tests/common/skip.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/metadata/test_explain.py
M tests/query_test/test_hdfs_caching.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_mt_dop.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_queries.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_scanners.py
M tests/util/filesystem_utils.py
17 files changed, 88 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/10413/4
--
To view, visit http://gerrit.cloudera.org:8080/10413
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-6317: Add -cmake only option to buildall.sh

2018-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10455 )

Change subject: IMPALA-6317: Add -cmake_only option to buildall.sh
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2511/


--
To view, visit http://gerrit.cloudera.org:8080/10455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If31a4e29425a6a20059cba2f43b72e4fb908018f
Gerrit-Change-Number: 10455
Gerrit-PatchSet: 3
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 22:28:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6317: Add -cmake only option to buildall.sh

2018-05-18 Thread David Knupp (Code Review)
David Knupp has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10455 )

Change subject: IMPALA-6317: Add -cmake_only option to buildall.sh
..


Patch Set 3: Code-Review+2

Fixed a typo in commit msg, carrying +2.


--
To view, visit http://gerrit.cloudera.org:8080/10455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If31a4e29425a6a20059cba2f43b72e4fb908018f
Gerrit-Change-Number: 10455
Gerrit-PatchSet: 3
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 22:28:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6317: Add -cmake only option to buildall.sh

2018-05-18 Thread David Knupp (Code Review)
Hello Philip Zeyliger, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10455

to look at the new patch set (#3).

Change subject: IMPALA-6317: Add -cmake_only option to buildall.sh
..

IMPALA-6317: Add -cmake_only option to buildall.sh

It's sometimes useful to be able to build a complete Impala dev
environment without necessarily building the Impala binary itself
-- e.g., when one wants to use the internal test framework to run
tests against an instance of Impala running on a remote cluster.

- This patch adds a -cmake_only flag to buildall.sh, which then
  gets propagated to the make_impala.sh.

- Added a missing line to the help text re: passing the -ninja
  command line option

Change-Id: If31a4e29425a6a20059cba2f43b72e4fb908018f
---
M buildall.sh
1 file changed, 5 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10455/3
--
To view, visit http://gerrit.cloudera.org:8080/10455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If31a4e29425a6a20059cba2f43b72e4fb908018f
Gerrit-Change-Number: 10455
Gerrit-PatchSet: 3
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-18 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/10413/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10413/3//COMMIT_MSG@12
PS3, Line 12:
Mention how this patch was tested. Did you run the exhaustive build?


http://gerrit.cloudera.org:8080/#/c/10413/3/tests/common/skip.py
File tests/common/skip.py:

http://gerrit.cloudera.org:8080/#/c/10413/3/tests/common/skip.py@29
PS3, Line 29: IS_ISILON,
: IS_LOCAL,
: IS_HDFS,
: IS_S3,
: IS_ADLS,
: SECONDARY_FILESYSTEM,
: IS_EC
This should be sorted alphabetically


http://gerrit.cloudera.org:8080/#/c/10413/3/tests/common/skip.py@150
PS3, Line 150: doesn't work.
"do not work"


http://gerrit.cloudera.org:8080/#/c/10413/3/tests/util/filesystem_utils.py
File tests/util/filesystem_utils.py:

http://gerrit.cloudera.org:8080/#/c/10413/3/tests/util/filesystem_utils.py@33
PS3, Line 33: os.getenv("ERASURE_CODING")
We want to check if ERASURE_CODING == true here.



--
To view, visit http://gerrit.cloudera.org:8080/10413
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 3
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Fri, 18 May 2018 21:57:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata

2018-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10116 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata
..


Patch Set 13: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2507/


--
To view, visit http://gerrit.cloudera.org:8080/10116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
Gerrit-Change-Number: 10116
Gerrit-PatchSet: 13
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 18 May 2018 21:55:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6317: Add -cmake only option to buildall.sh

2018-05-18 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10455 )

Change subject: IMPALA-6317: Add -cmake_only option to buildall.sh
..


Patch Set 2: Code-Review+2

Thanks!


--
To view, visit http://gerrit.cloudera.org:8080/10455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If31a4e29425a6a20059cba2f43b72e4fb908018f
Gerrit-Change-Number: 10455
Gerrit-PatchSet: 2
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 21:53:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6317: Add -cmake only option to buildall.sh

2018-05-18 Thread David Knupp (Code Review)
David Knupp has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10455


Change subject: IMPALA-6317: Add -cmake_only option to buildall.sh
..

IMPALA-6317: Add -cmake_only option to buildall.sh

It's sometimes useful to be able to build a complete Impala dev
environment without necessarily building the Impala binary itself
-- e.g., when one wants to use the internal test framework to run
tests against an instance of Impala running on a remote cluster.

- This patch add a -cmake_only flag to buildall.sh, which then
  gets propagated to the make_impala.sh.

- Added a missing line to the help text re: passing the -ninja
  command line option

Change-Id: If31a4e29425a6a20059cba2f43b72e4fb908018f
---
M buildall.sh
1 file changed, 5 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10455/2
--
To view, visit http://gerrit.cloudera.org:8080/10455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If31a4e29425a6a20059cba2f43b72e4fb908018f
Gerrit-Change-Number: 10455
Gerrit-PatchSet: 2
Gerrit-Owner: David Knupp 


[Impala-ASF-CR] [DOCS] Fixed misleading documentation on Impala + HDFS caching

2018-05-18 Thread Alex Rodoni (Code Review)
Alex Rodoni has removed Jim Apple from this change.  ( 
http://gerrit.cloudera.org:8080/10454 )

Change subject: [DOCS] Fixed misleading documentation on Impala + HDFS caching
..


Removed reviewer Jim Apple.
--
To view, visit http://gerrit.cloudera.org:8080/10454
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I63cd1ff7b885a094a4a3e91c31101d25414b4db7
Gerrit-Change-Number: 10454
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 


[Impala-ASF-CR] [DOCS] Fixed misleading documentation on Impala + HDFS caching

2018-05-18 Thread Alex Rodoni (Code Review)
Alex Rodoni has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10454


Change subject: [DOCS] Fixed misleading documentation on Impala + HDFS caching
..

[DOCS] Fixed misleading documentation on Impala + HDFS caching

Change-Id: I63cd1ff7b885a094a4a3e91c31101d25414b4db7
---
M docs/topics/impala_perf_hdfs_caching.xml
1 file changed, 5 insertions(+), 8 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/10454/1
--
To view, visit http://gerrit.cloudera.org:8080/10454
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I63cd1ff7b885a094a4a3e91c31101d25414b4db7
Gerrit-Change-Number: 10454
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 


[Impala-ASF-CR] IMPALA-5737: Tighten minicluster memory limit

2018-05-18 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10277 )

Change subject: IMPALA-5737: Tighten minicluster memory limit
..


Patch Set 6:

> Patch Set 6:
>
> I kicked off https://jenkins.impala.io/job/test-with-docker-parameterized/13 
> with this change to see how it'll work. The logic in impala-config.sh is 
> affected by cgroups which might be set in Docker.
>
> I didn't look to see why GVO 2501 failed.

GVO 2501 failed because I forgot to git add a file and I didn't run it on 
jenkins after the renaming. I'm testing it and will re-run the docker job 
afterwards. Thanks!


--
To view, visit http://gerrit.cloudera.org:8080/10277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8240551e726c6da546a926a1ce3444f41ef87fe
Gerrit-Change-Number: 10277
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Fri, 18 May 2018 21:29:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls

2018-05-18 Thread Vuk Ercegovac (Code Review)
Hello Lars Volker, Dimitris Tsirogiannis, Alex Behm, Mostafa Mokhtar, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8523

to look at the new patch set (#13).

Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls
..

IMPALA-5931: Generates scan ranges in planner for s3/adls

Currently, for filesystems that do not include physical
block information (e.g., block replica locations, caching),
synthetic blocks are generated and stored in the catalog
when metadata is loaded. Example file systems for which this is done
includes S3, ADLS, and local fs.

This change avoids generating these blocks when metadata is loaded.
Instead, scan ranges are directly generated from such files by the
backend coordinator. Previously, all scan ranges were produced by
the planner in HDFSScanNode in the frontend. Now, those files without
block information are sent to the coordinator represented by a split
specification that determines how the coordinator will create scan ranges
to send to executors.

This change reduces the space needed in the catalog and reduces the
scan range data structures that are passed from the frontend to the
backend when planning and coordinating a query.
In addition a bug is avoided where non-splittable files were being
split anyways to support the query parameter that places a limit on
scan ranges.

Testing:
- added backend scheduler tests
- mixed-filesystems test covers tables/queries with multiple fs's.
- local fs tests cover the code paths in this change
- all core tests pass when configured with s3
- manually tried larger local filesystem tables (tpch) with multiple
  partitions and observed the same scan ranges.
  - TODO: adls testing

Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
---
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler-test-util.h
M be/src/scheduling/scheduler-test.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/util/CMakeLists.txt
A be/src/util/flat_buffer.cc
A be/src/util/flat_buffer.h
M common/thrift/Frontend.thrift
M common/thrift/PlanNodes.thrift
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
21 files changed, 665 insertions(+), 240 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8523/13
--
To view, visit http://gerrit.cloudera.org:8080/8523
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
Gerrit-Change-Number: 8523
Gerrit-PatchSet: 13
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls

2018-05-18 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8523 )

Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls
..


Patch Set 12:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8523/12/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

http://gerrit.cloudera.org:8080/#/c/8523/12/common/thrift/Frontend.thrift@375
PS12, Line 375:   per_node_scan_ranges
> Thanks, the new thrift structures make more sense to me. Maybe it's simpler
yeah, it lets me revert to upfront expansion as I had couple of changes back 
(but without mixed concrete/spec) and get rid of that iterator.


http://gerrit.cloudera.org:8080/#/c/8523/12/common/thrift/PlanNodes.thrift
File common/thrift/PlanNodes.thrift:

http://gerrit.cloudera.org:8080/#/c/8523/12/common/thrift/PlanNodes.thrift@202
PS12, Line 202: THdfsFileSplitGeneratorSpec
> update
Done



--
To view, visit http://gerrit.cloudera.org:8080/8523
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
Gerrit-Change-Number: 8523
Gerrit-PatchSet: 12
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 18 May 2018 21:24:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5737: Tighten minicluster memory limit

2018-05-18 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10277 )

Change subject: IMPALA-5737: Tighten minicluster memory limit
..


Patch Set 6:

I kicked off https://jenkins.impala.io/job/test-with-docker-parameterized/13 
with this change to see how it'll work. The logic in impala-config.sh is 
affected by cgroups which might be set in Docker.

I didn't look to see why GVO 2501 failed.


--
To view, visit http://gerrit.cloudera.org:8080/10277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8240551e726c6da546a926a1ce3444f41ef87fe
Gerrit-Change-Number: 10277
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Fri, 18 May 2018 21:23:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6909: [DOCS] SET ROW FORMAT in ALTER TABLE

2018-05-18 Thread Alex Rodoni (Code Review)
Alex Rodoni has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10452


Change subject: IMPALA-6909: [DOCS] SET ROW FORMAT in ALTER TABLE
..

IMPALA-6909: [DOCS] SET ROW FORMAT in ALTER TABLE

Change-Id: Ib680f25d8c929e1194a2274777f83851b04bcb93
---
M docs/topics/impala_alter_table.xml
1 file changed, 36 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/10452/1
--
To view, visit http://gerrit.cloudera.org:8080/10452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib680f25d8c929e1194a2274777f83851b04bcb93
Gerrit-Change-Number: 10452
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 


[Impala-ASF-CR] IMPALA-6070: Adding ASAN, --tail to test-with-docker.

2018-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10319 )

Change subject: IMPALA-6070: Adding ASAN, --tail to test-with-docker.
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2510/


--
To view, visit http://gerrit.cloudera.org:8080/10319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I51451cdf1352fc0f9516d729b9a77700488d993f
Gerrit-Change-Number: 10319
Gerrit-PatchSet: 4
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 18 May 2018 21:15:36 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-7035: Configure jceks.key.serialFilter for KMS.

2018-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10446 )

Change subject: IMPALA-7035: Configure jceks.key.serialFilter for KMS.
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Gerrit-Change-Number: 10446
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 18 May 2018 21:14:09 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-7035: Configure jceks.key.serialFilter for KMS.

2018-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10446 )

Change subject: IMPALA-7035: Configure jceks.key.serialFilter for KMS.
..

IMPALA-7035: Configure jceks.key.serialFilter for KMS.

Configures a Java property for KMS to account for JDK 8u171's security fixes. I
was seeing impala-py.test tests/metadata/test_hdfs_encryption.py fail with the
following error:

  AssertionError: Error creating encryption zone: RemoteException: Can't 
recover key for testkey1 from keystore 
file:/home/impdev/Impala/testdata/cluster/cdh6/node-1/data/kms.keystore

The issue is described in HDFS-13494, and I imagine it'll be fixed in due time. 
In the
meanwhile, setting this property seems to do the trick.

Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Reviewed-on: http://gerrit.cloudera.org:8080/10418
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
Reviewed-on: http://gerrit.cloudera.org:8080/10446
---
M testdata/cluster/node_templates/cdh5/etc/init.d/kms
1 file changed, 6 insertions(+), 0 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: merged
Gerrit-Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Gerrit-Change-Number: 10446
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.

2018-05-18 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10404 )

Change subject: Moving default sanitizer options into init.cc from shell 
scripts.
..


Patch Set 6:

Jenkins is complaining about our HBase planner tests, which have been flaky 
recently. The "all-builds" job succeeded.

Anyway, I asked Jenkins to try again.


--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 6
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 21:11:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.

2018-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10404 )

Change subject: Moving default sanitizer options into init.cc from shell 
scripts.
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2509/


--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 6
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 21:11:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.

2018-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10404 )

Change subject: Moving default sanitizer options into init.cc from shell 
scripts.
..


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2506/


--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 6
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 21:04:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7011: Simplify PlanRootSink control logic

2018-05-18 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10449 )

Change subject: IMPALA-7011: Simplify PlanRootSink control logic
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10449/4/be/src/exec/plan-root-sink.cc
File be/src/exec/plan-root-sink.cc:

http://gerrit.cloudera.org:8080/#/c/10449/4/be/src/exec/plan-root-sink.cc@102
PS4, Line 102: consumer_cv_.NotifyAll();
> We could make these NotifyOne() but I think it's inconsequential
yeah seems fine to me the way it was since there's no important difference in 
this case.



--
To view, visit http://gerrit.cloudera.org:8080/10449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f
Gerrit-Change-Number: 10449
Gerrit-PatchSet: 4
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 21:01:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7011: Simplify PlanRootSink control logic

2018-05-18 Thread Dan Hecht (Code Review)
Hello Michael Ho, Tim Armstrong, Bikramjeet Vig,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10449

to look at the new patch set (#5).

Change subject: IMPALA-7011: Simplify PlanRootSink control logic
..

IMPALA-7011: Simplify PlanRootSink control logic

1) The eos_ and sender_done_ bits really encode three possible states
   that the sender can be in. Make this explicit using an enum with
   three values.

2) The purpose of CloseConsumer() has changed over time and we can clean
   this up now:

 a) Originally, it looks like it was used to unblock the sender when the
   consumer finishes before eos, but also keep the sink alive long
   enough for the coordinator. This is no longer necessary now that
   control structures are owned by the QueryState whose lifetime is
   controlled by a reference count taken by the coordinator. So, we don't
   need the coordinator to tell the sink it's done calling it and we
   don't need the consumer_done_ state.

 b) Later on, CloseConsumer() was used as a cancellation mechinism.
   We need to keep this around (or use timeouts on the condvars) to kick
   both the consumer and producer on cancellation. But let's make the
   cancellation logic similar to the exec nodes and other sinks by
   driving the cancellation using the RuntimeState's cancellation
   flag. Now that CloseConsumer() is only about cancellation, rename it
   to Cancel() (later we may promote it to DataSink and implement in the
   data stream sender as well).

Testing:
- Exhaustive
- Minicluster concurrent_select.py stress

Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f
---
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/runtime/coordinator.cc
M be/src/runtime/fragment-instance-state.cc
4 files changed, 58 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/10449/5
--
To view, visit http://gerrit.cloudera.org:8080/10449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f
Gerrit-Change-Number: 10449
Gerrit-PatchSet: 5
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7011: Simplify PlanRootSink control logic

2018-05-18 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10449 )

Change subject: IMPALA-7011: Simplify PlanRootSink control logic
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h
File be/src/exec/plan-root-sink.h:

http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h@49
PS3, Line 49:
> dup
Done



--
To view, visit http://gerrit.cloudera.org:8080/10449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f
Gerrit-Change-Number: 10449
Gerrit-PatchSet: 5
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 20:59:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5706: Spilling sort optimisations

2018-05-18 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9943 )

Change subject: IMPALA-5706: Spilling sort optimisations
..


Patch Set 13: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9
Gerrit-Change-Number: 9943
Gerrit-PatchSet: 13
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 20:48:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7011: Simplify PlanRootSink control logic

2018-05-18 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10449 )

Change subject: IMPALA-7011: Simplify PlanRootSink control logic
..


Patch Set 4: Code-Review+1

(1 comment)

LGTM, someone else should +2 once everyone is happy

http://gerrit.cloudera.org:8080/#/c/10449/4/be/src/exec/plan-root-sink.cc
File be/src/exec/plan-root-sink.cc:

http://gerrit.cloudera.org:8080/#/c/10449/4/be/src/exec/plan-root-sink.cc@102
PS4, Line 102: consumer_cv_.NotifyAll();
We could make these NotifyOne() but I think it's inconsequential



--
To view, visit http://gerrit.cloudera.org:8080/10449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f
Gerrit-Change-Number: 10449
Gerrit-PatchSet: 4
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 20:47:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7011: Simplify PlanRootSink control logic

2018-05-18 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10449 )

Change subject: IMPALA-7011: Simplify PlanRootSink control logic
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h
File be/src/exec/plan-root-sink.h:

http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h@49
PS3, Line 49: the
dup



--
To view, visit http://gerrit.cloudera.org:8080/10449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f
Gerrit-Change-Number: 10449
Gerrit-PatchSet: 3
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 20:43:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7011: Simplify PlanRootSink control logic

2018-05-18 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10449 )

Change subject: IMPALA-7011: Simplify PlanRootSink control logic
..


Patch Set 3:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/10449/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10449/3//COMMIT_MSG@7
PS3, Line 7: IMPALA_7011
> nit: IMPALA-7011
Done


http://gerrit.cloudera.org:8080/#/c/10449/3//COMMIT_MSG@20
PS3, Line 20: control
> nit: controlled
Done


http://gerrit.cloudera.org:8080/#/c/10449/3//COMMIT_MSG@24
PS3, Line 24: to
> nit: extra word?
Done


http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h
File be/src/exec/plan-root-sink.h:

http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h@57
PS3, Line 57: respecively
> respectively
Done


http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h@102
PS3, Line 102: Canecl
> Cancel
Done


http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h@106
PS3, Line 106:   /// State of the sender:
> Producer instead of sender? Seems inconsistent with above.
Did the opposite, only because the code variables use sender more (so changed 
the comments from producer to sender above).


http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h@107
PS3, Line 107: MORE_ROWS
> nit: how about ROWS_PENDING
Done


http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.cc
File be/src/exec/plan-root-sink.cc:

http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.cc@a76
PS3, Line 76:
> just curious, was this dead code all along or is there ever a possibility o
I didn't look back at git history, but like you said, it's clearly a dead 
conditional. It may have been possible to pass batch == nullptr before the 
FragmentInstanceState refactor.

Okay, looked at git history, and this was a do-while loop (so the dereference 
wasn't there. But still, ValidateCollectionSlots() dereferences. Plus, it 
doesn't make sense to do the check after the Wait() anyway. So, I'm not really 
sure why it was written.


http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.cc@116
PS3, Line 116: enounters
> encounters
Done



--
To view, visit http://gerrit.cloudera.org:8080/10449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f
Gerrit-Change-Number: 10449
Gerrit-PatchSet: 3
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 20:19:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7011: Simplify PlanRootSink control logic

2018-05-18 Thread Dan Hecht (Code Review)
Hello Tim Armstrong, Bikramjeet Vig,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10449

to look at the new patch set (#4).

Change subject: IMPALA-7011: Simplify PlanRootSink control logic
..

IMPALA-7011: Simplify PlanRootSink control logic

1) The eos_ and sender_done_ bits really encode three possible states
   that the sender can be in. Make this explicit using an enum with
   three values.

2) The purpose of CloseConsumer() has changed over time and we can clean
   this up now:

 a) Originally, it looks like it was used to unblock the sender when the
   consumer finishes before eos, but also keep the sink alive long
   enough for the coordinator. This is no longer necessary now that
   control structures are owned by the QueryState whose lifetime is
   controlled by a reference count taken by the coordinator. So, we don't
   need the coordinator to tell the sink it's done calling it and we
   don't need the consumer_done_ state.

 b) Later on, CloseConsumer() was used as a cancellation mechinism.
   We need to keep this around (or use timeouts on the condvars) to kick
   both the consumer and producer on cancellation. But let's make the
   cancellation logic similar to the exec nodes and other sinks by
   driving the cancellation using the RuntimeState's cancellation
   flag. Now that CloseConsumer() is only about cancellation, rename it
   to Cancel() (later we may promote it to DataSink and implement in the
   data stream sender as well).

Testing:
- Exhaustive
- Minicluster concurrent_select.py stress

Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f
---
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/runtime/coordinator.cc
M be/src/runtime/fragment-instance-state.cc
4 files changed, 58 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/10449/4
--
To view, visit http://gerrit.cloudera.org:8080/10449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f
Gerrit-Change-Number: 10449
Gerrit-PatchSet: 4
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] [DOCS] Sentry is required for Impala to enable delegation

2018-05-18 Thread Alex Rodoni (Code Review)
Hello Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10451

to look at the new patch set (#2).

Change subject: [DOCS] Sentry is required for Impala to enable delegation
..

[DOCS] Sentry is required for Impala to enable delegation

Change-Id: I002d3d33eee6a9b9336f21c81a4de75ed3bd5efb
---
M docs/topics/impala_delegation.xml
1 file changed, 42 insertions(+), 51 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/10451/2
--
To view, visit http://gerrit.cloudera.org:8080/10451
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I002d3d33eee6a9b9336f21c81a4de75ed3bd5efb
Gerrit-Change-Number: 10451
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] [DOCS] Sentry is required for Impala to enable delegation

2018-05-18 Thread Alex Rodoni (Code Review)
Alex Rodoni has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10451


Change subject: [DOCS] Sentry is required for Impala to enable delegation
..

[DOCS] Sentry is required for Impala to enable delegation

Change-Id: I002d3d33eee6a9b9336f21c81a4de75ed3bd5efb
---
M docs/topics/impala_delegation.xml
1 file changed, 43 insertions(+), 51 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/10451/1
-- 
To view, visit http://gerrit.cloudera.org:8080/10451
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I002d3d33eee6a9b9336f21c81a4de75ed3bd5efb
Gerrit-Change-Number: 10451
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 


[Impala-ASF-CR] IMPALA-3134: Support different proc mem limits among impalads for admission control checks

2018-05-18 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10396 )

Change subject: IMPALA-3134: Support different proc mem limits among impalads 
for admission control checks
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10396/3/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/10396/3/be/src/scheduling/admission-controller.cc@408
PS3, Line 408: pair min_proc_mem_limit =
 :   make_pair(nullptr, std::numeric_limits::max());
> @Tim: let me know if you decide not to adopt this pattern for your patch fo
WFM


http://gerrit.cloudera.org:8080/#/c/10396/3/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/10396/3/tests/custom_cluster/test_admission_controller.py@454
PS3, Line 454: exec_options['mem_limit'] = "3G"
Does this line do anything? It's already set to 3G above.
Is it intentional that num_nodes=1 still at this point?

It might be easier to follow if we just created a copy of exec_options for each 
query rather than mutating the one version. We use the copy() function in a few 
other tests for this purpose. What do you think?


http://gerrit.cloudera.org:8080/#/c/10396/3/tests/custom_cluster/test_admission_controller.py@473
PS3, Line 473:"2.00 GB but only 1.00 GB out of 2.00 GB 
was available.", str(e)), str(e)
long line



--
To view, visit http://gerrit.cloudera.org:8080/10396
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e
Gerrit-Change-Number: 10396
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 19:20:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA 7011: Simplify PlanRootSink control logic

2018-05-18 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10449 )

Change subject: IMPALA_7011: Simplify PlanRootSink control logic
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h
File be/src/exec/plan-root-sink.h:

http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h@57
PS3, Line 57: respecively
respectively


http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h@106
PS3, Line 106:   /// State of the sender:
Producer instead of sender? Seems inconsistent with above.


http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.cc
File be/src/exec/plan-root-sink.cc:

http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.cc@116
PS3, Line 116: enounters
encounters



--
To view, visit http://gerrit.cloudera.org:8080/10449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f
Gerrit-Change-Number: 10449
Gerrit-PatchSet: 3
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 19:00:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-18 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 3:

Using the original block length/offset breaks more tests. I disabled them.


--
To view, visit http://gerrit.cloudera.org:8080/10413
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 3
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Fri, 18 May 2018 18:50:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-18 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..

IMPALA-7019: Schedule EC as remote & disable failed tests

This patch schedules HDFS EC files without considering locality. Failed
tests are disabled and a jenkins build should succeed with export
ERASURE_COINDG=true.

Cherry-picks: not for 2.x.

Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
---
M common/fbs/CatalogObjects.fbs
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M tests/common/skip.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/metadata/test_explain.py
M tests/query_test/test_hdfs_caching.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_mt_dop.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_queries.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_scanners.py
M tests/util/filesystem_utils.py
17 files changed, 87 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/10413/3
--
To view, visit http://gerrit.cloudera.org:8080/10413
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 3
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA 7011: Simplify PlanRootSink control logic

2018-05-18 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10449 )

Change subject: IMPALA_7011: Simplify PlanRootSink control logic
..


Patch Set 3: Code-Review+1

(5 comments)

http://gerrit.cloudera.org:8080/#/c/10449/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10449/3//COMMIT_MSG@7
PS3, Line 7: IMPALA_7011
nit: IMPALA-7011


http://gerrit.cloudera.org:8080/#/c/10449/3//COMMIT_MSG@20
PS3, Line 20: control
nit: controlled


http://gerrit.cloudera.org:8080/#/c/10449/3//COMMIT_MSG@24
PS3, Line 24: to
nit: extra word?


http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h
File be/src/exec/plan-root-sink.h:

http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h@107
PS3, Line 107: MORE_ROWS
nit: how about ROWS_PENDING


http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.cc
File be/src/exec/plan-root-sink.cc:

http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.cc@a76
PS3, Line 76:
just curious, was this dead code all along or is there ever a possibility of 
passing a null batch to Send()(granted that would crash at L67 anyway or at L73 
on a release build)?



--
To view, visit http://gerrit.cloudera.org:8080/10449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f
Gerrit-Change-Number: 10449
Gerrit-PatchSet: 3
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 18:29:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata

2018-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10116 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata
..


Patch Set 13:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2507/


--
To view, visit http://gerrit.cloudera.org:8080/10116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
Gerrit-Change-Number: 10116
Gerrit-PatchSet: 13
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 18 May 2018 18:03:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata

2018-05-18 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10116 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata
..


Patch Set 13: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
Gerrit-Change-Number: 10116
Gerrit-PatchSet: 13
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 18 May 2018 18:02:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.

2018-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10404 )

Change subject: Moving default sanitizer options into init.cc from shell 
scripts.
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2506/


--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 6
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 17:31:40 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-7035: Configure jceks.key.serialFilter for KMS.

2018-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10446 )

Change subject: IMPALA-7035: Configure jceks.key.serialFilter for KMS.
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2505/


--
To view, visit http://gerrit.cloudera.org:8080/10446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Gerrit-Change-Number: 10446
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 18 May 2018 17:29:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata

2018-05-18 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10116 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10116/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/10116/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2868
PS12, Line 2868: // this would make it necessary to reload the table 
after alter_table in order to
> I agree that today we will always reload the table metadata from the HMS an
Thanks for the explanation!

I have rewritten the comment, I hope that it is better now.



--
To view, visit http://gerrit.cloudera.org:8080/10116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
Gerrit-Change-Number: 10116
Gerrit-PatchSet: 13
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 18 May 2018 17:04:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata

2018-05-18 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Zoltan Borok-Nagy, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10116

to look at the new patch set (#13).

Change subject: IMPALA-6131: Track time of last statistics update in metadata
..

IMPALA-6131: Track time of last statistics update in metadata

The timestamp of the last COMPUTE STATS operation is saved to
table property "impala.lastComputeStatsTime". The format is
the same as in "transient_lastDdlTime", so the two can be
compared to check if the schema has changed since computing
statistics.

Other changes:
- Handling of "transient_lastDdlTime" is simplified - the old
  logic set it to current time + 1, if the old version was
  >= current time, to ensure that it is always increased by
  DDL operations. This was useful in the past, as IMPALA-387
  used lastDdlTime to check if partition data needs to be
  reloaded, but since IMPALA-1480, Impala does not rely on
  lastDdlTime at all.

- Computing / setting stats on HDFS tables no longer increases
  "transient_lastDdlTime".

- When Kudu tables are (re)loaded, it is checked if their
  HMS representation is up to date, and if it is, then
  IMetaStoreClient.alter_table() is not called. The old
  logic always called alter_table() after loading metadata
  from Kudu. This change was needed to ensure that
  "transient_lastDdlTime" works similarly in HDFS and Kudu
  tables, and should also make (re)loading Kudu tables faster.

Notes:
- Kudu will be able to sync its tables to HMS in the near
  future (see KUDU-2191), so the Kudu metadata handling in
  Impala may need to be redesigned.

Testing:
tests/metadata/test_last_ddl_time_update.py is extended by
- also checking "impala.lastComputeStatsTime"
- testing more SQL statements
- tests for Kudu tables

Note that test_last_ddl_time_update.py is ran only in
exhaustive testing.

Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
---
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
M tests/metadata/test_last_ddl_time_update.py
8 files changed, 228 insertions(+), 173 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/10116/13
--
To view, visit http://gerrit.cloudera.org:8080/10116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
Gerrit-Change-Number: 10116
Gerrit-PatchSet: 13
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata

2018-05-18 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10116 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata
..


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10116/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/10116/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2868
PS12, Line 2868: // catalogd would read it back after alter_table, but 
as this would not work for
> Do you have an example in mind when the table should not be reloaded? I loo
I agree that today we will always reload the table metadata from the HMS and 
set a new msTbl. That behavior is not by design though - I think it's 
unintentional and probably a bug (see for example IMPALA-6994).

Personally, I think this comment is difficult to understand because it's 
explanation relies on (1) understanding what callers will do after calling this 
function, and (2) understanding what is done for Kudu tables, so there's some 
loss of abstraction. Imo, it would be clearer to declare a general policy of 
setting the lastDdlTime in Impala to not rely on the HMS setting it and to 
avoid having to fetch it from HMS if possible.



--
To view, visit http://gerrit.cloudera.org:8080/10116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
Gerrit-Change-Number: 10116
Gerrit-PatchSet: 12
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 18 May 2018 16:39:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6994: Avoid reloading a table's HMS data for file-only operations.

2018-05-18 Thread Pranay Singh (Code Review)
Pranay Singh has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10450


Change subject: IMPALA-6994: Avoid reloading a table's HMS data for file-only 
operations.
..

IMPALA-6994: Avoid reloading a table's HMS data for file-only operations.

Catalog currently loads the HMS data even, if only the file metadata load
is requested. This change detects some of those updates and tries to skip
the loading of HMS data if only the file metadata load is requested.

Testing:
  Ran the core test without failures.

Change-Id: Iaabdf38af3f30c65ada9734eb471dbfa6ecdd74a
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
1 file changed, 39 insertions(+), 23 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/10450/1
--
To view, visit http://gerrit.cloudera.org:8080/10450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iaabdf38af3f30c65ada9734eb471dbfa6ecdd74a
Gerrit-Change-Number: 10450
Gerrit-PatchSet: 1
Gerrit-Owner: Pranay Singh


[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata

2018-05-18 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10116 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata
..


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10116/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/10116/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2868
PS12, Line 2868: // catalogd would read it back after alter_table, but 
as this would not work for
> I don't think it's true that we'd always read it back after calling this fu
Do you have an example in mind when the table should not be reloaded? I looked 
at the callers of applyAlterTable(): loadTableMetadata() is called after 
alter*() + dropTableStats(), and createTable() only creates an "incomplete 
table", so the table will be reloaded before it is actually used. Because of 
this, my impression was that tables are always reloaded by design.



--
To view, visit http://gerrit.cloudera.org:8080/10116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
Gerrit-Change-Number: 10116
Gerrit-PatchSet: 12
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 18 May 2018 15:43:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA 7011: Simplify PlanRootSink control logic

2018-05-18 Thread Dan Hecht (Code Review)
Dan Hecht has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/10449 )

Change subject: IMPALA_7011: Simplify PlanRootSink control logic
..

IMPALA_7011: Simplify PlanRootSink control logic

1) The eos_ and sender_done_ bits really encode three possible states
   that the sender can be in. Make this explicit using an enum with
   three values.

2) The purpose of CloseConsumer() has changed over time and we can clean
   this up now:

 a) Originally, it looks like it was used to unblock the sender when the
   consumer finishes before eos, but also keep the sink alive long
   enough for the coordinator. This is no longer necessary now that
   control structures are owned by the QueryState whose lifetime is
   control by a reference count taken by the coordinator. So, we don't
   need the coordinator to tell the sink it's done calling it and we
   don't need the consumer_done_ state.

 b) Later on, CloseConsumer() was used to as a cancellation mechinism.
   We need to keep this around (or use timeouts on the condvars) to kick
   both the consumer and producer on cancellation. But let's make the
   cancellation logic similar to the exec nodes and other sinks by
   driving the cancellation using the RuntimeState's cancellation
   flag. Now that CloseConsumer() is only about cancellation, rename it
   to Cancel() (later we may promote it to DataSink and implement in the
   data stream sender as well).

Testing:
- Exhaustive
- Minicluster concurrent_select.py stress

Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f
---
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/runtime/coordinator.cc
M be/src/runtime/fragment-instance-state.cc
4 files changed, 57 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/10449/3
--
To view, visit http://gerrit.cloudera.org:8080/10449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f
Gerrit-Change-Number: 10449
Gerrit-PatchSet: 3
Gerrit-Owner: Dan Hecht 


[Impala-ASF-CR] IMPALA-5706: Spilling sort optimisations

2018-05-18 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9943 )

Change subject: IMPALA-5706: Spilling sort optimisations
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9943/12/tests/query_test/test_sort.py
File tests/query_test/test_sort.py:

http://gerrit.cloudera.org:8080/#/c/9943/12/tests/query_test/test_sort.py@135
PS12, Line 135: selected as the pivot (the old method), the sorter tends to 
get stuck selecting the
> I gave this a second look and apparently I mis-measured something the last
As discussed offline, I couldn't reduce the runtime of this test. As agreed, 
I'm dropping it as the majority of the spilling sorts have already been covered 
by the existing tests.



--
To view, visit http://gerrit.cloudera.org:8080/9943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9
Gerrit-Change-Number: 9943
Gerrit-PatchSet: 13
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 10:20:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5706: Spilling sort optimisations

2018-05-18 Thread Gabor Kaszab (Code Review)
Hello Tim Armstrong, Csaba Ringhofer,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9943

to look at the new patch set (#13).

Change subject: IMPALA-5706: Spilling sort optimisations
..

IMPALA-5706: Spilling sort optimisations

This patch covers multiple changes with the purpose of optimizing
spilling sort mechanism:
  - Remove the hard-coded maximum limit of buffers that can be used
for merging the sorted runs. Instead this number is calculated
based on the available memory through buffer pool.
  - The already sorted runs are distributed more optimally between
the last intermediate merge and the final merge to avoid that a
heavy intermediate merge is followed by a light final merge.
  - Right before starting the merging phase Sorter tries to allocate
additional memory through the buffer pool.
  - An output run is not allocated anymore for the final merge.

Note, double-buffering the runs during a merge was also planned with
this patch. However, performance testing showed that except some
exotic queries with unreasonably small amount of buffer pool memory
available double-buffering doesn't add to the overall performance.
It's basically because the half of the available buffers have to be
sacrificed to do double-buffering and as a result the merge tree can
get deeper. In addition the amount of I/O wait time is not reaching
the level where double-buffering could countervail the reduced number
of runs during a particular merge.

Performance measurements were made during manual testing to verify
that this is in fact an optimization:
  - In case doing a sort on top of a join when working with a
restricted amount of memory then the Sort node successfully
allocates additional memory right before the merging phase. This
is feasible because once Join finishes sending new input data and
calls InputDone() then it releases memory that can be picked up
by the Sorter. This results in shallower merging trees (more runs
grabbed for a merge).
  - On a multi-node cluster I verified that in cases when at least one
merging step is done then this change reduces the execution time
for sorts.
  - The more merging steps are done the bigger the performance gain is
compared to the baseline.

Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9
---
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
M tests/query_test/test_sort.py
3 files changed, 149 insertions(+), 114 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/9943/13
--
To view, visit http://gerrit.cloudera.org:8080/9943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9
Gerrit-Change-Number: 9943
Gerrit-PatchSet: 13
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.

2018-05-18 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10404 )

Change subject: Moving default sanitizer options into init.cc from shell 
scripts.
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 6
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 07:08:19 +
Gerrit-HasComments: No