[Impala-ASF-CR] [WIP] IMPALA-6932: Speed up scans for sequence datasets with many files

2018-10-29 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11517 )

Change subject: [WIP] IMPALA-6932: Speed up scans for sequence datasets with 
many files
..


Patch Set 3:

> > This can't be tested on hdfs since there are no "remote" blocks
 > in
 > > the minicluster. So all the scan ranges of a file are added to
 > the
 > > appropriate local disk queue once the header is processed.
 >
 > This came up in a conversation between me and Joe today as well.
 > Replication in HDFS is per file, so we should be able to "hdfs put"
 > with appropriate options to induce a remote block, even in the
 > minicluster. Unfortunately, it doesn't seem to work with the
 > following sequence:
 >
 > $ impala-shell.sh -q 'create table t (x string)'
 > $ yes | head > /tmp/f
 > $ hadoop fs -D dfs.replication=1 -put /tmp/f /test-warehouse/t
 > $ impala-shell.sh -i localhost:21002 -q 'set num_nodes=1;
 > invalidate metadata t; select * from t limit 2; profile' | grep -i
 > BytesReadShortCircuit
 >
 > Impala seems to be doing short-circuit-read on all the impalad's
 > (presumably because the datanode somewhat reasonably decides things
 > are indeed local).
 >
 > Anyway--this surprised me so I figured I'd mention it.

The scheduler treats all backend *hosts* the same. In particular all reads on 
the minicluster will be assigned to "localhost". Then we have some special 
handling for backend hosts that have multiple impalads running: we assign scan 
ranges round-robin without considering the actual size of each scan range 
(scheduler.cc:890). This is only used during testing and not supported on 
production deployments so we don't try to be very sophisticated.

On second thought I hoped we might be able to provoke a remote read if we add 
multiple files that only reside on the first data node and then scan all of 
them. I tried this and it didn't work. The HDFS file browser shows "localhost" 
as the location of each file, making me think that it does not make a 
distinction between each datanode and instead figures out how to perform a 
short circuit read directly.

The scheduler itself makes the right assignments:

I1029 21:19:09.685812 24983 scheduler.cc:995] ScanRangeAssignment: 
server=TNetworkAddress {
  01: hostname (string) = "lv-desktop",
  02: port (i32) = 22000,
}
I1029 21:19:09.685825 24983 scheduler.cc:1001] node_id=0 
ranges=TScanRangeParams {
  01: scan_range (struct) = TScanRange {
01: hdfs_file_split (struct) = THdfsFileSplit {
  01: file_name (string) = "f3",
  02: offset (i64) = 0,
  03: length (i64) = 20,
  04: partition_id (i64) = 0,
  05: file_length (i64) = 20,
  06: file_compression (i32) = 0,
  07: mtime (i64) = 1540872321692,
},
  },
  02: volume_id (i32) = 2,
  03: is_cached (bool) = false,
  04: is_remote (bool) = false,
}
I1029 21:19:09.685854 24983 scheduler.cc:995] ScanRangeAssignment: 
server=TNetworkAddress {
  01: hostname (string) = "lv-desktop",
  02: port (i32) = 22002,
}
I1029 21:19:09.685863 24983 scheduler.cc:1001] node_id=0 
ranges=TScanRangeParams {
  01: scan_range (struct) = TScanRange {
01: hdfs_file_split (struct) = THdfsFileSplit {
  01: file_name (string) = "f2",
  02: offset (i64) = 0,
  03: length (i64) = 20,
  04: partition_id (i64) = 0,
  05: file_length (i64) = 20,
  06: file_compression (i32) = 0,
  07: mtime (i64) = 1540872318716,
},
  },
  02: volume_id (i32) = 1,
  03: is_cached (bool) = false,
  04: is_remote (bool) = false,
}
I1029 21:19:09.685868 24983 scheduler.cc:995] ScanRangeAssignment: 
server=TNetworkAddress {
  01: hostname (string) = "lv-desktop",
  02: port (i32) = 22001,
}
I1029 21:19:09.685875 24983 scheduler.cc:1001] node_id=0 
ranges=TScanRangeParams {
  01: scan_range (struct) = TScanRange {
01: hdfs_file_split (struct) = THdfsFileSplit {
  01: file_name (string) = "f",
  02: offset (i64) = 0,
  03: length (i64) = 20,
  04: partition_id (i64) = 0,
  05: file_length (i64) = 20,
  06: file_compression (i32) = 0,
  07: mtime (i64) = 1540872285223,
},
  },
  02: volume_id (i32) = 0,
  03: is_cached (bool) = false,
  04: is_remote (bool) = false,
}


--
To view, visit http://gerrit.cloudera.org:8080/11517
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I211e2511ea3bb5edea29f1bd63e6b1fa4c4b1965
Gerrit-Change-Number: 11517
Gerrit-PatchSet: 3
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 30 Oct 2018 04:22:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-6932: Speed up scans for sequence datasets with many files

2018-10-29 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11517 )

Change subject: [WIP] IMPALA-6932: Speed up scans for sequence datasets with 
many files
..


Patch Set 3:

> This can't be tested on hdfs since there are no "remote" blocks in
 > the minicluster. So all the scan ranges of a file are added to the
 > appropriate local disk queue once the header is processed.

This came up in a conversation between me and Joe today as well. Replication in 
HDFS is per file, so we should be able to "hdfs put" with appropriate options 
to induce a remote block, even in the minicluster. Unfortunately, it doesn't 
seem to work with the following sequence:

$ impala-shell.sh -q 'create table t (x string)'
$ yes | head > /tmp/f
$ hadoop fs -D dfs.replication=1 -put /tmp/f /test-warehouse/t
$ impala-shell.sh -i localhost:21002 -q 'set num_nodes=1; invalidate metadata 
t; select * from t limit 2; profile' | grep -i BytesReadShortCircuit

Impala seems to be doing short-circuit-read on all the impalad's (presumably 
because the datanode somewhat reasonably decides things are indeed local).

Anyway--this surprised me so I figured I'd mention it.


--
To view, visit http://gerrit.cloudera.org:8080/11517
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I211e2511ea3bb5edea29f1bd63e6b1fa4c4b1965
Gerrit-Change-Number: 11517
Gerrit-PatchSet: 3
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 30 Oct 2018 03:36:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7735: Expose queuing status in ExecSummary and impala-shell

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11816 )

Change subject: IMPALA-7735: Expose queuing status in ExecSummary and 
impala-shell
..

IMPALA-7735: Expose queuing status in ExecSummary and impala-shell

This patch adds the queuing status, that is, whether the query was
queued and what was the latest queuing reason, to the ExecSummary.
Also added changes to allow impala-shell to expose this status by
pulling it out from the ExecSummary when either live_summary or
live_progress is set to true.

Testing:
Added custom cluster tests.

Change-Id: Ibde447b01559b9f0f3e970d4fa10f6ee4064bd49
Reviewed-on: http://gerrit.cloudera.org:8080/11816
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/service/impala-server.cc
M common/thrift/ExecStats.thrift
M shell/impala_shell.py
M tests/custom_cluster/test_admission_controller.py
A tests/custom_cluster/test_shell_interactive.py
5 files changed, 102 insertions(+), 24 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/11816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ibde447b01559b9f0f3e970d4fa10f6ee4064bd49
Gerrit-Change-Number: 11816
Gerrit-PatchSet: 4
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7735: Expose queuing status in ExecSummary and impala-shell

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11816 )

Change subject: IMPALA-7735: Expose queuing status in ExecSummary and 
impala-shell
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/11816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibde447b01559b9f0f3e970d4fa10f6ee4064bd49
Gerrit-Change-Number: 11816
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 30 Oct 2018 02:02:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7166: ExecSummary should be a first class object.

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11555 )

Change subject: IMPALA-7166: ExecSummary should be a first class object.
..

IMPALA-7166: ExecSummary should be a first class object.

Impala RuntimeProfile currently contains "ExecSummary" as a string. We should 
make it a
first class thrift object, so that tools can extract these fields (Est rows 
etc..).

Testing:
Modified unit test.

Change-Id: I4791237a5579f16c9efda8e57876d48980739e13
Reviewed-on: http://gerrit.cloudera.org:8080/11555
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
---
M be/src/service/impala-server.cc
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/RuntimeProfile.thrift
5 files changed, 48 insertions(+), 3 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/11555
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I4791237a5579f16c9efda8e57876d48980739e13
Gerrit-Change-Number: 11555
Gerrit-PatchSet: 5
Gerrit-Owner: Yongjun Zhang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Reviewer: Yongjun Zhang 


[Impala-ASF-CR] IMPALA-7166: ExecSummary should be a first class object.

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11555 )

Change subject: IMPALA-7166: ExecSummary should be a first class object.
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/11555
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4791237a5579f16c9efda8e57876d48980739e13
Gerrit-Change-Number: 11555
Gerrit-PatchSet: 4
Gerrit-Owner: Yongjun Zhang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Reviewer: Yongjun Zhang 
Gerrit-Comment-Date: Tue, 30 Oct 2018 01:11:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1209/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 8
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 30 Oct 2018 00:45:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7687: [DOCS] Support for multiple DISTINCT in a query

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11823 )

Change subject: IMPALA-7687: [DOCS] Support for multiple DISTINCT in a query
..


Patch Set 1:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/125/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/11823
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a6e664b016e9408a3ff809f1811253a91764481
Gerrit-Change-Number: 11823
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 30 Oct 2018 00:38:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7687: [DOCS] Support for multiple DISTINCT in a query

2018-10-29 Thread Alex Rodoni (Code Review)
Alex Rodoni has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11823


Change subject: IMPALA-7687: [DOCS] Support for multiple DISTINCT in a query
..

IMPALA-7687: [DOCS] Support for multiple DISTINCT in a query

- Removed notes about the single DISTINCT restriction.
- Rewrote the description for the APPX_COUNT_DISTINCT query option.

Change-Id: I3a6e664b016e9408a3ff809f1811253a91764481
---
M docs/shared/impala_common.xml
M docs/topics/impala_appx_count_distinct.xml
M docs/topics/impala_count.xml
M docs/topics/impala_distinct.xml
M docs/topics/impala_langref_unsupported.xml
M docs/topics/impala_select.xml
6 files changed, 33 insertions(+), 111 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/11823/1
--
To view, visit http://gerrit.cloudera.org:8080/11823
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3a6e664b016e9408a3ff809f1811253a91764481
Gerrit-Change-Number: 11823
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 


[Impala-ASF-CR] IMPALA-7727: Fix TStatusCode to TErrorCode mapping

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11778 )

Change subject: IMPALA-7727: Fix TStatusCode to TErrorCode mapping
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1208/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie62527734aa73c1524c731773638590bdac9e789
Gerrit-Change-Number: 11778
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 30 Oct 2018 00:24:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7727: Fix TStatusCode to TErrorCode mapping

2018-10-29 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11778 )

Change subject: IMPALA-7727: Fix TStatusCode to TErrorCode mapping
..


Patch Set 4:

OK. Will do a pass tonight.


--
To view, visit http://gerrit.cloudera.org:8080/11778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie62527734aa73c1524c731773638590bdac9e789
Gerrit-Change-Number: 11778
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 30 Oct 2018 00:23:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Philip Zeyliger, Tim Armstrong, Csaba Ringhofer, 
Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11760

to look at the new patch set (#8).

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..

IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

See IMPALA-7655 for backgound. Tim found that the current interpreted
forms of if, isnull and coalesce are slow compared to the
code-generated CASE statement.

This patch rewrites the above functions into the equivalent CASE
structure.

The rewrite engine has many bugs that are beyond the scope of
this change to fix. This change codes around those bugs. The
result is that the conditional rewrite happens some of the time,
and sometimes produces less-than perfect optimizations.

Conditionals in the top-level ORDER BY clause are not rewritten
(IMPALA-7753), but those one or more levels down are. Some expressions
involving NULL are not simplified (IMPALA-7769). Several hacks were
used to work around the fact that the rewrite engine ignores unanalyzed
expressions, yet the rewrite engine does not, in general, re-analyze
the expressions it produces, causing simplifications to be skipped
(IMPALA-7754). And so on.

As a result, the BE retains the original interpreted forms that
are still used in two cases: 1) top-level conditions in the
ORDER BY clause, and 2) if the user disables rewrites.

Further, code generation does not occur for CASE statements in the
SELECT clause when it is in the root fragment (the most common case
in simple tests.) This is another known bug (IMPALA-4356).

One possible performance regression is that the new form of the code
evaluates some expressions twice, where the original interpreted
code evaluated the argument once. E.g. coalesce(id, 10) is rewritten
to CASE WHEN id IS NULL THEN id ELSE 10 END. Here, id is evaluated
twice. If the "id" were replaced by a complex sub-expression, the
gain from compilation could be offset by doing work twice.
(IMPALA-7737)

Still, the fix provides most of what the JIRA ticket requested
within the limitations of the existing code.

Conditional function rewrites are moved into a new class,
RewriteConditionalsRule in order to keep things simple.

Most functions use the simplest possible rewrite, relying on the
existing rewrite rules for further simplification.  The one exception
is coalesce(): the existing code relies on the semantics of the
function and so was retained and slightly improved. The code was
extended to produce a CASE statement directly, retaining existing
simplifications.

Tests for conditional functions were in one large function along with
other rewrite tests. Moved them into a new file, then broke up the
tests by function to allow much easier debugging of each function
one-by-one.  This required moving the common test mechanims into a
new common base class.

Existing tests focus on one or two rules at a time. The conditional
function rewrite, however, relies on the entire set of rules being
applied repeatedly. So, added a new FullRewriteTest case to verify this
behavior. This class contains several commented-out tests that cannot
pass due to existing rewrite bugs noted above.

Changing the rewrite cause the PlannerTest to produce different plans
than previously. Changed the expected results file to match the new
rewrite rules.

Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
---
M be/src/exprs/conditional-functions.h
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
A fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
A fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java
A fe/src/test/java/org/apache/impala/analysis/RewriteConditionalFnsRuleTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test
17 files changed, 870 insertions(+), 313 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/11760/8
--
To view, visit http://gerrit.cloudera.org:8080/11760

[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11822 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1207/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idfe275032dc13aee64bfd60d448138b5b77ab96a
Gerrit-Change-Number: 11822
Gerrit-PatchSet: 1
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 30 Oct 2018 00:19:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Paul Rogers (Code Review)
Paul Rogers has abandoned this change. ( http://gerrit.cloudera.org:8080/11822 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Abandoned

Duplicate
--
To view, visit http://gerrit.cloudera.org:8080/11822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: Idfe275032dc13aee64bfd60d448138b5b77ab96a
Gerrit-Change-Number: 11822
Gerrit-PatchSet: 1
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] [WIP] IMPALA-6932: Speed up scans for sequence datasets with many files

2018-10-29 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11517 )

Change subject: [WIP] IMPALA-6932: Speed up scans for sequence datasets with 
many files
..


Patch Set 3:

This can't be tested on hdfs since there are no "remote" blocks in the 
minicluster. So all the scan ranges of a file are added to the appropriate 
local disk queue once the header is processed.

> Patch Set 3: -Code-Review
>
> Adding a targeted test that uses some profile counters seems like a great 
> idea. Maybe it's possible to do this already on HDFS with the right query and 
> options. E.g. run a query with limit=1 with num_nodes=1 and either 
> num_scanner_threads=1 or mt_dop=1 and confirm that only one file is opened 
> from the profile.


--
To view, visit http://gerrit.cloudera.org:8080/11517
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I211e2511ea3bb5edea29f1bd63e6b1fa4c4b1965
Gerrit-Change-Number: 11517
Gerrit-PatchSet: 3
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 30 Oct 2018 00:09:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6374: fix handling of commas in .test files

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11800 )

Change subject: IMPALA-6374: fix handling of commas in .test files
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1206/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11800
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18ddcb0440490ddf8184be66d3681038a1615dd9
Gerrit-Change-Number: 11800
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 30 Oct 2018 00:02:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7727: Fix TStatusCode to TErrorCode mapping

2018-10-29 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11778 )

Change subject: IMPALA-7727: Fix TStatusCode to TErrorCode mapping
..


Patch Set 4: Code-Review+1

(2 comments)

Thanks, Tim for the quick reviews. Carrying +1.

http://gerrit.cloudera.org:8080/#/c/11778/3/be/src/util/error-util.cc
File be/src/util/error-util.cc:

http://gerrit.cloudera.org:8080/#/c/11778/3/be/src/util/error-util.cc@231
PS3, Line 231:   DCHECK(false) << "Unexpected hs2Code: " << hs2Code;
> clang-tidy failed because this didn't return a value.
Fixed it. Missed in the previous rev.


http://gerrit.cloudera.org:8080/#/c/11778/2/testdata/workloads/functional-query/queries/QueryTest/compute-stats.test
File testdata/workloads/functional-query/queries/QueryTest/compute-stats.test:

http://gerrit.cloudera.org:8080/#/c/11778/2/testdata/workloads/functional-query/queries/QueryTest/compute-stats.test@2203
PS2, Line 2203: set mem_limit=1m;
> Ah interesting. Maybe consider setting a lower mem_limit just to be sure th
Done



--
To view, visit http://gerrit.cloudera.org:8080/11778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie62527734aa73c1524c731773638590bdac9e789
Gerrit-Change-Number: 11778
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 23:48:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 7:

(14 comments)

Thanks, Phil, for the code review. Addressed all but one comment.

http://gerrit.cloudera.org:8080/#/c/11760/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11760/7//COMMIT_MSG@29
PS7, Line 29: As a result, the BE retains the original interpreted forms that
> Let's mention this in the backend code in comments, so that folks know it's
Done


http://gerrit.cloudera.org:8080/#/c/11760/7//COMMIT_MSG@30
PS7, Line 30: are still used in two cases: 1) top-leel conditions in the
> nit: level
Done


http://gerrit.cloudera.org:8080/#/c/11760/7//COMMIT_MSG@45
PS7, Line 45: Still, the fix provides most of what the JIRA ticket requested
> Does the specific "compute stats" query that's underlying the JIRA get fast
Haven't gotten that far yet; still trying to make sure that things actually 
work. Will provide an update once I can do the required performance testing.


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
File fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@54
PS7, Line 54:
> nit: it's weird that there are two spaces there.
Done


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@66
PS7, Line 66: // then want to allow CASE to do any simplification 
(reverting to
> nit: The parenthetical here isn't closed.
Done


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@70
PS7, Line 70:   Expr revised =  rewriteConditionalFn((FunctionCallExpr) 
expr);
> nit: should only have one space after =
Done


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@72
PS7, Line 72:   // Workaround for IMALA-TBD
> Is TBD resolved at this point?
Done


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@132
PS7, Line 132:* IFNULL(a, x) --> 
> If we're not using HTML, is the  here noise?
Done


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@179
PS7, Line 179:   if (childExpr.isAggregate()) { sawAgg = true; }
> The javadoc for "expr.contains()" suggests that hasAgg has already checked
True.

This check adjusts the non-null literal case. Ignoring aggregates, we can stop 
adding arguments (ignoring all remaining arguments) if we hit a non-null 
literal. (This is the ! hasAgg case.)

Or, if we have aggregates, we can stop if we've already added at least one 
aggregate. Otherwise, we have to keep going to add at least one aggregate -- 
even though the aggregate will never actually be evaluated. (This is the sawAgg 
case.)

But, that I had to explain this means that using dual flags is the wrong 
approach. Rewrote to use a state machine. Please let me know if that is clearer.


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@50
PS7, Line 50:  * We can't eliminate aggregates as this may change the meaning 
of the
> This is the same as line 70-72. Is that intentional?
Done


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java
File fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java:

http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java@42
PS7, Line 42: System.out.println(stmt);
> remove
Done


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java@97
PS7, Line 97:   @Test
:   public void testSqlRecovery() {
: ParseNode node = analyzeExpr("true and false");
: System.out.println(node.toSql());
: System.out.println(((SelectStmt) node).toSql(true));
: // SELECT TRUE AND FALSE FROM functional.alltypessmall
:   }
> There's no assertion here...
Ad-hoc test. Removed.


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java@117
PS7, Line 117: // TODO: IMPALA-7766
> nit: including the subject of the JIRA here may make it obvious why...
Done



[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Paul Rogers (Code Review)
Paul Rogers has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11822


Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..

IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

See IMPALA-7655 for backgound. Tim found that the current interpreted
forms of if, isnull and coalesce are slow compared to the
code-generated CASE statement.

This patch rewrites the above functions into the equivalent CASE
structure.

The rewrite engine has many bugs that are beyond the scope of
this change to fix. This change codes around those bugs. The
result is that the conditional rewrite happens some of the time,
and sometimes produces less-than perfect optimizations.

Conditionals in the top-level ORDER BY clause are not rewritten
(IMPALA-7753), but those one or more levels down are. Some expressions
involving NULL are not simplified (IMPALA-7769). Several hacks were
used to work around the fact that the rewrite engine ignores unanalyzed
expressions, yet the rewrite engine does not, in general, re-analyze
the expressions it produces, causing simplifications to be skipped
(IMPALA-7754). And so on.

As a result, the BE retains the original interpreted forms that
are still used in two cases: 1) top-level conditions in the
ORDER BY clause, and 2) if the user disables rewrites.

Further, code generation does not occur for CASE statements in the
SELECT clause when it is in the root fragment (the most common case
in simple tests.) This is another known bug (IMPALA-4356).

One possible performance regression is that the new form of the code
evaluates some expressions twice, where the original interpreted
code evaluated the argument once. E.g. coalesce(id, 10) is rewritten
to CASE WHEN id IS NULL THEN id ELSE 10 END. Here, id is evaluated
twice. If the "id" were replaced by a complex sub-expression, the
gain from compilation could be offset by doing work twice.
(IMPALA-7737)

Still, the fix provides most of what the JIRA ticket requested
within the limitations of the existing code.

Conditional function rewrites are moved into a new class,
RewriteConditionalsRule in order to keep things simple.

Most functions use the simplest possible rewrite, relying on the
existing rewrite rules for further simplification.  The one exception
is coalesce(): the existing code relies on the semantics of the
function and so was retained and slightly improved. The code was
extended to produce a CASE statement directly, retaining existing
simplifications.

Tests for conditional functions were in one large function along with
other rewrite tests. Moved them into a new file, then broke up the
tests by function to allow much easier debugging of each function
one-by-one.  This required moving the common test mechanims into a
new common base class.

Existing tests focus on one or two rules at a time. The conditional
function rewrite, however, relies on the entire set of rules being
applied repeatedly. So, added a new FullRewriteTest case to verify this
behavior. This class contains several commented-out tests that cannot
pass due to existing rewrite bugs noted above.

Changing the rewrite cause the PlannerTest to produce different plans
than previously. Changed the expected results file to match the new
rewrite rules.

Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163

Fixes

Change-Id: Idfe275032dc13aee64bfd60d448138b5b77ab96a
---
M be/src/exprs/conditional-functions.h
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
A fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
A fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java
A fe/src/test/java/org/apache/impala/analysis/RewriteConditionalFnsRuleTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test
17 files changed, 870 insertions(+), 313 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/11822/1
--
To view, visit http://gerrit.cloudera.org:8080/11822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF

[Impala-ASF-CR] IMPALA-6374: fix handling of commas in .test files

2018-10-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11800 )

Change subject: IMPALA-6374: fix handling of commas in .test files
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11800/3/tests/common/test_result_verifier.py
File tests/common/test_result_verifier.py:

http://gerrit.cloudera.org:8080/#/c/11800/3/tests/common/test_result_verifier.py@117
PS3, Line 117: i
> You must mean "i + 1" here, yes?
Oops. Fixed but I'll rerun tests in case more test files need to be updated as 
a result of this.



--
To view, visit http://gerrit.cloudera.org:8080/11800
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18ddcb0440490ddf8184be66d3681038a1615dd9
Gerrit-Change-Number: 11800
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 23:27:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6374: fix handling of commas in .test files

2018-10-29 Thread Tim Armstrong (Code Review)
Hello Michael Brown, David Knupp, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11800

to look at the new patch set (#4).

Change subject: IMPALA-6374: fix handling of commas in .test files
..

IMPALA-6374: fix handling of commas in .test files

The .test file parser implemented an unconventional method for parsing
single-quoted strings in comma-separated value format. This didn't handle
trailing commas in the string correctly.

This commit switches to using a conventional method for parsing
comma-separated value format:
* Commas enclosed by single quotes are not treated as field separators
* Single quotes can be escaped within a string by doubling them.

I looked into using Python's .csv module for this, but it wouldn't
work without modifying the test file format more because it
automatically discards the quotes during parsing, which are actually
semantically important in .test files. E.g. without the quotes we can't
distinguish between the literal string 'regex:...' and the regex
regex:

Testing:
Ran exhaustive tests and fixed .test files that required modifications.
Will rerun before merging.

Added a couple of tests to exercise edge cases in the test file parser.

Change-Id: I18ddcb0440490ddf8184be66d3681038a1615dd9
---
M testdata/workloads/functional-query/queries/QueryTest/functions-ddl.test
M testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test
M testdata/workloads/functional-query/queries/QueryTest/hbase-inserts.test
M testdata/workloads/functional-query/queries/QueryTest/misc.test
A testdata/workloads/functional-query/queries/QueryTest/special-strings.test
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
M testdata/workloads/tpcds/queries/tpcds-decimal_v2-q98.test
M testdata/workloads/tpcds/queries/tpcds-q98.test
M tests/common/test_result_verifier.py
M tests/query_test/test_exprs.py
10 files changed, 157 insertions(+), 112 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/11800/4
--
To view, visit http://gerrit.cloudera.org:8080/11800
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I18ddcb0440490ddf8184be66d3681038a1615dd9
Gerrit-Change-Number: 11800
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-6374: fix handling of commas in .test files

2018-10-29 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11800 )

Change subject: IMPALA-6374: fix handling of commas in .test files
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11800/3/tests/common/test_result_verifier.py
File tests/common/test_result_verifier.py:

http://gerrit.cloudera.org:8080/#/c/11800/3/tests/common/test_result_verifier.py@117
PS3, Line 117: i
You must mean "i + 1" here, yes?



--
To view, visit http://gerrit.cloudera.org:8080/11800
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18ddcb0440490ddf8184be66d3681038a1615dd9
Gerrit-Change-Number: 11800
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Mon, 29 Oct 2018 22:54:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] CDH-73537: Skip test default timezone when testing a real cluster.

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11820 )

Change subject: CDH-73537: Skip test_default_timezone when testing a real 
cluster.
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1205/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11820
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4d4c503d2c77136cedd8f3fd830b6ce70d4457f
Gerrit-Change-Number: 11820
Gerrit-PatchSet: 2
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Mon, 29 Oct 2018 22:53:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] CDH-73537: Skip test default timezone when testing a real cluster.

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11820 )

Change subject: CDH-73537: Skip test_default_timezone when testing a real 
cluster.
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1204/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11820
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4d4c503d2c77136cedd8f3fd830b6ce70d4457f
Gerrit-Change-Number: 11820
Gerrit-PatchSet: 1
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Mon, 29 Oct 2018 22:50:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] CDH-73537: Skip test default timezone when testing a real cluster.

2018-10-29 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11820 )

Change subject: CDH-73537: Skip test_default_timezone when testing a real 
cluster.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11820/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11820/2//COMMIT_MSG@7
PS2, Line 7: CDH-73537
We need an upstream Jira here, not one for Cloudera.



--
To view, visit http://gerrit.cloudera.org:8080/11820
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4d4c503d2c77136cedd8f3fd830b6ce70d4457f
Gerrit-Change-Number: 11820
Gerrit-PatchSet: 2
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Mon, 29 Oct 2018 22:30:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7761: Add multiple DISTINCT to targeted stress and perf

2018-10-29 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11805 )

Change subject: IMPALA-7761: Add multiple DISTINCT to targeted stress and perf
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11805/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11805/1//COMMIT_MSG@16
PS1, Line 16: Testing:
: - Ran the test file locally.
The queries are fine, but what does it mean to run these locally? I'm not too 
familiar with these workloads (targeted-perf, targeted-stress). Do the queries 
get run in GVO? Is it expected downstream larger-scale testing will run these? 
The stress test (concurrent_select.py) won't find these queries, for example.



--
To view, visit http://gerrit.cloudera.org:8080/11805
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I400aaf6b6620b4001895eafff785956bffb312c9
Gerrit-Change-Number: 11805
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 22:29:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7735: Expose queuing status in ExecSummary and impala-shell

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11816 )

Change subject: IMPALA-7735: Expose queuing status in ExecSummary and 
impala-shell
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1203/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibde447b01559b9f0f3e970d4fa10f6ee4064bd49
Gerrit-Change-Number: 11816
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 22:29:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7614: [DOCS] Document the New Invalidate Options

2018-10-29 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11809 )

Change subject: IMPALA-7614: [DOCS] Document the New Invalidate Options
..


Patch Set 1:

Tiany and Adrian,
Could you please review for the coming release?
Let me know if I should add any other reviewers.
Thanks!


--
To view, visit http://gerrit.cloudera.org:8080/11809
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I40c552eeaee81ee6528d9f725bd416b51d8ab837
Gerrit-Change-Number: 11809
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Adrian Ng (389)
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Mon, 29 Oct 2018 22:12:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] CDH-73537: Skip test default timezone when testing a real cluster.

2018-10-29 Thread David Knupp (Code Review)
Hello Michael Brown, Attila Jeges, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11820

to look at the new patch set (#2).

Change subject: CDH-73537: Skip test_default_timezone when testing a real 
cluster.
..

CDH-73537: Skip test_default_timezone when testing a real cluster.

test_shell_commandline.py::test_default_timezone assumes that the
cluster is running on the same platform as the test process, but
that's only guaranteed when the testing a local minicluster. When
run against a real cluster, the test executor can be a completely
different OS.

Change-Id: Ia4d4c503d2c77136cedd8f3fd830b6ce70d4457f
---
M tests/shell/test_shell_commandline.py
1 file changed, 6 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/11820/2
--
To view, visit http://gerrit.cloudera.org:8080/11820
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia4d4c503d2c77136cedd8f3fd830b6ce70d4457f
Gerrit-Change-Number: 11820
Gerrit-PatchSet: 2
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] [WIP] IMPALA-6932: Speed up scans for sequence datasets with many files

2018-10-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11517 )

Change subject: [WIP] IMPALA-6932: Speed up scans for sequence datasets with 
many files
..


Patch Set 3: -Code-Review

Adding a targeted test that uses some profile counters seems like a great idea. 
Maybe it's possible to do this already on HDFS with the right query and 
options. E.g. run a query with limit=1 with num_nodes=1 and either 
num_scanner_threads=1 or mt_dop=1 and confirm that only one file is opened from 
the profile.


--
To view, visit http://gerrit.cloudera.org:8080/11517
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I211e2511ea3bb5edea29f1bd63e6b1fa4c4b1965
Gerrit-Change-Number: 11517
Gerrit-PatchSet: 3
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 22:08:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] CDH-73537: Skip test default timezone when testing a real cluster.

2018-10-29 Thread David Knupp (Code Review)
David Knupp has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11820


Change subject: CDH-73537: Skip test_default_timezone when testing a real 
cluster.
..

CDH-73537: Skip test_default_timezone when testing a real cluster.

test_shell_commandline.py::test_default_timezone assumes that the
cluster is running on the same platform as the test process, but
that's only guaranteed when the testing a local minicluster. When
run against a real cluster, the test executor can be a completely
different OS.

Change-Id: Ia4d4c503d2c77136cedd8f3fd830b6ce70d4457f
---
M tests/shell/test_shell_commandline.py
1 file changed, 6 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/11820/1
--
To view, visit http://gerrit.cloudera.org:8080/11820
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia4d4c503d2c77136cedd8f3fd830b6ce70d4457f
Gerrit-Change-Number: 11820
Gerrit-PatchSet: 1
Gerrit-Owner: David Knupp 


[Impala-ASF-CR] [WIP] IMPALA-6932: Speed up scans for sequence datasets with many files

2018-10-29 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11517 )

Change subject: [WIP] IMPALA-6932: Speed up scans for sequence datasets with 
many files
..


Patch Set 3:

I haven't yet tested this on a remote cluster. The plan is to test this by 
adding an s3 test which queries a table with multiple small files and to ensure 
that the number of files opened should not exceed 2. Do you think that may not 
be necessary?


--
To view, visit http://gerrit.cloudera.org:8080/11517
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I211e2511ea3bb5edea29f1bd63e6b1fa4c4b1965
Gerrit-Change-Number: 11517
Gerrit-PatchSet: 3
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 22:00:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7735: Expose queuing status in ExecSummary and impala-shell

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11816 )

Change subject: IMPALA-7735: Expose queuing status in ExecSummary and 
impala-shell
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibde447b01559b9f0f3e970d4fa10f6ee4064bd49
Gerrit-Change-Number: 11816
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:59:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7735: Expose queuing status in ExecSummary and impala-shell

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11816 )

Change subject: IMPALA-7735: Expose queuing status in ExecSummary and 
impala-shell
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3376/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/11816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibde447b01559b9f0f3e970d4fa10f6ee4064bd49
Gerrit-Change-Number: 11816
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:59:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7735: Expose queuing status in ExecSummary and impala-shell

2018-10-29 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11816 )

Change subject: IMPALA-7735: Expose queuing status in ExecSummary and 
impala-shell
..


Patch Set 2: Code-Review+2

carrying over +2


--
To view, visit http://gerrit.cloudera.org:8080/11816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibde447b01559b9f0f3e970d4fa10f6ee4064bd49
Gerrit-Change-Number: 11816
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:58:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7735: Expose queuing status in ExecSummary and impala-shell

2018-10-29 Thread Bikramjeet Vig (Code Review)
Hello Pooja Nilangekar, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11816

to look at the new patch set (#2).

Change subject: IMPALA-7735: Expose queuing status in ExecSummary and 
impala-shell
..

IMPALA-7735: Expose queuing status in ExecSummary and impala-shell

This patch adds the queuing status, that is, whether the query was
queued and what was the latest queuing reason, to the ExecSummary.
Also added changes to allow impala-shell to expose this status by
pulling it out from the ExecSummary when either live_summary or
live_progress is set to true.

Testing:
Added custom cluster tests.

Change-Id: Ibde447b01559b9f0f3e970d4fa10f6ee4064bd49
---
M be/src/service/impala-server.cc
M common/thrift/ExecStats.thrift
M shell/impala_shell.py
M tests/custom_cluster/test_admission_controller.py
A tests/custom_cluster/test_shell_interactive.py
5 files changed, 102 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/11816/2
--
To view, visit http://gerrit.cloudera.org:8080/11816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibde447b01559b9f0f3e970d4fa10f6ee4064bd49
Gerrit-Change-Number: 11816
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7735: Expose queuing status in ExecSummary and impala-shell

2018-10-29 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11816 )

Change subject: IMPALA-7735: Expose queuing status in ExecSummary and 
impala-shell
..


Patch Set 1: Code-Review+2

(6 comments)

http://gerrit.cloudera.org:8080/#/c/11816/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11816/1//COMMIT_MSG@9
PS1, Line 9: This patch adds the queuing status, that is, whether the query was
> Can you mention that this is showed when either live_summary or live_progre
yup, thats already mentioned at the end of this paragraph


http://gerrit.cloudera.org:8080/#/c/11816/1/common/thrift/ExecStats.thrift
File common/thrift/ExecStats.thrift:

http://gerrit.cloudera.org:8080/#/c/11816/1/common/thrift/ExecStats.thrift@104
PS1, Line 104: queued
> "is currently queued by admission control" to be totally clear
Done


http://gerrit.cloudera.org:8080/#/c/11816/1/common/thrift/ExecStats.thrift@107
PS1, Line 107:   // Contains the latest queuing reason if the query is queued.
> Same as above.
Done


http://gerrit.cloudera.org:8080/#/c/11816/1/tests/custom_cluster/test_shell_interactive.py
File tests/custom_cluster/test_shell_interactive.py:

http://gerrit.cloudera.org:8080/#/c/11816/1/tests/custom_cluster/test_shell_interactive.py@26
PS1, Line 26: class TestShellInteractive(CustomClusterTestSuite):
> flake8: E302 expected 2 blank lines, found 1
Done


http://gerrit.cloudera.org:8080/#/c/11816/1/tests/custom_cluster/test_shell_interactive.py@29
PS1, Line 29:
> flake8: E251 unexpected spaces around keyword / parameter equals
Done


http://gerrit.cloudera.org:8080/#/c/11816/1/tests/custom_cluster/test_shell_interactive.py@38
PS1, Line 38: proc = pexpect.spawn(' '.join([SHELL_CMD, "-i localhost:21000"]))
> I think it would help connecting to another port (i.e. another impalad) to
Good point! but we already have tests that rely on pool stats being propagated 
across the cluster



--
To view, visit http://gerrit.cloudera.org:8080/11816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibde447b01559b9f0f3e970d4fa10f6ee4064bd49
Gerrit-Change-Number: 11816
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:58:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1201/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 7
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:48:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11582 )

Change subject: IMPALA-6658: improve Parquet RLE for low bit widths
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1202/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
Gerrit-Change-Number: 11582
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:48:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 7:

(14 comments)

Thanks for the updates!

http://gerrit.cloudera.org:8080/#/c/11760/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11760/7//COMMIT_MSG@29
PS7, Line 29: As a result, the BE retains the original interpreted forms that
Let's mention this in the backend code in comments, so that folks know it's not 
far away from being dead code.


http://gerrit.cloudera.org:8080/#/c/11760/7//COMMIT_MSG@30
PS7, Line 30: are still used in two cases: 1) top-leel conditions in the
nit: level


http://gerrit.cloudera.org:8080/#/c/11760/7//COMMIT_MSG@45
PS7, Line 45: Still, the fix provides most of what the JIRA ticket requested
Does the specific "compute stats" query that's underlying the JIRA get faster 
in your testing?


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
File fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@54
PS7, Line 54:
nit: it's weird that there are two spaces there.


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@66
PS7, Line 66: // then want to allow CASE to do any simplification 
(reverting to
nit: The parenthetical here isn't closed.


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@70
PS7, Line 70:   Expr revised =  rewriteConditionalFn((FunctionCallExpr) 
expr);
nit: should only have one space after =


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@72
PS7, Line 72:   // Workaround for IMALA-TBD
Is TBD resolved at this point?


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@132
PS7, Line 132:* IFNULL(a, x) --> 
If we're not using HTML, is the  here noise?


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@179
PS7, Line 179:   if (childExpr.isAggregate()) { sawAgg = true; }
The javadoc for "expr.contains()" suggests that hasAgg has already checked all 
the children, so it shouldn't be possible that hasAgg = false but sawAgg = true.


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@50
PS7, Line 50:  * We can't eliminate aggregates as this may change the meaning 
of the
This is the same as line 70-72. Is that intentional?


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java
File fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java:

http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java@42
PS7, Line 42: System.out.println(stmt);
remove


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java@97
PS7, Line 97:   @Test
:   public void testSqlRecovery() {
: ParseNode node = analyzeExpr("true and false");
: System.out.println(node.toSql());
: System.out.println(((SelectStmt) node).toSql(true));
: // SELECT TRUE AND FALSE FROM functional.alltypessmall
:   }
There's no assertion here...


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java@117
PS7, Line 117: // TODO: IMPALA-7766
nit: including the subject of the JIRA here may make it obvious why...


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java@128
PS7, Line 128:   public void testIf() throws ImpalaException {
Should we be testing the boring case:

if(a, b, c) => case when a then b else c

for columns a, b, c. (As opposed to aggregations or constants.)

Update: I think this is covered in the next test file.



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 7
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 

[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1200/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 6
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:36:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7765: [DOCS] Document IMPALA MAX MEM ESTIMATE FOR ADMISSION option

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11804 )

Change subject: IMPALA-7765: [DOCS] Document 
IMPALA_MAX_MEM_ESTIMATE_FOR_ADMISSION option
..


Patch Set 3: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/124/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/11804
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibef89c98530c6974dc791666cc51c1ded52e7910
Gerrit-Change-Number: 11804
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:31:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7765: [DOCS] Document IMPALA MAX MEM ESTIMATE FOR ADMISSION option

2018-10-29 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11804 )

Change subject: IMPALA-7765: [DOCS] Document 
IMPALA_MAX_MEM_ESTIMATE_FOR_ADMISSION option
..


Patch Set 3:

One clarification per Tim's comment on the MEM_LIMIT option requirement.


--
To view, visit http://gerrit.cloudera.org:8080/11804
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibef89c98530c6974dc791666cc51c1ded52e7910
Gerrit-Change-Number: 11804
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:30:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7765: [DOCS] Document IMPALA MAX MEM ESTIMATE FOR ADMISSION option

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11804 )

Change subject: IMPALA-7765: [DOCS] Document 
IMPALA_MAX_MEM_ESTIMATE_FOR_ADMISSION option
..


Patch Set 3:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/124/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/11804
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibef89c98530c6974dc791666cc51c1ded52e7910
Gerrit-Change-Number: 11804
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:29:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7765: [DOCS] Document IMPALA MAX MEM ESTIMATE FOR ADMISSION option

2018-10-29 Thread Alex Rodoni (Code Review)
Hello Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11804

to look at the new patch set (#3).

Change subject: IMPALA-7765: [DOCS] Document 
IMPALA_MAX_MEM_ESTIMATE_FOR_ADMISSION option
..

IMPALA-7765: [DOCS] Document IMPALA_MAX_MEM_ESTIMATE_FOR_ADMISSION option

Change-Id: Ibef89c98530c6974dc791666cc51c1ded52e7910
---
M docs/impala.ditamap
A docs/topics/impala_max_mem_estimate_for_admission.xml
2 files changed, 90 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/11804/3
--
To view, visit http://gerrit.cloudera.org:8080/11804
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibef89c98530c6974dc791666cc51c1ded52e7910
Gerrit-Change-Number: 11804
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-7743: [DOCS] A new option to load incremental statistics from catalog

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11790 )

Change subject: IMPALA-7743: [DOCS] A new option to load incremental statistics 
from catalog
..


Patch Set 3: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/123/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/11790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8fd9b88138350406065df2f39a48043178759949
Gerrit-Change-Number: 11790
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Adrian Ng (389)
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:27:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6374: fix handling of commas in .test files

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11800 )

Change subject: IMPALA-6374: fix handling of commas in .test files
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1199/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11800
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18ddcb0440490ddf8184be66d3681038a1615dd9
Gerrit-Change-Number: 11800
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:27:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1198/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:19:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths

2018-10-29 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11582 )

Change subject: IMPALA-6658: improve Parquet RLE for low bit widths
..


Patch Set 2:

(5 comments)

Thanks Thomas and Csaba for reviews so far

http://gerrit.cloudera.org:8080/#/c/11582/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11582/2//COMMIT_MSG@20
PS2, Line 20: By default RleEncoder will now use run length encoding for runs of
: length 24 for single bit values, and of length 16 for 2 bit wide 
values.
: All other bit widths will use the existing length 8 runs.
> >This could be avoided by using smaller
I agree that this might be a better way, but I am happy with the simple 
improvement we get from this change.


http://gerrit.cloudera.org:8080/#/c/11582/3/be/src/util/rle-encoding.h
File be/src/util/rle-encoding.h:

http://gerrit.cloudera.org:8080/#/c/11582/3/be/src/util/rle-encoding.h@266
PS3, Line 266:  1 byte for the repeated value
> I prefer to phrase it as "never worse than 8". For alternating long repeate
Done


http://gerrit.cloudera.org:8080/#/c/11582/3/be/src/util/rle-encoding.h@267
PS3, Line 267:
> Maybe "better than 16" would be better.
Done


http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h
File be/src/util/rle-encoding.h:

http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h@203
PS2, Line 203: the previous run is flushed out
> Done
Done for real this time


http://gerrit.cloudera.org:8080/#/c/11582/3/be/src/util/rle-test.cc
File be/src/util/rle-test.cc:

http://gerrit.cloudera.org:8080/#/c/11582/3/be/src/util/rle-test.cc@216
PS3, Line 216: c
> Oops, I missed this one:
Done



--
To view, visit http://gerrit.cloudera.org:8080/11582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
Gerrit-Change-Number: 11582
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:14:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7761: Add multiple DISTINCT to targeted stress and perf

2018-10-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11805 )

Change subject: IMPALA-7761: Add multiple DISTINCT to targeted stress and perf
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/11805
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I400aaf6b6620b4001895eafff785956bffb312c9
Gerrit-Change-Number: 11805
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:12:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths

2018-10-29 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/11582 )

Change subject: IMPALA-6658: improve Parquet RLE for low bit widths
..

IMPALA-6658: improve Parquet RLE for low bit widths

RleEncoder buffers values in its own cache to detect run lengths that
can be efficiently encoded. When a run is detected it is written with an
indicator byte which encodes the length of the run. So an encoded
run always has an overhead of at least one byte. This means that for
single bit values, encoding 8 values as a run is inefficient.

Change RleEncoder to have the ability to use run lengths other than 8.
A new parameter to the constructor (min_run_length) allows test callers
(only) to set the minimum run length.

By default RleEncoder will now use run length encoding for runs of
length 16 for single bit values. All other bit widths will use the
existing length 8 runs.

Internally RleEncoder must buffer more values so that the longer runs
can be detected. The internal buffer “buffered_values_” is larger
and is now a circular buffer so that the first 8 bytes of the buffer can
be separately flushed to BitWriter.

Testing:

All end-to-end and unit tests pass

The unit test rle-test is enhanced to run all tests against RleEncoders
using all possible values of min_run_length. In Addition, rle-test is
refactored so that the Rle tests are in a class that inherits from
::testing::Test so that a SetUp() method can be used.
The Overflow test is enhanced to be more exhaustive (while still
completing in a second or two).

Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
---
M be/src/util/rle-encoding.h
M be/src/util/rle-test.cc
2 files changed, 500 insertions(+), 255 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11582/4
--
To view, visit http://gerrit.cloudera.org:8080/11582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
Gerrit-Change-Number: 11582
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-7166: ExecSummary should be a first class object.

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11555 )

Change subject: IMPALA-7166: ExecSummary should be a first class object.
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3375/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/11555
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4791237a5579f16c9efda8e57876d48980739e13
Gerrit-Change-Number: 11555
Gerrit-PatchSet: 4
Gerrit-Owner: Yongjun Zhang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Reviewer: Yongjun Zhang 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:06:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7166: ExecSummary should be a first class object.

2018-10-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11555 )

Change subject: IMPALA-7166: ExecSummary should be a first class object.
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11555
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4791237a5579f16c9efda8e57876d48980739e13
Gerrit-Change-Number: 11555
Gerrit-PatchSet: 3
Gerrit-Owner: Yongjun Zhang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Reviewer: Yongjun Zhang 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:06:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7166: ExecSummary should be a first class object.

2018-10-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11555 )

Change subject: IMPALA-7166: ExecSummary should be a first class object.
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11555
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4791237a5579f16c9efda8e57876d48980739e13
Gerrit-Change-Number: 11555
Gerrit-PatchSet: 4
Gerrit-Owner: Yongjun Zhang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Reviewer: Yongjun Zhang 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:06:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5050: Add support to read TIMESTAMP MILLIS and TIMESTAMP MICROS from Parquet

2018-10-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11057 )

Change subject: IMPALA-5050: Add support to read TIMESTAMP_MILLIS and 
TIMESTAMP_MICROS from Parquet
..


Patch Set 18:

Hi Csaba, should I look at this now or is Zoltan still completing his review?


--
To view, visit http://gerrit.cloudera.org:8080/11057
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c7c01fffa31b3d2ca3480adf6ff851137dadac3
Gerrit-Change-Number: 11057
Gerrit-PatchSet: 18
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:05:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7743: [DOCS] A new option to load incremental statistics from catalog

2018-10-29 Thread Greg Rahn (Code Review)
Greg Rahn has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11790 )

Change subject: IMPALA-7743: [DOCS] A new option to load incremental statistics 
from catalog
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11790/2/docs/shared/impala_common.xml
File docs/shared/impala_common.xml:

http://gerrit.cloudera.org:8080/#/c/11790/2/docs/shared/impala_common.xml@1425
PS2, Line 1425: 3.0 and lower, approximately
  : 400 bytes of metadata per column per partition are 
needed for caching.
  : Tables with a big number of partitions and many columns 
can add up to a
  : significant memory overhead as the metadata must be c
> Reworded.
LGTM



--
To view, visit http://gerrit.cloudera.org:8080/11790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8fd9b88138350406065df2f39a48043178759949
Gerrit-Change-Number: 11790
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Adrian Ng (389)
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:02:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Philip Zeyliger, Tim Armstrong, Csaba Ringhofer, 
Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11760

to look at the new patch set (#7).

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..

IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

See IMPALA-7655 for backgound. Tim found that the current interpreted
forms of if, isnull and coalesce are slow compared to the
code-generated CASE statement.

This patch rewrites the above functions into the equivalent CASE
structure.

The rewrite engine has many bugs that are beyond the scope of
this change to fix. This change codes around those bugs. The
result is that the conditional rewrite happens some of the time,
and sometimes produces less-than perfect optimizations.

Conditionals in the top-level ORDER BY clause are not rewritten
(IMPALA-7753), but those one or more levels down are. Some expressions
involving NULL are not simplified (IMPALA-7769). Several hacks were
used to work around the fact that the rewrite engine ignores unanalyzed
expressions, yet the rewrite engine does not, in general, re-analyze
the expressions it produces, causing simplifications to be skipped
(IMPALA-7754). And so on.

As a result, the BE retains the original interpreted forms that
are still used in two cases: 1) top-leel conditions in the
ORDER BY clause, and 2) if the user disables rewrites.

Further, code generation does not occur for CASE statements in the
SELECT clause when it is in the root fragment (the most common case
in simple tests.) This is another known bug (IMPALA-4356).

One possible performance regression is that the new form of the code
evaluates some expressions twice, where the original interpreted
code evaluated the argument once. E.g. coalesce(id, 10) is rewritten
to CASE WHEN id IS NULL THEN id ELSE 10 END. Here, id is evaluated
twice. If the "id" were replaced by a complex sub-expression, the
gain from compilation could be offset by doing work twice.
(IMPALA-7737)

Still, the fix provides most of what the JIRA ticket requested
within the limitations of the existing code.

Conditional function rewrites are moved into a new class,
RewriteConditionalsRule in order to keep things simple.

Most functions use the simplest possible rewrite, relying on the
existing rewrite rules for further simplification.  The one exception
is coalesce(): the existing code relies on the semantics of the
function and so was retained and slightly improved. The code was
extended to produce a CASE statement directly, retaining existing
simplifications.

Tests for conditional functions were in one large function along with
other rewrite tests. Moved them into a new file, then broke up the
tests by function to allow much easier debugging of each function
one-by-one.  This required moving the common test mechanims into a
new common base class.

Existing tests focus on one or two rules at a time. The conditional
function rewrite, however, relies on the entire set of rules being
applied repeatedly. So, added a new FullRewriteTest case to verify this
behavior. This class contains several commented-out tests that cannot
pass due to existing rewrite bugs noted above.

Changing the rewrite cause the PlannerTest to produce different plans
than previously. Changed the expected results file to match the new
rewrite rules.

Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
A fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
A fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java
A fe/src/test/java/org/apache/impala/analysis/RewriteConditionalFnsRuleTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test
16 files changed, 837 insertions(+), 309 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/11760/7
--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit 

[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11760/6/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java:

http://gerrit.cloudera.org:8080/#/c/11760/6/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@93
PS6, Line 93: return RewritesOkWhereExpr("functional.alltypessmall", 
exprStr, rule, expectedExprStr);
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 6
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 20:56:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Philip Zeyliger, Tim Armstrong, Csaba Ringhofer, 
Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11760

to look at the new patch set (#6).

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..

IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

See IMPALA-7655 for backgound. Tim found that the current interpreted
forms of if, isnull and coalesce are slow compared to the
code-generated CASE statement.

This patch rewrites the above functions into the equivalent CASE
structure.

The rewrite engine has many bugs that are beyond the scope of
this change to fix. This change codes around those bugs. The
result is that the conditional rewrite happens some of the time,
and sometimes produces less-than perfect optimizations.

Conditionals in the top-level ORDER BY clause are not rewritten
(IMPALA-7753), but those one or more levels down are. Some expressions
involving NULL are not simplified (IMPALA-7769). Several hacks were
used to work around the fact that the rewrite engine ignores unanalyzed
expressions, yet the rewrite engine does not, in general, re-analyze
the expressions it produces, causing simplifications to be skipped
(IMPALA-7754). And so on.

As a result, the BE retains the original interpreted forms that
are still used in two cases: 1) top-leel conditions in the
ORDER BY clause, and 2) if the user disables rewrites.

Further, code generation does not occur for CASE statements in the
SELECT clause when it is in the root fragment (the most common case
in simple tests.) This is another known bug (IMPALA-4356).

One possible performance regression is that the new form of the code
evaluates some expressions twice, where the original interpreted
code evaluated the argument once. E.g. coalesce(id, 10) is rewritten
to CASE WHEN id IS NULL THEN id ELSE 10 END. Here, id is evaluated
twice. If the "id" were replaced by a complex sub-expression, the
gain from compilation could be offset by doing work twice.
(IMPALA-7737)

Still, the fix provides most of what the JIRA ticket requested
within the limitations of the existing code.

Conditional function rewrites are moved into a new class,
RewriteConditionalsRule in order to keep things simple.

Most functions use the simplest possible rewrite, relying on the
existing rewrite rules for further simplification.  The one exception
is coalesce(): the existing code relies on the semantics of the
function and so was retained and slightly improved. The code was
extended to produce a CASE statement directly, retaining existing
simplifications.

Tests for conditional functions were in one large function along with
other rewrite tests. Moved them into a new file, then broke up the
tests by function to allow much easier debugging of each function
one-by-one.  This required moving the common test mechanims into a
new common base class.

Existing tests focus on one or two rules at a time. The conditional
function rewrite, however, relies on the entire set of rules being
applied repeatedly. So, added a new FullRewriteTest case to verify this
behavior. This class contains several commented-out tests that cannot
pass due to existing rewrite bugs noted above.

Changing the rewrite cause the PlannerTest to produce different plans
than previously. Changed the expected results file to match the new
rewrite rules.

Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
A fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
A fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java
A fe/src/test/java/org/apache/impala/analysis/RewriteConditionalFnsRuleTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test
16 files changed, 836 insertions(+), 309 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/11760/6
--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit 

[Impala-ASF-CR] IMPALA-7743: [DOCS] A new option to load incremental statistics from catalog

2018-10-29 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11790 )

Change subject: IMPALA-7743: [DOCS] A new option to load incremental statistics 
from catalog
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11790/2/docs/shared/impala_common.xml
File docs/shared/impala_common.xml:

http://gerrit.cloudera.org:8080/#/c/11790/2/docs/shared/impala_common.xml@1425
PS2, Line 1425: 3.0 and lower, for a table
  : with a big number of partitions and many columns, 
approximately 400
  : bytes of metadata per column per partition, can add up 
to a significant
  : memory overhead as the metadata must be cached on the
> Could we make this a bit more clear and flow better?  Something like (some
Reworded.



--
To view, visit http://gerrit.cloudera.org:8080/11790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8fd9b88138350406065df2f39a48043178759949
Gerrit-Change-Number: 11790
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Adrian Ng (389)
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 20:54:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7743: [DOCS] A new option to load incremental statistics from catalog

2018-10-29 Thread Alex Rodoni (Code Review)
Hello Greg Rahn, Adrian Ng, Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11790

to look at the new patch set (#3).

Change subject: IMPALA-7743: [DOCS] A new option to load incremental statistics 
from catalog
..

IMPALA-7743: [DOCS] A new option to load incremental statistics from catalog

--pull_incremental_statistics described in the Incremental Stats section.

Change-Id: I8fd9b88138350406065df2f39a48043178759949
---
M docs/shared/impala_common.xml
M docs/topics/impala_perf_stats.xml
2 files changed, 74 insertions(+), 46 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/11790/3
--
To view, visit http://gerrit.cloudera.org:8080/11790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8fd9b88138350406065df2f39a48043178759949
Gerrit-Change-Number: 11790
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Adrian Ng (389)
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-6374: fix handling of commas in .test files

2018-10-29 Thread Tim Armstrong (Code Review)
Hello Michael Brown, David Knupp, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11800

to look at the new patch set (#3).

Change subject: IMPALA-6374: fix handling of commas in .test files
..

IMPALA-6374: fix handling of commas in .test files

The .test file parser implemented an unconventional method for parsing
single-quoted strings in comma-separated value format. This didn't handle
trailing commas in the string correctly.

This commit switches to using a conventional method for parsing
comma-separated value format:
* Commas enclosed by single quotes are not treated as field separators
* Single quotes can be escaped within a string by doubling them.

I looked into using Python's .csv module for this, but it wouldn't
work without modifying the test file format more because it
automatically discards the quotes during parsing, which are actually
semantically important in .test files. E.g. without the quotes we can't
distinguish between the literal string 'regex:...' and the regex
regex:

Testing:
Ran exhaustive tests and fixed .test files that required modifications.
Will rerun before merging.

Added a couple of tests to exercise edge cases in the test file parser.

Change-Id: I18ddcb0440490ddf8184be66d3681038a1615dd9
---
M testdata/workloads/functional-query/queries/QueryTest/functions-ddl.test
M testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test
M testdata/workloads/functional-query/queries/QueryTest/hbase-inserts.test
M testdata/workloads/functional-query/queries/QueryTest/misc.test
A testdata/workloads/functional-query/queries/QueryTest/special-strings.test
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
M testdata/workloads/tpcds/queries/tpcds-decimal_v2-q98.test
M testdata/workloads/tpcds/queries/tpcds-q98.test
M tests/common/test_result_verifier.py
M tests/query_test/test_exprs.py
10 files changed, 157 insertions(+), 112 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/11800/3
--
To view, visit http://gerrit.cloudera.org:8080/11800
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I18ddcb0440490ddf8184be66d3681038a1615dd9
Gerrit-Change-Number: 11800
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 5:

Thanks everyone for your patient reviews of this change. Per PhilZ, removed the 
per-function optimizations, relying on existing rewrite rules (bugs and all). 
Added tests to verify that the "downstream" rules work (when they do.)

Per Vuk, reduced the scope to only rewrite the requested functions, and to 
retain the BE implementation; living with the bugs that prevent the rewrites 
from being done some of the time.

Added more tests, including tests of the full rewrite process to ensure that 
the conditional functions are, in fact, first rewritten to use CASE, then CASE 
is further simplified. Tests include commented-out cases to record where we 
decided to live with existing bugs.

The result is the minimal change that accomplishes the request in the JIRA 
without breaking anything.


--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 20:51:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5031: Make UBSAN-friendly arithmetic generic

2018-10-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11810 )

Change subject: IMPALA-5031: Make UBSAN-friendly arithmetic generic
..


Patch Set 1: Code-Review+2

(1 comment)

Just had a comment about comments - it feels a little cryptic to me (although 
may be obvious for some people).

http://gerrit.cloudera.org:8080/#/c/11810/1/be/src/util/arithmetic-util.h
File be/src/util/arithmetic-util.h:

http://gerrit.cloudera.org:8080/#/c/11810/1/be/src/util/arithmetic-util.h@116
PS1, Line 116:   enum class Ring { INTEGER, FLOAT, NEITHER };
I feel like there's some context missing here. I guess it's to enable generic 
logic over integral and floating point types, which are both support 
multiplication and addition but have some subtle differences.



--
To view, visit http://gerrit.cloudera.org:8080/11810
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I73bec71e59c5a921003d0ebca52a1d4e49bbef66
Gerrit-Change-Number: 11810
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 20:49:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 5:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/11760/5/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java:

http://gerrit.cloudera.org:8080/#/c/11760/5/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@66
PS5, Line 66:   public Expr RewritesOk(String tableName, String exprStr, 
ExprRewriteRule rule, String expectedExprStr)
line too long (104 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/5/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@71
PS5, Line 71:   public Expr RewritesOk(String exprStr, List 
rules, String expectedExprStr)
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/5/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@88
PS5, Line 88:   public Expr RewritesOkWhereExpr(String exprStr, ExprRewriteRule 
rule, String expectedExprStr)
line too long (95 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/5/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@90
PS5, Line 90: return RewritesOkWhereExpr("functional.alltypessmall", 
exprStr, rule, expectedExprStr);
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/5/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@93
PS5, Line 93:   public Expr RewritesOkWhereExpr(String tableName, String 
exprStr, ExprRewriteRule rule, String expectedExprStr)
line too long (113 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/5/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@95
PS5, Line 95: return RewritesOkWhereExpr(tableName, exprStr, 
Lists.newArrayList(rule), expectedExprStr);
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/5/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@98
PS5, Line 98:   public Expr RewritesOkWhereExpr(String tableName, String 
exprStr, List rules,
line too long (96 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 20:46:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Philip Zeyliger, Tim Armstrong, Csaba Ringhofer, 
Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11760

to look at the new patch set (#5).

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..

IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

See IMPALA-7655 for backgound. Tim found that the current interpreted
forms of if, isnull and coalesce are slow compared to the
code-generated CASE statement.

This patch rewrites the above functions into the equivalent CASE
structure.

The rewrite engine has many bugs that are beyond the scope of
this change to fix. This change codes around those bugs. The
result is that the conditional rewrite happens some of the time,
and sometimes produces less-than perfect optimizations.

Conditionals in the top-level ORDER BY clause are not rewritten
(IMPALA-7753), but those one or more levels down are. Some expressions
involving NULL are not simplified (IMPALA-7769). Several hacks were
used to work around the fact that the rewrite engine ignores unanalyzed
expressions, yet the rewrite engine does not, in general, re-analyze
the expressions it produces, causing simplifications to be skipped
(IMPALA-7754). And so on.

As a result, the BE retains the original interpreted forms that
are still used in two cases: 1) top-leel conditions in the
ORDER BY clause, and 2) if the user disables rewrites.

Further, code generation does not occur for CASE statements in the
SELECT clause when it is in the root fragment (the most common case
in simple tests.) This is another known bug (IMPALA-4356).

One possible performance regression is that the new form of the code
evaluates some expressions twice, where the original interpreted
code evaluated the argument once. E.g. coalesce(id, 10) is rewritten
to CASE WHEN id IS NULL THEN id ELSE 10 END. Here, id is evaluated
twice. If the "id" were replaced by a complex sub-expression, the
gain from compilation could be offset by doing work twice.
(IMPALA-7737)

Still, the fix provides most of what the JIRA ticket requested
within the limitations of the existing code.

Conditional function rewrites are moved into a new class,
RewriteConditionalsRule in order to keep things simple.

Most functions use the simplest possible rewrite, relying on the
existing rewrite rules for further simplification.  The one exception
is coalesce(): the existing code relies on the semantics of the
function and so was retained and slightly improved. The code was
extended to produce a CASE statement directly, retaining existing
simplifications.

Tests for conditional functions were in one large function along with
other rewrite tests. Moved them into a new file, then broke up the
tests by function to allow much easier debugging of each function
one-by-one.  This required moving the common test mechanims into a
new common base class.

Existing tests focus on one or two rules at a time. The conditional
function rewrite, however, relies on the entire set of rules being
applied repeatedly. So, added a new FullRewriteTest case to verify this
behavior. This class contains several commented-out tests that cannot
pass due to existing rewrite bugs noted above.

Changing the rewrite cause the PlannerTest to produce different plans
than previously. Changed the expected results file to match the new
rewrite rules.

Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
A fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
A fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java
A fe/src/test/java/org/apache/impala/analysis/RewriteConditionalFnsRuleTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test
16 files changed, 831 insertions(+), 309 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/11760/5
--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit 

[Impala-ASF-CR] IMPALA-7749: Compute AggregationNode's memory estimate using input cardinality

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11806 )

Change subject: IMPALA-7749: Compute AggregationNode's memory estimate using 
input cardinality
..

IMPALA-7749: Compute AggregationNode's memory estimate using input cardinality

Prior to this change, the AggregationNode's perInstanceCardinality
was influenced by the node's selectivity and limit. This was
incorrect because the hash table is constructed over the entire
input stream before any row batches are produced. This change
ensures that the input cardinality is used to determine the
perInstanceCardinality.

Testing:
Added a planner test which ensures that an AggregationNode with a
limit estimates memory based on the input cardinality.
Ran front-end and end-to-end tests affected by this change.

Change-Id: Ifd95d2ad5b677fca459c9c32b98f6176842161fc
Reviewed-on: http://gerrit.cloudera.org:8080/11806
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
3 files changed, 99 insertions(+), 21 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/11806
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ifd95d2ad5b677fca459c9c32b98f6176842161fc
Gerrit-Change-Number: 11806
Gerrit-PatchSet: 4
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7749: Compute AggregationNode's memory estimate using input cardinality

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11806 )

Change subject: IMPALA-7749: Compute AggregationNode's memory estimate using 
input cardinality
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/11806
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifd95d2ad5b677fca459c9c32b98f6176842161fc
Gerrit-Change-Number: 11806
Gerrit-PatchSet: 3
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 20:40:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7735: Expose queuing status in ExecSummary and impala-shell

2018-10-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11816 )

Change subject: IMPALA-7735: Expose queuing status in ExecSummary and 
impala-shell
..


Patch Set 1: Code-Review+2

(4 comments)

http://gerrit.cloudera.org:8080/#/c/11816/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11816/1//COMMIT_MSG@9
PS1, Line 9: This patch adds the queuing status, that is, whether the query was
Can you mention that this is showed when either live_summary or live_progress 
is true.


http://gerrit.cloudera.org:8080/#/c/11816/1/common/thrift/ExecStats.thrift
File common/thrift/ExecStats.thrift:

http://gerrit.cloudera.org:8080/#/c/11816/1/common/thrift/ExecStats.thrift@104
PS1, Line 104: queued
"is currently queued by admission control" to be totally clear


http://gerrit.cloudera.org:8080/#/c/11816/1/common/thrift/ExecStats.thrift@107
PS1, Line 107:   // Contains the latest queuing reason if the query is queued.
Same as above.


http://gerrit.cloudera.org:8080/#/c/11816/1/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/11816/1/shell/impala_shell.py@969
PS1, Line 969:   if summary.is_queued:
It's a little subtle that this handles both None and False. I'll leave it up to 
you whether this is worth documenting.



--
To view, visit http://gerrit.cloudera.org:8080/11816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibde447b01559b9f0f3e970d4fa10f6ee4064bd49
Gerrit-Change-Number: 11816
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 20:40:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5004: Switch to sorting node for large TopN queries

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11698 )

Change subject: IMPALA-5004: Switch to sorting node for large TopN queries
..


Patch Set 8: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3373/


--
To view, visit http://gerrit.cloudera.org:8080/11698
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34c9db33c9302b55e9978f53f9c7061f2806c8a9
Gerrit-Change-Number: 11698
Gerrit-PatchSet: 8
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 20:01:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7727: Fix TStatusCode to TErrorCode mapping

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11778 )

Change subject: IMPALA-7727: Fix TStatusCode to TErrorCode mapping
..


Patch Set 3:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/1197/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/11778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie62527734aa73c1524c731773638590bdac9e789
Gerrit-Change-Number: 11778
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 19:47:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7735: Expose queuing status in ExecSummary and impala-shell

2018-10-29 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11816 )

Change subject: IMPALA-7735: Expose queuing status in ExecSummary and 
impala-shell
..


Patch Set 1: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11816/1/tests/custom_cluster/test_shell_interactive.py
File tests/custom_cluster/test_shell_interactive.py:

http://gerrit.cloudera.org:8080/#/c/11816/1/tests/custom_cluster/test_shell_interactive.py@38
PS1, Line 38: proc = pexpect.spawn(' '.join([SHELL_CMD, "-i localhost:21000"]))
I think it would help connecting to another port (i.e. another impalad) to 
ensure that the admission controller status is propagated across impalads. (Not 
sure if it'd make things flaky though).



--
To view, visit http://gerrit.cloudera.org:8080/11816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibde447b01559b9f0f3e970d4fa10f6ee4064bd49
Gerrit-Change-Number: 11816
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 19:24:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7727: Fix TStatusCode to TErrorCode mapping

2018-10-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11778 )

Change subject: IMPALA-7727: Fix TStatusCode to TErrorCode mapping
..


Patch Set 3: Code-Review+1

Actually I should let michael take a final look too if he wants.


--
To view, visit http://gerrit.cloudera.org:8080/11778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie62527734aa73c1524c731773638590bdac9e789
Gerrit-Change-Number: 11778
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 19:17:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7727: Fix TStatusCode to TErrorCode mapping

2018-10-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11778 )

Change subject: IMPALA-7727: Fix TStatusCode to TErrorCode mapping
..


Patch Set 3: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11778/3/be/src/util/error-util.cc
File be/src/util/error-util.cc:

http://gerrit.cloudera.org:8080/#/c/11778/3/be/src/util/error-util.cc@231
PS3, Line 231:   DCHECK(false) << "Unexpected hs2Code: " << hs2Code;
clang-tidy failed because this didn't return a value.


http://gerrit.cloudera.org:8080/#/c/11778/2/testdata/workloads/functional-query/queries/QueryTest/compute-stats.test
File testdata/workloads/functional-query/queries/QueryTest/compute-stats.test:

http://gerrit.cloudera.org:8080/#/c/11778/2/testdata/workloads/functional-query/queries/QueryTest/compute-stats.test@2203
PS2, Line 2203: set mem_limit=10m;
> Looks like we don't propagate debug actions to child queries?
Ah interesting. Maybe consider setting a lower mem_limit just to be sure that 
the error is reliable? The minimum memory limit can change depending on the 
cluster config and as we make changes to the code.



--
To view, visit http://gerrit.cloudera.org:8080/11778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie62527734aa73c1524c731773638590bdac9e789
Gerrit-Change-Number: 11778
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 19:17:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7727: Fix TStatusCode to TErrorCode mapping

2018-10-29 Thread Bharath Vissapragada (Code Review)
Hello Michael Ho, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11778

to look at the new patch set (#3).

Change subject: IMPALA-7727: Fix TStatusCode to TErrorCode mapping
..

IMPALA-7727: Fix TStatusCode to TErrorCode mapping

- Uses a "GENERAL" TErrorCode type for all non-OK statuses.
- Detailed regression root cause description in the jira IMPALA-7727.
- Added a regression test.

Change-Id: Ie62527734aa73c1524c731773638590bdac9e789
---
M be/src/common/status.cc
M be/src/common/status.h
M be/src/service/child-query.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M testdata/workloads/functional-query/queries/QueryTest/compute-stats.test
6 files changed, 35 insertions(+), 27 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/11778/3
--
To view, visit http://gerrit.cloudera.org:8080/11778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie62527734aa73c1524c731773638590bdac9e789
Gerrit-Change-Number: 11778
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7735: Expose queuing status in ExecSummary and impala-shell

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11816 )

Change subject: IMPALA-7735: Expose queuing status in ExecSummary and 
impala-shell
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1196/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibde447b01559b9f0f3e970d4fa10f6ee4064bd49
Gerrit-Change-Number: 11816
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 19:07:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5031: memcpy cannot take null arguments

2018-10-29 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11812 )

Change subject: IMPALA-5031: memcpy cannot take null arguments
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11812/1/be/src/runtime/string-buffer.h
File be/src/runtime/string-buffer.h:

http://gerrit.cloudera.org:8080/#/c/11812/1/be/src/runtime/string-buffer.h@93
PS1, Line 93: memcpy
I see new_buffer is guarded here. Is this one not replaced bc buffer_ is 
assumed to not be null here? Haven't looked too closely,  but it looks like 
buffer_ could be null.



--
To view, visit http://gerrit.cloudera.org:8080/11812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9acc8c32409e67253a987eb3d1fd7d921efcb51
Gerrit-Change-Number: 11812
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 18:57:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7727: Fix TStatusCode to TErrorCode mapping

2018-10-29 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11778 )

Change subject: IMPALA-7727: Fix TStatusCode to TErrorCode mapping
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11778/2/be/src/common/status.cc
File be/src/common/status.cc:

http://gerrit.cloudera.org:8080/#/c/11778/2/be/src/common/status.cc@a172
PS2, Line 172:
> You missed removing the declaration in the header.
oops will do.


http://gerrit.cloudera.org:8080/#/c/11778/2/testdata/workloads/functional-query/queries/QueryTest/compute-stats.test
File testdata/workloads/functional-query/queries/QueryTest/compute-stats.test:

http://gerrit.cloudera.org:8080/#/c/11778/2/testdata/workloads/functional-query/queries/QueryTest/compute-stats.test@2203
PS2, Line 2203: set mem_limit=10m;
> Can you use a debug_action to force the error directly? That would be more
Looks like we don't propagate debug actions to child queries?

void ChildQuery::SetQueryOptions(const TQueryOptions& parent_options,
TExecuteStatementReq* exec_stmt_req) {
  map conf;
#define QUERY_OPT_FN(NAME, ENUM, LEVEL)\
  if (parent_options.__isset.NAME) {\
stringstream val;\
val << parent_options.NAME;\
conf[#ENUM] = val.str();\
  }
#define REMOVED_QUERY_OPT_FN(NAME, ENUM)
  QUERY_OPTS_TABLE
#undef QUERY_OPT_FN
#undef REMOVED_QUERY_OPT_FN
  // Ignore debug actions on child queries because they may cause deadlock.
  map::iterator it = conf.find("DEBUG_ACTION");
  if (it != conf.end()) conf.erase(it);
  exec_stmt_req->__set_confOverlay(conf);
}



--
To view, visit http://gerrit.cloudera.org:8080/11778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie62527734aa73c1524c731773638590bdac9e789
Gerrit-Change-Number: 11778
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 18:42:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7735: Expose queuing status in ExecSummary and impala-shell

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11816 )

Change subject: IMPALA-7735: Expose queuing status in ExecSummary and 
impala-shell
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11816/1/tests/custom_cluster/test_shell_interactive.py
File tests/custom_cluster/test_shell_interactive.py:

http://gerrit.cloudera.org:8080/#/c/11816/1/tests/custom_cluster/test_shell_interactive.py@26
PS1, Line 26: class TestShellInteractive(CustomClusterTestSuite):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/11816/1/tests/custom_cluster/test_shell_interactive.py@29
PS1, Line 29:
flake8: E251 unexpected spaces around keyword / parameter equals



--
To view, visit http://gerrit.cloudera.org:8080/11816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibde447b01559b9f0f3e970d4fa10f6ee4064bd49
Gerrit-Change-Number: 11816
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 18:33:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7735: Expose queuing status in ExecSummary and impala-shell

2018-10-29 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11816


Change subject: IMPALA-7735: Expose queuing status in ExecSummary and 
impala-shell
..

IMPALA-7735: Expose queuing status in ExecSummary and impala-shell

This patch adds the queuing status, that is, whether the query was
queued and what was the latest queuing reason, to the ExecSummary.
Also added changes to allow impala-shell to expose this status by
pulling it out from the ExecSummary when either live_summary or
live_progress is set to true.

Testing:
Added custom cluster tests.

Change-Id: Ibde447b01559b9f0f3e970d4fa10f6ee4064bd49
---
M be/src/service/impala-server.cc
M common/thrift/ExecStats.thrift
M shell/impala_shell.py
M tests/custom_cluster/test_admission_controller.py
A tests/custom_cluster/test_shell_interactive.py
5 files changed, 100 insertions(+), 24 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/11816/1
--
To view, visit http://gerrit.cloudera.org:8080/11816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibde447b01559b9f0f3e970d4fa10f6ee4064bd49
Gerrit-Change-Number: 11816
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 


[Impala-ASF-CR] IMPALA-7765: [DOCS] Document IMPALA MAX MEM ESTIMATE FOR ADMISSION option

2018-10-29 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11804 )

Change subject: IMPALA-7765: [DOCS] Document 
IMPALA_MAX_MEM_ESTIMATE_FOR_ADMISSION option
..


Patch Set 2: Code-Review+2

(1 comment)

Looks good. Just another small nit

http://gerrit.cloudera.org:8080/#/c/11804/2/docs/topics/impala_max_mem_estimate_for_admission.xml
File docs/topics/impala_max_mem_estimate_for_admission.xml:

http://gerrit.cloudera.org:8080/#/c/11804/2/docs/topics/impala_max_mem_estimate_for_admission.xml@45
PS2, Line 45: on the memory estimates of a query
on the planner's memory estimates for a query



--
To view, visit http://gerrit.cloudera.org:8080/11804
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibef89c98530c6974dc791666cc51c1ded52e7910
Gerrit-Change-Number: 11804
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 29 Oct 2018 17:29:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7749: Compute AggregationNode's memory estimate using input cardinality

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11806 )

Change subject: IMPALA-7749: Compute AggregationNode's memory estimate using 
input cardinality
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1195/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11806
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifd95d2ad5b677fca459c9c32b98f6176842161fc
Gerrit-Change-Number: 11806
Gerrit-PatchSet: 2
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 17:17:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7743: [DOCS] A new option to load incremental statistics from catalog

2018-10-29 Thread Greg Rahn (Code Review)
Greg Rahn has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11790 )

Change subject: IMPALA-7743: [DOCS] A new option to load incremental statistics 
from catalog
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11790/2/docs/shared/impala_common.xml
File docs/shared/impala_common.xml:

http://gerrit.cloudera.org:8080/#/c/11790/2/docs/shared/impala_common.xml@1425
PS2, Line 1425: 3.0 and lower, for a table
  : with a big number of partitions and many columns, 
approximately 400
  : bytes of metadata per column per partition, can add up 
to a significant
  : memory overhead as the metadata must be cached on the
Could we make this a bit more clear and flow better?  Something like (some 
parts left out for brevity):
in 3.0 and lower approx 400 bytes is needed per column, per, partition, for 
caching.  As a result, tables with a big number of partitions and many columns 
can add up to...



--
To view, visit http://gerrit.cloudera.org:8080/11790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8fd9b88138350406065df2f39a48043178759949
Gerrit-Change-Number: 11790
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Adrian Ng (389)
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 17:08:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7749: Compute AggregationNode's memory estimate using input cardinality

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11806 )

Change subject: IMPALA-7749: Compute AggregationNode's memory estimate using 
input cardinality
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11806
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifd95d2ad5b677fca459c9c32b98f6176842161fc
Gerrit-Change-Number: 11806
Gerrit-PatchSet: 3
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 16:45:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7749: Compute AggregationNode's memory estimate using input cardinality

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11806 )

Change subject: IMPALA-7749: Compute AggregationNode's memory estimate using 
input cardinality
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3374/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/11806
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifd95d2ad5b677fca459c9c32b98f6176842161fc
Gerrit-Change-Number: 11806
Gerrit-PatchSet: 3
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 16:45:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7749: Compute AggregationNode's memory estimate using input cardinality

2018-10-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11806 )

Change subject: IMPALA-7749: Compute AggregationNode's memory estimate using 
input cardinality
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11806
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifd95d2ad5b677fca459c9c32b98f6176842161fc
Gerrit-Change-Number: 11806
Gerrit-PatchSet: 2
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 16:44:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7749: Compute AggregationNode's memory estimate using input cardinality

2018-10-29 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11806 )

Change subject: IMPALA-7749: Compute AggregationNode's memory estimate using 
input cardinality
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11806/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11806/1//COMMIT_MSG@7
PS1, Line 7: IMPALA-7749: Compute AggregationNode's memory estimate using input 
cardinality
> I think it would be better to file a separate JIRA for part 2. Part 1 is ki
Done


http://gerrit.cloudera.org:8080/#/c/11806/1/testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test:

http://gerrit.cloudera.org:8080/#/c/11806/1/testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test@5473
PS1, Line 5473: # IMPALA-7749: Aggregation with a limit should compute memory 
estimates based on input
> Probably worth mentioning the JIRA for motivation
Done



--
To view, visit http://gerrit.cloudera.org:8080/11806
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifd95d2ad5b677fca459c9c32b98f6176842161fc
Gerrit-Change-Number: 11806
Gerrit-PatchSet: 2
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 29 Oct 2018 16:42:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7749: Compute AggregationNode's memory estimate using input cardinality

2018-10-29 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/11806 )

Change subject: IMPALA-7749: Compute AggregationNode's memory estimate using 
input cardinality
..

IMPALA-7749: Compute AggregationNode's memory estimate using input cardinality

Prior to this change, the AggregationNode's perInstanceCardinality
was influenced by the node's selectivity and limit. This was
incorrect because the hash table is constructed over the entire
input stream before any row batches are produced. This change
ensures that the input cardinality is used to determine the
perInstanceCardinality.

Testing:
Added a planner test which ensures that an AggregationNode with a
limit estimates memory based on the input cardinality.
Ran front-end and end-to-end tests affected by this change.

Change-Id: Ifd95d2ad5b677fca459c9c32b98f6176842161fc
---
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
3 files changed, 99 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/11806/2
--
To view, visit http://gerrit.cloudera.org:8080/11806
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifd95d2ad5b677fca459c9c32b98f6176842161fc
Gerrit-Change-Number: 11806
Gerrit-PatchSet: 2
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7760: Privilege version inconsistency causes a hang when running invalidate metadata

2018-10-29 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11794 )

Change subject: IMPALA-7760: Privilege version inconsistency causes a hang when 
running invalidate metadata
..


Patch Set 4:

> Patch Set 4:
>
> > Yes, I created some follow-up JIRAs for this.
>  > a) https://issues.apache.org/jira/browse/IMPALA-7764
>
> This was very generic ("add unit tests.") We have a specific case here that 
> we need to add to our tests, which involves (if I'm reading things right) 
> removing the last permission from Sentry. That seems sensible to do as an 
> end-to-end test (in custom cluster?).
>
> Anyway--are we working on this right now?
>
>  > b) https://issues.apache.org/jira/browse/IMPALA-7762
>
> Thanks.

It was a subtle bug and it was difficult to write an end-to-end test to 
reproduce this because we need to get the timing right. I did some end-to-end 
test for a similar issue but it was easier to reproduce: 
https://github.com/apache/impala/blob/master/tests/authorization/test_grant_revoke.py#L373-L393
Anyway, yes, I'm working on beefing up tests related to SentryProxy (which 
includes unit tests and end-to-end tests). I'll rename the subject in JIRA to 
make it clear.


--
To view, visit http://gerrit.cloudera.org:8080/11794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib1e0db2b1f727476f489c732c4f4e5bc1582429f
Gerrit-Change-Number: 11794
Gerrit-PatchSet: 4
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 16:22:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7760: Privilege version inconsistency causes a hang when running invalidate metadata

2018-10-29 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11794 )

Change subject: IMPALA-7760: Privilege version inconsistency causes a hang when 
running invalidate metadata
..


Patch Set 4:

> Yes, I created some follow-up JIRAs for this.
 > a) https://issues.apache.org/jira/browse/IMPALA-7764

This was very generic ("add unit tests.") We have a specific case here that we 
need to add to our tests, which involves (if I'm reading things right) removing 
the last permission from Sentry. That seems sensible to do as an end-to-end 
test (in custom cluster?).

Anyway--are we working on this right now?

 > b) https://issues.apache.org/jira/browse/IMPALA-7762

Thanks.


--
To view, visit http://gerrit.cloudera.org:8080/11794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib1e0db2b1f727476f489c732c4f4e5bc1582429f
Gerrit-Change-Number: 11794
Gerrit-PatchSet: 4
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 16:16:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7760: Privilege version inconsistency causes a hang when running invalidate metadata

2018-10-29 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11794 )

Change subject: IMPALA-7760: Privilege version inconsistency causes a hang when 
running invalidate metadata
..


Patch Set 4:

> Patch Set 4:
>
> Apologies for the late review:
>
> a) Can we add a test for this?
>
> b) Where was the hang? It seems like we should fail queries if they're 
> waiting for catalog updates past some threshold (5 minutes?) rather than hang 
> forever, since hanging forever is so much less pleasant to figure out.
> 
> Thanks!

Yes, I created some follow-up JIRAs for this.
a) https://issues.apache.org/jira/browse/IMPALA-7764
b) https://issues.apache.org/jira/browse/IMPALA-7762


--
To view, visit http://gerrit.cloudera.org:8080/11794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib1e0db2b1f727476f489c732c4f4e5bc1582429f
Gerrit-Change-Number: 11794
Gerrit-PatchSet: 4
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 16:14:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7760: Privilege version inconsistency causes a hang when running invalidate metadata

2018-10-29 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11794 )

Change subject: IMPALA-7760: Privilege version inconsistency causes a hang when 
running invalidate metadata
..


Patch Set 4:

Apologies for the late review:

a) Can we add a test for this?

b) Where was the hang? It seems like we should fail queries if they're waiting 
for catalog updates past some threshold (5 minutes?) rather than hang forever, 
since hanging forever is so much less pleasant to figure out.

Thanks!


--
To view, visit http://gerrit.cloudera.org:8080/11794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib1e0db2b1f727476f489c732c4f4e5bc1582429f
Gerrit-Change-Number: 11794
Gerrit-PatchSet: 4
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 16:10:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5004: Switch to sorting node for large TopN queries

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11698 )

Change subject: IMPALA-5004: Switch to sorting node for large TopN queries
..


Patch Set 8: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11698
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34c9db33c9302b55e9978f53f9c7061f2806c8a9
Gerrit-Change-Number: 11698
Gerrit-PatchSet: 8
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 16:06:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5004: Switch to sorting node for large TopN queries

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11698 )

Change subject: IMPALA-5004: Switch to sorting node for large TopN queries
..


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3373/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/11698
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34c9db33c9302b55e9978f53f9c7061f2806c8a9
Gerrit-Change-Number: 11698
Gerrit-PatchSet: 8
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 16:06:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5004: Switch to sorting node for large TopN queries

2018-10-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11698 )

Change subject: IMPALA-5004: Switch to sorting node for large TopN queries
..


Patch Set 7: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11698/3/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/11698/3/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@313
PS3, Line 313:   long limit, long offset, boolean hasLimit, boolean 
disableTopN)
> After discussing this a bit more with Tim, it seems that we generally avoid
Sahil covered this, but just wanted to confirm that there was a deliberate 
decision not to pursue, in general, making planner decisions to fit a query 
into a memory limit. We do give up some opportunities to make more optimal 
decisions but it's definitely simpler for now.

I don't recall seeing a case in practice where Top-N memory consumption caused 
a query failure but it seems like something that will happen eventually, and 
this change helps avoid it or at least gives an escape hatch that doesn't 
require rewriting the query.



--
To view, visit http://gerrit.cloudera.org:8080/11698
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34c9db33c9302b55e9978f53f9c7061f2806c8a9
Gerrit-Change-Number: 11698
Gerrit-PatchSet: 7
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 16:02:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5004: Switch to sorting node for large TopN queries

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11698 )

Change subject: IMPALA-5004: Switch to sorting node for large TopN queries
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1194/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11698
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34c9db33c9302b55e9978f53f9c7061f2806c8a9
Gerrit-Change-Number: 11698
Gerrit-PatchSet: 7
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 15:31:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5004: Switch to sorting node for large TopN queries

2018-10-29 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11698 )

Change subject: IMPALA-5004: Switch to sorting node for large TopN queries
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11698/5/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/11698/5/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@337
PS5, Line 337: limit + offset
> Sounds like we just should be failing queries where offset + limit is > THR
Filed IMPALA- as a follow up and removed the test changes for handling 
overflows.



--
To view, visit http://gerrit.cloudera.org:8080/11698
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34c9db33c9302b55e9978f53f9c7061f2806c8a9
Gerrit-Change-Number: 11698
Gerrit-PatchSet: 5
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 14:58:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5004: Switch to sorting node for large TopN queries

2018-10-29 Thread Sahil Takiar (Code Review)
Hello Lars Volker, Paul Rogers, Tim Armstrong, Impala Public Jenkins, Vuk 
Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11698

to look at the new patch set (#7).

Change subject: IMPALA-5004: Switch to sorting node for large TopN queries
..

IMPALA-5004: Switch to sorting node for large TopN queries

Adds a new query option 'topn_bytes_limit' that places a limit on the
number of estimated bytes that a TopN operator can process. If the
Impala planner estimates that a TopN operator will process more bytes
than this limit, it will replace the TopN operator with a sort operator.

Since the TopN operator cannot spill to disk, it has to buffer everything
in memory. This can cause frequent OOM issues when running with a large
limit + offset. Switching to a sort operator allows Impala to spill to
disk. We prefer to use the TopN operator when possible as it has better
performance than the sort operator for 'order by limit [offset]' queries.

The default limit is set to 512MB and is based on micro-benchmarking the
topn vs. sort operator for various limits (see the JIRA for full details).
The default is set to an intentionally high value in order to avoid
performance regressions.

If the planner thinks the TopN operator will end up processing the
entire input dataset, it falls back to a Sort operator. Since TopN will
just end up sorting the entire dataset, there is no point in using it
vs. the Sort operator (which has the advantage that it can spill to
disk).

Testing:

* Added a new planner test to fuctional-planner/ to validate that
'topn_bytes_limit' properly switches between topn and sort operators.
* Added a new planner test to functional-planner/ to validate that
running an 'order by limit x' where x is the size of the input, triggers
a sort instead of a TopN (even if the 'topn_bytes_limit' threshold has
not been exceeded)

Change-Id: I34c9db33c9302b55e9978f53f9c7061f2806c8a9
---
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/SelectNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/main/java/org/apache/impala/planner/SubplanNode.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A 
testdata/workloads/functional-planner/queries/PlannerTest/topn-bytes-limit-small.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/topn-bytes-limit.test
M testdata/workloads/functional-planner/queries/PlannerTest/topn.test
25 files changed, 260 insertions(+), 38 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/11698/7
--
To view, visit http://gerrit.cloudera.org:8080/11698
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34c9db33c9302b55e9978f53f9c7061f2806c8a9
Gerrit-Change-Number: 11698
Gerrit-PatchSet: 7
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac