[Impala-ASF-CR] IMPALA-3002/IMPALA-1473: Cardinality observability cleanup

2016-10-14 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3002/IMPALA-1473: Cardinality observability cleanup
..


IMPALA-3002/IMPALA-1473: Cardinality observability cleanup

IMPALA-3002:
The shell prints an incorrect value for '#Rows' in the exec
summary for broadcast nodes due to incorrect logic around
whether to use max or agg stats. This patch makes the behavior
consistent with the way the be treats exec summaries in
summary-util.cc. This incorrect logic was also duplicated in
the impala_beeswax test framework.

IMPALA-1473:
When there is a merging exchange with a limit, we may copy rows
into the output batch beyond the limit. In this case, we currently
update the output batch's size to reflect the limit, but we also
need to update ExecNode::num_rows_returned_ or the exec summary
may show that the exchange node returned more rows than it really
did.

Additionally, PlanFragmentExecutor::GetNext does not update
rows_produced_counter_ in some cases, leading the runtime profile
to display an incorrect value for 'RowsProduced'.

Change-Id: I386719370386c9cff09b8b35d15dc712dc6480aa
Reviewed-on: http://gerrit.cloudera.org:8080/4679
Reviewed-by: Matthew Jacobs 
Tested-by: Internal Jenkins
---
M be/src/exec/exchange-node.cc
M be/src/runtime/plan-fragment-executor.cc
M shell/impala_client.py
M tests/beeswax/impala_beeswax.py
A tests/query_test/test_observability.py
5 files changed, 63 insertions(+), 7 deletions(-)

Approvals:
  Matthew Jacobs: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4679
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I386719370386c9cff09b8b35d15dc712dc6480aa
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-3002/IMPALA-1473: Cardinality observability cleanup

2016-10-14 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3002/IMPALA-1473: Cardinality observability cleanup
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4679
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I386719370386c9cff09b8b35d15dc712dc6480aa
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3002/IMPALA-1473: Cardinality observability cleanup

2016-10-14 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3002/IMPALA-1473: Cardinality observability cleanup
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4679
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I386719370386c9cff09b8b35d15dc712dc6480aa
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3002/IMPALA-1473: Cardinality observability cleanup

2016-10-14 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-3002/IMPALA-1473: Cardinality observability cleanup
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4679/2//COMMIT_MSG
Commit Message:

PS2, Line 7: IMPA
> nit: can you separate this into separate IMPALA-1473, sometimes ppl are gre
Done


http://gerrit.cloudera.org:8080/#/c/4679/2/shell/impala_client.py
File shell/impala_client.py:

PS2, Line 145: # is the max over all
> I think you need to update impala_beeswax.py which duplicates this code (un
Done


http://gerrit.cloudera.org:8080/#/c/4679/2/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

PS2, Line 25: test_merge_exchange_num_rows(s
> test_merge_exchange_num_rows
Done


PS2, Line 26: IMPALA-1473 
> I think this tests both 1473 and IMPALA-3002. Can you verify that with a pr
It does actually exercise the relevant code for IMPALA-3002 (in impala_beeswax 
at least), but the result is the same with or without the fix for this 
particular query, so I added another test.


-- 
To view, visit http://gerrit.cloudera.org:8080/4679
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I386719370386c9cff09b8b35d15dc712dc6480aa
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3002/IMPALA-1473: Cardinality observability cleanup

2016-10-14 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new patch set (#3).

Change subject: IMPALA-3002/IMPALA-1473: Cardinality observability cleanup
..

IMPALA-3002/IMPALA-1473: Cardinality observability cleanup

IMPALA-3002:
The shell prints an incorrect value for '#Rows' in the exec
summary for broadcast nodes due to incorrect logic around
whether to use max or agg stats. This patch makes the behavior
consistent with the way the be treats exec summaries in
summary-util.cc. This incorrect logic was also duplicated in
the impala_beeswax test framework.

IMPALA-1473:
When there is a merging exchange with a limit, we may copy rows
into the output batch beyond the limit. In this case, we currently
update the output batch's size to reflect the limit, but we also
need to update ExecNode::num_rows_returned_ or the exec summary
may show that the exchange node returned more rows than it really
did.

Additionally, PlanFragmentExecutor::GetNext does not update
rows_produced_counter_ in some cases, leading the runtime profile
to display an incorrect value for 'RowsProduced'.

Change-Id: I386719370386c9cff09b8b35d15dc712dc6480aa
---
M be/src/exec/exchange-node.cc
M be/src/runtime/plan-fragment-executor.cc
M shell/impala_client.py
M tests/beeswax/impala_beeswax.py
A tests/query_test/test_observability.py
5 files changed, 63 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/4679/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4679
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I386719370386c9cff09b8b35d15dc712dc6480aa
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall