[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 24 Sep 2019 22:05:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..

IMPALA-7637: Add more hash table stats to profile

Add hash table counters(probes, travel and resizes) to profile.
Put hash table stats into the child profile "hash table".

Tests:
Add new test test_query_profle_hashtable.
Ran exhaustive tests.

Profile Sample:
  Hash Join Builder (join_node_id=2):
...
Runtime filters: 1 of 1 Runtime Filter Published
- BuildRowsPartitionTime: 157.960us
- BuildRowsPartitioned: 100 (100)
- HashTablesBuildTime: 298.817us
- LargestPartitionPercent: 7 (7)
- MaxPartitionLevel: 0 (0)
- NumRepartitions: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 17.12 KB (17536)
- RepartitionTime: 0.000ns
- SpilledPartitions: 0 (0)
Hash Table:
- HashBuckets: 256 (256)
- HashCollisions: 0 (0)
- Probes: 2.52K (2520)
- Resizes: 0 (0)
- Travel: 1.79K (178

Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Reviewed-on: http://gerrit.cloudera.org:8080/14234
Reviewed-by: Sahil Takiar 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/grouping-aggregator-partition.cc
M be/src/exec/grouping-aggregator.cc
M be/src/exec/grouping-aggregator.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M tests/query_test/test_observability.py
8 files changed, 108 insertions(+), 20 deletions(-)

Approvals:
  Sahil Takiar: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 7
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4629/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 24 Sep 2019 18:06:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4994/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 24 Sep 2019 17:56:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-24 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 6: Code-Review+2

Carrying +2


--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 24 Sep 2019 17:56:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-24 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 6:

(4 comments)

Rebase and fix the tests.

http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@676
PS3, Line 676:
> So does 'Probes:.*\((\d+)\)', positive look-behinds are useful when you nee
Done


http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@677
PS3, Line 677: u
> Done
Done


http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@688
PS3, Line 688: ry needs load
> Done
Done


http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@691
PS3, Line 691: l
> Done
Done



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 24 Sep 2019 17:28:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-24 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Sahil Takiar, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14234

to look at the new patch set (#6).

Change subject: IMPALA-7637: Add more hash table stats to profile
..

IMPALA-7637: Add more hash table stats to profile

Add hash table counters(probes, travel and resizes) to profile.
Put hash table stats into the child profile "hash table".

Tests:
Add new test test_query_profle_hashtable.
Ran exhaustive tests.

Profile Sample:
  Hash Join Builder (join_node_id=2):
...
Runtime filters: 1 of 1 Runtime Filter Published
- BuildRowsPartitionTime: 157.960us
- BuildRowsPartitioned: 100 (100)
- HashTablesBuildTime: 298.817us
- LargestPartitionPercent: 7 (7)
- MaxPartitionLevel: 0 (0)
- NumRepartitions: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 17.12 KB (17536)
- RepartitionTime: 0.000ns
- SpilledPartitions: 0 (0)
Hash Table:
- HashBuckets: 256 (256)
- HashCollisions: 0 (0)
- Probes: 2.52K (2520)
- Resizes: 0 (0)
- Travel: 1.79K (178

Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
---
M be/src/exec/grouping-aggregator-partition.cc
M be/src/exec/grouping-aggregator.cc
M be/src/exec/grouping-aggregator.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M tests/query_test/test_observability.py
8 files changed, 108 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/14234/6
--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-24 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 5: Code-Review+2

Oh and looks like this needs to be rebased because there is a merge conflict.


--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 24 Sep 2019 15:03:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-24 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 3:

(2 comments)

LGTM pending the test updates / doc updates.

http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h@651
PS3, Line 651: StatsCountersAdd
> I do not know the code well, but from what I read, When spill or serializes
I'm not that familiar with this code either, but that makes sense. Would be 
good to make that a little clearer in the comments. Otherwise, LGTM.


http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@676
PS3, Line 676: '(?<=Probes: )\d+(\.\d+)?'
> I feel my approach is more strict, it requires the number must directly fol
So does 'Probes:.*\((\d+)\)', positive look-behinds are useful when you need to 
use a regex to match a string that precedes another regex - 
https://www.regular-expressions.info/lookaround.html

In this case, it is not necessary, because you can exactly match the entire 
string that precedes the numeric values, 'Probes: '.



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 24 Sep 2019 15:03:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-24 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 5:

(1 comment)

> Patch Set 3:
>
> (2 comments)

http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@676
PS3, Line 676: '(?<=Probes: )\d+(\.\d+)?'
> The point I am trying to make here is that your regexes are overly complica
I feel my approach is more strict, it requires the number must directly follow 
"Probes:  ".  But your way is fine too, for our profile should not have 
ill-formatted data.  I will fix my tests.



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 24 Sep 2019 14:48:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-24 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h@651
PS3, Line 651:
> It looks like the comment you added was "This method should only be called
I do not know the code well, but from what I read, When spill or 
serializestreamforspill happens, there is not enough memory for hashtable, so 
the partition is spilled and hashtable is closed. Under this condition, this 
hashtable is not used by the query at all, and the stats should be all 
0(related to operations) or not relevant. So we should not add them to the 
query profile. And some other Close happens under error conditions when 
creating hashtable which should be added either.



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 24 Sep 2019 14:43:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-23 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h@651
PS3, Line 651: StatsCountersAdd
> But not all the close of hashtable should trigger the counters adding. I ad
It looks like the comment you added was "This method should only be called once 
for each HashTable and be called during closing the owner object of the 
HashTable". Still not following why this this is the case, why does it have to 
be called when the owner object is closed vs. when the HashTable is closed? I 
think the current implementation actually has a bug where the counters are not 
always properly updated. Looking at 
GroupingAggregator::Partition::SerializeStreamForSpilling, there are places 
where the HashTable is closed and reset, but no corresponding call to 
StatsCountersAdd, so any counters from that hash table look like they are lost.

Might be easier to discuss this in person at this point.


http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@676
PS3, Line 676: '(?<=Probes: )\d+(\.\d+)?'
> The form of this counter is:
The point I am trying to make here is that your regexes are overly complicated, 
there is no need for the positive lookahead. You don't need to bother parsing 
the '.' part of the counter, because the value in the parenthesis always shows 
the absolute value. I re-wrote this part of the test and confirmed that it 
works locally:

 nprobes = re.search('Probes:.*\((\d+)\)', runtime_profile)
 assert nprobes and len(nprobes.groups()) == 1 and nprobes.group(1) >= 0
 ntravel = re.search('Travel:.*\((\d+)\)', runtime_profile)
 assert ntravel and len(ntravel.groups()) == 1 and ntravel.group(1) >= 0



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 23 Sep 2019 22:14:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4617/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 23 Sep 2019 21:20:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-23 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 5:

(2 comments)

patch 5 fixed style issues.

http://gerrit.cloudera.org:8080/#/c/14234/4/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/14234/4/be/src/exec/hash-table.h@650
PS4, Line 650: arent_profile
> nit: parent_profile
Done


http://gerrit.cloudera.org:8080/#/c/14234/4/be/src/exec/hash-table.h@650
PS4, Line 650:   RuntimeProfile* parent_profile);
> nit: the formatting convention puts the open parenthesis on the previous li
Done



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 23 Sep 2019 20:40:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-23 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Sahil Takiar, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14234

to look at the new patch set (#5).

Change subject: IMPALA-7637: Add more hash table stats to profile
..

IMPALA-7637: Add more hash table stats to profile

Add hash table counters(probes, travel and resizes) to profile.
Put hash table stats into the child profile "hash table".

Tests:
Add new test test_query_profle_hashtable.
Ran exhaustive tests.

Profile Sample:
  Hash Join Builder (join_node_id=2):
...
Runtime filters: 1 of 1 Runtime Filter Published
- BuildRowsPartitionTime: 157.960us
- BuildRowsPartitioned: 100 (100)
- HashTablesBuildTime: 298.817us
- LargestPartitionPercent: 7 (7)
- MaxPartitionLevel: 0 (0)
- NumRepartitions: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 17.12 KB (17536)
- RepartitionTime: 0.000ns
- SpilledPartitions: 0 (0)
Hash Table:
- HashBuckets: 256 (256)
- HashCollisions: 0 (0)
- Probes: 2.52K (2520)
- Resizes: 0 (0)
- Travel: 1.79K (178

Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
---
M be/src/exec/grouping-aggregator-partition.cc
M be/src/exec/grouping-aggregator.cc
M be/src/exec/grouping-aggregator.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M tests/query_test/test_observability.py
8 files changed, 108 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/14234/5
--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 4: Code-Review+1

(2 comments)

Couple of nits, but i'm happy with this, will be useful for debugging things.

Thanks for extending to the aggregations, good to fill in that observability 
gap too.

http://gerrit.cloudera.org:8080/#/c/14234/4/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/14234/4/be/src/exec/hash-table.h@650
PS4, Line 650:   (RuntimeProfile* parentProfile);
nit: the formatting convention puts the open parenthesis on the previous line, 
i.e.

  AddHashtableCounters(
 ...)


http://gerrit.cloudera.org:8080/#/c/14234/4/be/src/exec/hash-table.h@650
PS4, Line 650: parentProfile
nit: parent_profile



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 4
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 23 Sep 2019 19:12:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4604/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 4
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 20 Sep 2019 14:06:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-20 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 3:

(6 comments)

Patch 4 addresses review issues.

http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h@643
PS3, Line 643: // Create profile and counters for HashTable stats and add to 
parent profile.
 :   /// Returns a HashTableStatsProfile object.
> would be good to document that this actually makes a child profile called "
Done


http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h@645
PS3, Line 645: CreateAndAddToProfile
> I'd suggest returning a unique_ptr, and using an unique_ptr to store it - t
Done


http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h@651
PS3, Line 651: StatsCountersAdd
> Not sure I follow, once the HashTable is closed, the internal counters are
But not all the close of hashtable should trigger the counters adding. I add 
more comments to explain the related case.


http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@677
PS3, Line 677: a
> Done
Done


http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@688
PS3, Line 688: execute_query
> Done
Done


http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@691
PS3, Line 691:
> Done
Done



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 20 Sep 2019 13:24:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-20 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Sahil Takiar, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14234

to look at the new patch set (#4).

Change subject: IMPALA-7637: Add more hash table stats to profile
..

IMPALA-7637: Add more hash table stats to profile

Add hash table counters(probes, travel and resizes) to profile.
Put hash table stats into the child profile "hash table".

Tests:
Add new test test_query_profle_hashtable.
Ran exhaustive tests.

Profile Sample:
  Hash Join Builder (join_node_id=2):
...
Runtime filters: 1 of 1 Runtime Filter Published
- BuildRowsPartitionTime: 157.960us
- BuildRowsPartitioned: 100 (100)
- HashTablesBuildTime: 298.817us
- LargestPartitionPercent: 7 (7)
- MaxPartitionLevel: 0 (0)
- NumRepartitions: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 17.12 KB (17536)
- RepartitionTime: 0.000ns
- SpilledPartitions: 0 (0)
Hash Table:
- HashBuckets: 256 (256)
- HashCollisions: 0 (0)
- Probes: 2.52K (2520)
- Resizes: 0 (0)
- Travel: 1.79K (178

Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
---
M be/src/exec/grouping-aggregator-partition.cc
M be/src/exec/grouping-aggregator.cc
M be/src/exec/grouping-aggregator.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M tests/query_test/test_observability.py
8 files changed, 108 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/14234/4
--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 4
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-19 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 3:

> Patch Set 3:
>
> (2 comments)
Give you an example for integer only case:
Hash Table:
   - HashCollisions: 0 (0)
   - Probes: 2 (2)
   - Resizes: 0 (0)
   - Travel: 0 (0)


--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 19 Sep 2019 20:01:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-19 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 3:

(5 comments)

> Patch Set 3:
>
> (2 comments)

. I need the number only to change to float, so I need the return to be 2.52 in 
your case.
. When the Probe is 0 or 18, there is no dot.

http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h@532
PS3, Line 532: HashTableStatsProfile
> what exactly is this struct suppose to encapsulate? would be good to add so
Done


http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h@534
PS3, Line 534:   RuntimeProfile *hashtable_profile = nullptr;
> nit: RuntimeProfile* hasbtable_profile = nullptr;
Done


http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@677
PS3, Line 677: a
> nit: an
Done


http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@688
PS3, Line 688: execute_query
> should assert that the query was actually successful:
Done


http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@691
PS3, Line 691:
> nit: extra space
Done



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 19 Sep 2019 19:51:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-19 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h@651
PS3, Line 651: StatsCountersAdd
> It is accumulated counters, it can not be added in the close method. Some c
Not sure I follow, once the HashTable is closed, the internal counters are no 
longer updated right? We actually already call HashTable::PrintStats() in the 
close method and there is a TODO in Close() that says 'These statistics should 
go to the runtime profile as well.'


http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@676
PS3, Line 676: '(?<=Probes: )\d+(\.\d+)?'
> I need to get the number only. For example 12 or 12.3. Your search seems wi
The form of this counter is:

 Probes: 2.52K (2520)

Right?

Then the regex:

 Probes:.*\((\d+)\)

Should work - https://regex101.com/r/NplYQs/1



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 19 Sep 2019 17:52:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-19 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 3:

(2 comments)

Answer some questions while I fix the review issues.

http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h@651
PS3, Line 651: StatsCountersAdd
> I think this kinda makes sense - we know the number of buckets earlier than
It is accumulated counters, it can not be added in the close method. Some close 
called from spill, for example, we did not add this counter at close for 
collisions number in the legacy code.


http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@676
PS3, Line 676: '(?<=Probes: )\d+(\.\d+)?'
> is the positive lookahead in this regex necessary? would the following rege
I need to get the number only. For example 12 or 12.3. Your search seems will 
get back whole string "Probes:  ...12.3" and will get nothing for integer case.



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 19 Sep 2019 13:06:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-18 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h@645
PS3, Line 645: CreateAndAddToProfile
> nit: AddHashTableCounters
I'd suggest returning a unique_ptr, and using an unique_ptr to store it - that 
documents the memory ownership unambiguously and will avoid any risk of memory 
leaks being introduced later


http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h@651
PS3, Line 651: StatsCountersAdd
> could all the counters just be updated in the Close method instead?
I think this kinda makes sense - we know the number of buckets earlier than we 
know the final values of the other stats.

It'd be good to mention in the method comments.



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 19 Sep 2019 00:41:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-18 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 3:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h@532
PS3, Line 532: HashTableStatsProfile
what exactly is this struct suppose to encapsulate? would be good to add some 
docs explaining what this struct is for. it looks like this tracks the stats of 
all hash tables created by a node, so it would be good to document the scope 
and lifecycle of this object.


http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h@534
PS3, Line 534:   RuntimeProfile *hashtable_profile = nullptr;
nit: RuntimeProfile* hasbtable_profile = nullptr;


http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h@643
PS3, Line 643: // Create profile and counters for HashTable stats and add to 
parent profile.
 :   /// Returns a HashTableStatsProfile object.
would be good to document that this actually makes a child profile called "Hash 
Table"


http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h@645
PS3, Line 645: CreateAndAddToProfile
nit: AddHashTableCounters


http://gerrit.cloudera.org:8080/#/c/14234/3/be/src/exec/hash-table.h@651
PS3, Line 651: StatsCountersAdd
could all the counters just be updated in the Close method instead?


http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@676
PS3, Line 676: '(?<=Probes: )\d+(\.\d+)?'
is the positive lookahead in this regex necessary? would the following regex 
not achieve the same thing:

 Probes:.*(\.\d+)


http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@677
PS3, Line 677: a
nit: an


http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@688
PS3, Line 688: execute_query
should assert that the query was actually successful:

 result = self.execute_query(query)
 assert result.success


http://gerrit.cloudera.org:8080/#/c/14234/3/tests/query_test/test_observability.py@691
PS3, Line 691:
nit: extra space



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Wed, 18 Sep 2019 22:00:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4585/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Wed, 18 Sep 2019 13:47:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4584/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 2
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Wed, 18 Sep 2019 13:27:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-18 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Sahil Takiar, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14234

to look at the new patch set (#3).

Change subject: IMPALA-7637: Add more hash table stats to profile
..

IMPALA-7637: Add more hash table stats to profile

Add hash table counters(probes, travel and resizes) to profile.
Put hash table stats into the child profile "hash table".

Tests:
Add new test test_query_profle_hashtable.
Ran exhaustive tests.

Profile Sample:
  Hash Join Builder (join_node_id=2):
...
Runtime filters: 1 of 1 Runtime Filter Published
- BuildRowsPartitionTime: 157.960us
- BuildRowsPartitioned: 100 (100)
- HashTablesBuildTime: 298.817us
- LargestPartitionPercent: 7 (7)
- MaxPartitionLevel: 0 (0)
- NumRepartitions: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 17.12 KB (17536)
- RepartitionTime: 0.000ns
- SpilledPartitions: 0 (0)
Hash Table:
- HashBuckets: 256 (256)
- HashCollisions: 0 (0)
- Probes: 2.52K (2520)
- Resizes: 0 (0)
- Travel: 1.79K (178

Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
---
M be/src/exec/grouping-aggregator-partition.cc
M be/src/exec/grouping-aggregator.cc
M be/src/exec/grouping-aggregator.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M tests/query_test/test_observability.py
8 files changed, 97 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/14234/3
--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-18 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 2:

(3 comments)

New patch refactoring code to address the review issues, add support for group 
aggregation profiling, add more check for tests and re-ran exhaustive tests

http://gerrit.cloudera.org:8080/#/c/14234/1/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/14234/1/be/src/exec/hash-table.h@741
PS1, Line 741:   /// Returns an iterator at the beginning of the hash table.  
Advancing this iterator
> nit: we usually use lower_case for trivial accessor functions (this is allo
Agree, number of buckets follows the pattern.  I removed these new methods for 
they are no longer needed.


http://gerrit.cloudera.org:8080/#/c/14234/1/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/14234/1/be/src/exec/partitioned-hash-join-builder.cc@595
PS1, Line 595: }
> Found the place in grouping-aggregator, I will put the similar code in a co
Done


http://gerrit.cloudera.org:8080/#/c/14234/1/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/14234/1/tests/query_test/test_observability.py@675
PS1, Line 675: assert "Resizes:" in runtime_profile
 : nprobes = re.search('(?<=Probes: )\d+(\.\d+)?', 
runtime_profile)
 : # Probes and travel can be 0. The number can be a integer or 
float with K
 : assert float(nprobes.group(0)) >= 0
 : ntravel = re.search('(?<=Travel: )\d
> That's a good point. I think it would be easy with the way the code is stru
Add assert on values, I just found , at least in aggregation, every counter 
value can be 0.
Also, add the test for group by queries.



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 2
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Wed, 18 Sep 2019 13:05:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/14234/2/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/14234/2/be/src/exec/hash-table.cc@425
PS2, Line 425:   statsProfile->num_hash_collisions_ = 
ADD_COUNTER(hashtable_profile, "HashCollisions", TUnit::UNIT);
line too long (101 > 90)


http://gerrit.cloudera.org:8080/#/c/14234/2/be/src/exec/hash-table.cc@426
PS2, Line 426:   statsProfile->num_hash_buckets_ = 
ADD_COUNTER(hashtable_profile, "HashBuckets", TUnit::UNIT);
line too long (95 > 90)


http://gerrit.cloudera.org:8080/#/c/14234/2/be/src/exec/hash-table.cc@427
PS2, Line 427:   statsProfile->num_hash_resizes_ = 
ADD_COUNTER(hashtable_profile, "Resizes", TUnit::UNIT);
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/14234/2/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/14234/2/tests/query_test/test_observability.py@677
PS2, Line 677:
flake8: W291 trailing whitespace


http://gerrit.cloudera.org:8080/#/c/14234/2/tests/query_test/test_observability.py@677
PS2, Line 677: # Probes and travel can be 0. The number can be a integer or 
float with K
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/14234/2/tests/query_test/test_observability.py@683
PS2, Line 683: "
flake8: E501 line too long (91 > 90 characters)



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 2
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Wed, 18 Sep 2019 12:48:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-18 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Sahil Takiar, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14234

to look at the new patch set (#2).

Change subject: IMPALA-7637: Add more hash table stats to profile
..

IMPALA-7637: Add more hash table stats to profile

Add hash table counters(probes, travel and resizes) to profile.
Put hash table stats into the child profile "hash table".

Tests:
Add new test test_query_profle_hashtable.
Ran exhaustive tests.

Profile Sample:
  Hash Join Builder (join_node_id=2):
...
Runtime filters: 1 of 1 Runtime Filter Published
- BuildRowsPartitionTime: 157.960us
- BuildRowsPartitioned: 100 (100)
- HashTablesBuildTime: 298.817us
- LargestPartitionPercent: 7 (7)
- MaxPartitionLevel: 0 (0)
- NumRepartitions: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 17.12 KB (17536)
- RepartitionTime: 0.000ns
- SpilledPartitions: 0 (0)
Hash Table:
- HashBuckets: 256 (256)
- HashCollisions: 0 (0)
- Probes: 2.52K (2520)
- Resizes: 0 (0)
- Travel: 1.79K (178

Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
---
M be/src/exec/grouping-aggregator-partition.cc
M be/src/exec/grouping-aggregator.cc
M be/src/exec/grouping-aggregator.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M tests/query_test/test_observability.py
8 files changed, 91 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/14234/2
--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 2
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-16 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14234/1/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/14234/1/be/src/exec/partitioned-hash-join-builder.cc@595
PS1, Line 595: void PhjBuilder::AddHashTableStatsToProfile(RuntimeProfile 
*profile) {
> The counters cannot be in the HashTable class. For each  PhjBuilder object,
Found the place in grouping-aggregator, I will put the similar code in a common 
method, it seems only HashTable possible.



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 16 Sep 2019 21:48:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-16 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14234/1/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/14234/1/be/src/exec/partitioned-hash-join-builder.cc@595
PS1, Line 595: void PhjBuilder::AddHashTableStatsToProfile(RuntimeProfile 
*profile) {
> The HashTable class is also used in the hash aggregation (grouping-aggregat
The counters cannot be in the HashTable class. For each  PhjBuilder object, it 
can have more than one hash table. For counters and profile object are all in 
PhjBuilder,  it is more natural the method is in the class.  I will do more 
research on where to profile grouping-aggregator.



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 16 Sep 2019 20:03:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14234/1/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/14234/1/tests/query_test/test_observability.py@675
PS1, Line 675: assert "Hash Table" in runtime_profile
 : assert "Probes:" in runtime_profile
 : assert "Travel:" in runtime_profile
 : assert "HashCollisions:" in runtime_profile
 : assert "Resizes:" in runtime_profile
> would be good to validate that they have valid values - e.g. they are non-z
That's a good point. I think it would be easy with the way the code is 
structured for us to double-count some of these things by moving code around, 
so some kind of range check might be useful too (I'm fairly flexible on this 
though since we've generally not validated profile counters for pragmatic 
reasons).



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 16 Sep 2019 17:38:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-16 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14234/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14234/1//COMMIT_MSG@17
PS1, Line 17:   Hash Join Builder (join_node_id=2):
: ExecOption: Codegen Disabled: disabled due to 
optimization hints
: Runtime filters: 1 of 1 Runtime Filter Published
:  - BuildRowsPartitionTime: 157.960us
:  - BuildRowsPartitioned: 100 (100)
:  - HashTablesBuildTime: 298.817us
:  - LargestPartitionPercent: 7 (7)
:  - MaxPartitionLevel: 0 (0)
:  - NumRepartitions: 0 (0)
:  - PartitionsCreated: 16 (16)
:  - PeakMemoryUsage: 17.12 KB (17536)
:  - RepartitionTime: 0.000ns
:  - SpilledPartitions: 0 (0)
: Hash Table:
:- HashBuckets: 256 (256)
:- HashCollisions: 0 (0)
:- Probes: 2.52K (2520)
:- Resizes: 0 (0)
:- Travel: 1.79K (1787)
nit: don't think this has to be indented so much, just two spaces is probably 
sufficient, plus the `ExecOption` line is exceeding the max width of the commit 
message.


http://gerrit.cloudera.org:8080/#/c/14234/1/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/14234/1/tests/query_test/test_observability.py@675
PS1, Line 675: assert "Hash Table" in runtime_profile
 : assert "Probes:" in runtime_profile
 : assert "Travel:" in runtime_profile
 : assert "HashCollisions:" in runtime_profile
 : assert "Resizes:" in runtime_profile
would be good to validate that they have valid values - e.g. they are non-zero. 
not sure that would apply to all of these (e.g. resizes), but it would be good 
to add the validation for the ones that are non-zero.



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 16 Sep 2019 16:55:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 1:

(2 comments)

Looks good, I had a couple of requests for cleanup that were problems with the 
pre-existing code.

http://gerrit.cloudera.org:8080/#/c/14234/1/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/14234/1/be/src/exec/hash-table.h@741
PS1, Line 741:   int64_t NumHashCollisions() const { return 
num_hash_collisions_; }
nit: we usually use lower_case for trivial accessor functions (this is allowed 
by the google style guide - see 
https://google.github.io/styleguide/cppguide.html#Function_Names)

This was a pre-existing pattern in this file, but I think now is a good time to 
make it consistent with the rest of the codebase.


http://gerrit.cloudera.org:8080/#/c/14234/1/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/14234/1/be/src/exec/partitioned-hash-join-builder.cc@595
PS1, Line 595: void PhjBuilder::AddHashTableStatsToProfile(RuntimeProfile 
*profile) {
The HashTable class is also used in the hash aggregation 
(grouping-aggregator.cc), so we should try to also include the stats there too.

I think it would make more sense to have helper methods in HashTable to create 
and update the profile counters, so that it can be easily reused.

It looks like this pattern of having counters in the hash join but not the agg 
was already there, but it would be really nice to improve it at this point.



--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 16 Sep 2019 16:19:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14234 )

Change subject: IMPALA-7637: Add more hash table stats to profile
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4565/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 16 Sep 2019 14:00:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7637: Add more hash table stats to profile

2019-09-16 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14234


Change subject: IMPALA-7637: Add more hash table stats to profile
..

IMPALA-7637: Add more hash table stats to profile

Add hash table counters(probes, travel and resizes) to profile.
Put hash table stats into the child profile "hash table".

Tests:
Add new test test_query_profle_hashtable.
Ran exhaustive tests.

Profile Sample:
  Hash Join Builder (join_node_id=2):
ExecOption: Codegen Disabled: disabled due to optimization hints
Runtime filters: 1 of 1 Runtime Filter Published
 - BuildRowsPartitionTime: 157.960us
 - BuildRowsPartitioned: 100 (100)
 - HashTablesBuildTime: 298.817us
 - LargestPartitionPercent: 7 (7)
 - MaxPartitionLevel: 0 (0)
 - NumRepartitions: 0 (0)
 - PartitionsCreated: 16 (16)
 - PeakMemoryUsage: 17.12 KB (17536)
 - RepartitionTime: 0.000ns
 - SpilledPartitions: 0 (0)
Hash Table:
   - HashBuckets: 256 (256)
   - HashCollisions: 0 (0)
   - Probes: 2.52K (2520)
   - Resizes: 0 (0)
   - Travel: 1.79K (1787)

Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
---
M be/src/exec/hash-table.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M tests/query_test/test_observability.py
4 files changed, 48 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/14234/1
--
To view, visit http://gerrit.cloudera.org:8080/14234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1fd875dd1af8031242fd5f5ff554d3a71aaa6f87
Gerrit-Change-Number: 14234
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen