[Impala-ASF-CR] IMPALA-9381: lazy conversion of runtime profile

2020-02-24 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15236 )

Change subject: IMPALA-9381: lazy conversion of runtime profile
..


Patch Set 3: Code-Review+1

This is so good! I want to do the same when I wrote the runtimeprofile ToJson() 
code. But I was so lack of the code base and C++ knowledge so I couldn't find a 
way to do it. This is much more efficient!


--
To view, visit http://gerrit.cloudera.org:8080/15236
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2f5133cc146adc3b044cf4b64aae0a9688449fa
Gerrit-Change-Number: 15236
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Comment-Date: Mon, 24 Feb 2020 18:38:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-12-14 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14776 )

Change subject: IMPALA-7550: Add documentation to profile counters
..


Patch Set 11:

Hi Tim,
Seems like there is a test failure 
JdbcTest.testConcurrentSessionMixedIdleTimeout:656

Not sure if this is a flaky test or something. Can we try to run the test again?


--
To view, visit http://gerrit.cloudera.org:8080/14776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
Gerrit-Change-Number: 14776
Gerrit-PatchSet: 11
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 14 Dec 2019 18:55:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-12-13 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14776 )

Change subject: IMPALA-7550: Add documentation to profile counters
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14776/8/be/src/util/default-path-handlers.cc
File be/src/util/default-path-handlers.cc:

http://gerrit.cloudera.org:8080/#/c/14776/8/be/src/util/default-path-handlers.cc@54
PS8, Line 54: "The maximum number of bytes to display on the debug 
webserver's log page");
> Oh right, that happens if you have a declaration in a header, but no defini
Always good to learn knowledge like these. Thanks!



--
To view, visit http://gerrit.cloudera.org:8080/14776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
Gerrit-Change-Number: 14776
Gerrit-PatchSet: 10
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 13 Dec 2019 20:13:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-12-13 Thread Jiawei Wang (Code Review)
Hello Andrew Sherman, Lars Volker, David Rorke, Balazs Jeszenszky, Jiawei Wang, 
Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14776

to look at the new patch set (#10).

Change subject: IMPALA-7550: Add documentation to profile counters
..

IMPALA-7550: Add documentation to profile counters

This change changes the way developers define profile counters by
generating counters from a counter registry. All the profile counters
will be register there first and then used in the same way as before.
By doing so, we will be able to manage profile counters in a way that
we can define profile counters documentation. For example:

Declaration:
PROFILE_DEFINE_COUNTER(NumBackends, STABLE_HIGH, TUnit::UNIT,
"Number of backends running this query.");

Initialization:
COUNTER_SET(PROFILE_NumBackends.Instantiate(query_profile_), num_backends);

This shall be how we define a NumBackends counter. It follows with
its significance, type, description in the declaration part.

Users now will be able to view profile counters documentation under
query_profile page, there is a Profile Documentation button which
leads to /profile_docs.

More details:
This commit did the following refactors on profile counters.
1. Add a singleton registry for runtime profile counters prototypes,
similiar to what Kudu does for metrics. This allows us to generate
profile documentation for all counters from the code. We add
/profile_docs and a correspoding UI for the documentation of profile
counters.

2. Profile counters are also annotated with their significance to users.
* STABLE_HIGH - High level and stable counters, always useful for measuring
query performance and status. Counters that everyone is interested. should
rarely change and if it does we will make some effort to notify users.

* STABLE_LOW - Low level and stable counters. Interesting counters to monitor
 and analyze by machine. It will probably be interesting under some
 circumstances for users.

* Unstable - Unstable but useful. Useful to understand query performance,
but subject to change, particularly if the implementation changes.
E.g. MaterializeTupleTimer

* Debug -  Debugging counters. Generally not useful to users of Impala,
 the main use case is low-level debugging. Can be hidden to reduce noise
 for most consumers of profiles.

3. We have around 250 counters. This commit did the replacement in
scan-node and hdfs-scan-node-base and coordinator.

Concers:
The downside is that we will have duplicate comments of query profiles
both in the header file and the .cc file.
Additionally a (arguably good) limitation is that profile counter names
need to be unique.

Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
---
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator.cc
D be/src/util/debug-counters.h
M be/src/util/default-path-handlers.cc
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
A www/profile_docs.tmpl
M www/query_profile.tmpl
15 files changed, 678 insertions(+), 273 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/14776/10
--
To view, visit http://gerrit.cloudera.org:8080/14776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
Gerrit-Change-Number: 14776
Gerrit-PatchSet: 10
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-12-13 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14776 )

Change subject: IMPALA-7550: Add documentation to profile counters
..


Patch Set 9:

(4 comments)

Hi Tim, thanks for the review!

http://gerrit.cloudera.org:8080/#/c/14776/8/be/src/util/default-path-handlers.cc
File be/src/util/default-path-handlers.cc:

http://gerrit.cloudera.org:8080/#/c/14776/8/be/src/util/default-path-handlers.cc@54
PS8, Line 54: constexpr ProfileEntryPrototype::Significance 
ProfileEntryPrototype::ALLSIGNIFICANCE[];
> Include util/runtime-profile-counters.h instead of redeclaring it.
I have tried this way but it won't compile... The error is error: undefined 
reference to 'impala::ProfileEntryPrototype::ALLSIGNIFICANCE'

I am not sure if I did it wrong?


http://gerrit.cloudera.org:8080/#/c/14776/8/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/14776/8/be/src/util/runtime-profile.cc@109
PS8, Line 109: fo
> nit: for
Done


http://gerrit.cloudera.org:8080/#/c/14776/8/be/src/util/runtime-profile.cc@115
PS8, Line 115: ";
> Maybe remove this sentence?
Done


http://gerrit.cloudera.org:8080/#/c/14776/8/be/src/util/runtime-profile.cc@115
PS8, Line 115: circumstance
> nit: circumstances
Done



--
To view, visit http://gerrit.cloudera.org:8080/14776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
Gerrit-Change-Number: 14776
Gerrit-PatchSet: 9
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 13 Dec 2019 20:04:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-12-13 Thread Jiawei Wang (Code Review)
Hello Andrew Sherman, Lars Volker, David Rorke, Balazs Jeszenszky, Jiawei Wang, 
Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14776

to look at the new patch set (#9).

Change subject: IMPALA-7550: Add documentation to profile counters
..

IMPALA-7550: Add documentation to profile counters

This change changes the way developers define profile counters by
generating counters from a counter registry. All the profile counters
will be register there first and then used in the same way as before.
By doing so, we will be able to manage profile counters in a way that
we can define profile counters documentation. For example:

Declaration:
PROFILE_DEFINE_COUNTER(NumBackends, STABLE_HIGH, TUnit::UNIT,
"Number of backends running this query.");

Initialization:
COUNTER_SET(PROFILE_NumBackends.Instantiate(query_profile_), num_backends);

This shall be how we define a NumBackends counter. It follows with
its significance, type, description in the declaration part.

Users now will be able to view profile counters documentation under
query_profile page, there is a Profile Documentation button which
leads to /profile_docs.

More details:
This commit did the following refactors on profile counters.
1. Add a singleton registry for runtime profile counters prototypes,
similiar to what Kudu does for metrics. This allows us to generate
profile documentation for all counters from the code. We add
/profile_docs and a correspoding UI for the documentation of profile
counters.

2. Profile counters are also annotated with their significance to users.
* STABLE_HIGH - High level and stable counters, always useful for measuring
query performance and status. Counters that everyone is interested. should
rarely change and if it does we will make some effort to notify users.

* STABLE_LOW - Low level and stable counters. Interesting counters to monitor
 and analyze by machine. It will probably be interesting under some
 circumstances for users.

* Unstable - Unstable but useful. Useful to understand query performance,
but subject to change, particularly if the implementation changes.
E.g. MaterializeTupleTimer

* Debug -  Debugging counters. Generally not useful to users of Impala,
 the main use case is low-level debugging. Can be hidden to reduce noise
 for most consumers of profiles.

3. We have around 250 counters. This commit did the replacement in
scan-node and hdfs-scan-node-base and coordinator.

Concers:
The downside is that we will have duplicate comments of query profiles
both in the header file and the .cc file.
Additionally a (arguably good) limitation is that profile counter names
need to be unique.

Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
---
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator.cc
D be/src/util/debug-counters.h
M be/src/util/default-path-handlers.cc
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
A www/profile_docs.tmpl
M www/query_profile.tmpl
15 files changed, 676 insertions(+), 273 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/14776/9
--
To view, visit http://gerrit.cloudera.org:8080/14776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
Gerrit-Change-Number: 14776
Gerrit-PatchSet: 9
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-12-11 Thread Jiawei Wang (Code Review)
Hello Andrew Sherman, Lars Volker, David Rorke, Balazs Jeszenszky, Jiawei Wang, 
Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14776

to look at the new patch set (#8).

Change subject: IMPALA-7550: Add documentation to profile counters
..

IMPALA-7550: Add documentation to profile counters

This change changes the way developers define profile counters by
generating counters from a counter registry. All the profile counters
will be register there first and then used in the same way as before.
By doing so, we will be able to manage profile counters in a way that
we can define profile counters documentation. For example:

Declaration:
PROFILE_DEFINE_COUNTER(NumBackends, STABLE_HIGH, TUnit::UNIT,
"Number of backends running this query.");

Initialization:
COUNTER_SET(PROFILE_NumBackends.Instantiate(query_profile_), num_backends);

This shall be how we define a NumBackends counter. It follows with
its significance, type, description in the declaration part.

Users now will be able to view profile counters documentation under
query_profile page, there is a Profile Documentation button which
leads to /profile_docs.

More details:
This commit did the following refactors on profile counters.
1. Add a singleton registry for runtime profile counters prototypes,
similiar to what Kudu does for metrics. This allows us to generate
profile documentation for all counters from the code. We add
/profile_docs and a correspoding UI for the documentation of profile
counters.

2. Profile counters are also annotated with their significance to users.
* STABLE_HIGH - High level and stable counters, always useful on measuring
query performance and status. Counters that everyone is interested. should
rarely change and if it does we will make some effort to notify users.

* STABLE_LOW - Low level and stable counters. Interesting counters to monitor
 and analyze by machine. It will probably be interesting under some
 circumstance for users. Lots of developers are interested.

* Unstable - Unstable but useful. Useful to understand query performance,
but subject to change, particularly if the implementation changes.
E.g. MaterializeTupleTimer

* Debug -  Debugging counters. Generally not useful to users of Impala,
 the main use case is low-level debugging. Can be hidden to reduce noise
 for most consumers of profiles.

3. We have around 250 counters. This commit did the replacement in
scan-node and hdfs-scan-node-base and coordinator.

Concers:
The downside is that we will have duplicate comments of query profiles
both in the header file and the .cc file.
Additionally a (arguably good) limitation is that profile counter names
need to be unique.

Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
---
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator.cc
D be/src/util/debug-counters.h
M be/src/util/default-path-handlers.cc
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
A www/profile_docs.tmpl
M www/query_profile.tmpl
15 files changed, 676 insertions(+), 273 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/14776/8
--
To view, visit http://gerrit.cloudera.org:8080/14776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
Gerrit-Change-Number: 14776
Gerrit-PatchSet: 8
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-12-11 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14776 )

Change subject: IMPALA-7550: Add documentation to profile counters
..


Patch Set 7:

(4 comments)

Hi Tim, Balazs
Thanks for the review! Here is another round code changes.

http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@72
PS5, Line 72:
: PROFILE_DEFINE_DERIVED_COUNTER(PerReadThreadRawHdfsThrou
> As mentioned for the previous counter, all 'total wall clock time' metrics
Done, also changed to the same pattern on TotalRawHBaseReadTime. Run "git grep" 
and did not find other places that have the same descriptions.


http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@85
PS5, Line 85: average nu
> Yep, there's a single queue (with multiple I/O threads consuming work from
Okay, then the original text makes sense here I think.


http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/runtime/coordinator.cc@88
PS5, Line 88: ry in exchange no
> I was recommending '(across the network)' in parens specifically because it
Thanks for explanation. Done


http://gerrit.cloudera.org:8080/#/c/14776/5/www/profile_docs.tmpl
File www/profile_docs.tmpl:

http://gerrit.cloudera.org:8080/#/c/14776/5/www/profile_docs.tmpl@29
PS5, Line 29: Name
:   Significance
:   Unit
:   Description
: 
:   
:   
: {{#profile_docs}}
: 
:   
:   {{name}}
:   
:   
:   {{significance}}
:   
:   
:   {{unit}}
:   
:   
:   {{description}}
:   
: 
> IIUC, right now, the table will only have the significance level (e.g. STAB
Done. Add the section.
Screenshot here:
https://drive.google.com/open?id=1hu1tRe2fFwrb_CIkce_g7sflI7FVZTLk



--
To view, visit http://gerrit.cloudera.org:8080/14776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
Gerrit-Change-Number: 14776
Gerrit-PatchSet: 7
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 11 Dec 2019 23:18:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-12-09 Thread Jiawei Wang (Code Review)
Hello Andrew Sherman, Lars Volker, David Rorke, Balazs Jeszenszky, Jiawei Wang, 
Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14776

to look at the new patch set (#7).

Change subject: IMPALA-7550: Add documentation to profile counters
..

IMPALA-7550: Add documentation to profile counters

This change changes the way developers define profile counters by
generating counters from a counter registry. All the profile counters
will be register there first and then used in the same way as before.
By doing so, we will be able to manage profile counters in a way that
we can define profile counters documentation. For example:

Declaration:
PROFILE_DEFINE_COUNTER(NumBackends, STABLE_HIGH, TUnit::UNIT,
"Number of backends running this query.");

Initialization:
COUNTER_SET(PROFILE_NumBackends.Instantiate(query_profile_), num_backends);

This shall be how we define a NumBackends counter. It follows with
its significance, type, description in the declaration part.

Users now will be able to view profile counters documentation under
query_profile page, there is a Profile Documentation button which
leads to /profile_docs.

More details:
This commit did the following refactors on profile counters.
1. Add a singleton registry for runtime profile counters prototypes,
similiar to what Kudu does for metrics. This allows us to generate
profile documentation for all counters from the code. We add
/profile_docs and a correspoding UI for the documentation of profile
counters.

2. Profile counters are also annotated with their significance to users.
* STABLE_HIGH - High level and stable counters, always useful on measuring
query performance and status. Counters that everyone is interested. should
rarely change and if it does we will make some effort to notify users.

* STABLE_LOW - Low level and stable counters. Interesting counters to monitor
 and analyze by machine. It will probably be interesting under some
 circumstance for users. Lots of developers are interested.

* Unstable - Unstable but useful. Useful to understand query performance,
but subject to change, particularly if the implementation changes.
E.g. MaterializeTupleTimer

* Debug -  Debugging counters. Generally not useful to users of Impala,
 the main use case is low-level debugging. Can be hidden to reduce noise
 for most consumers of profiles.

3. We have around 250 counters. This commit did the replacement in
scan-node and hdfs-scan-node-base and coordinator.

Concers:
The downside is that we will have duplicate comments of query profiles
both in the header file and the .cc file.
Additionally a (arguably good) limitation is that profile counter names
need to be unique.

Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
---
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator.cc
D be/src/util/debug-counters.h
M be/src/util/default-path-handlers.cc
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
A www/profile_docs.tmpl
M www/query_profile.tmpl
15 files changed, 598 insertions(+), 273 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/14776/7
--
To view, visit http://gerrit.cloudera.org:8080/14776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
Gerrit-Change-Number: 14776
Gerrit-PatchSet: 7
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-12-09 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14776 )

Change subject: IMPALA-7550: Add documentation to profile counters
..


Patch Set 7:

(7 comments)

Hi Tim
Thanks for the feedback. Have adopted the changes.

http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@79
PS5, Line 79: d collection items read by the scan. Only created f
> Yeah you should remove this, it's only included in scan nodes where its rel
Done


http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@85
PS5, Line 85: average nu
> maybe "each type of remote filesystem"?
Another question
"Is HDFS remote read overall a single disk queue, or individual remote read 
targets? Same for S3"


http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@88
PS5, Line 88: ty of the sys
> I think I wrote the original text and it's hard to quantity - it's really t
Done


http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@121
PS5, Line 121:  time waiting for I/O instead of "
 : "processing data. Note that this includes the
> traditional here means MT_DOP = 0. I guess this is kinda an implementation
Done


http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/scan-node.cc
File be/src/exec/scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/scan-node.cc@58
PS5, Line 58: COUNTER(RowsRead, STABLE_HIGH, TUnit::UNIT, "Number of top-level "
: "rows/tuples read fr
> Yeah you could drop the "only implemented" sentence, since it won't appear
Done


http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/runtime/coordinator.cc@97
PS5, Line 97: OUBLE_VALUE,
: "The ratio between TotalScanByteSent and TotalBytesRead, i.e
> Yeah I think it's comparing apples and oranges anyway, since one is on-disk
Remove the sentence "i.e" for now. Maybe Lars can give some thoughts.


http://gerrit.cloudera.org:8080/#/c/14776/5/www/query_profile.tmpl
File www/query_profile.tmpl:

http://gerrit.cloudera.org:8080/#/c/14776/5/www/query_profile.tmpl@40
PS5, Line 40: iv>
:
: {{profile}}
:
: 

[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-12-05 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14776 )

Change subject: IMPALA-7550: Add documentation to profile counters
..


Patch Set 6:

(27 comments)

Hi Balazs,

Thanks so much for help correct the profile counters descriptions!

That's really helpful! Agree that we need to keep code out of stable counters 
descriptions. Unfortunately, I only have so little knowledge on the scan 
counters so that a lot of counters I am not sure how to fix the description. 
Ask @Tim Armstrong and @Lars to help take a look on that.

Also, if you have strong options how to update the counters. Please feel free 
to let me know. Have already applied a few suggestions you made.

Thanks
Jiawei

http://gerrit.cloudera.org:8080/#/c/14776/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14776/5//COMMIT_MSG@37
PS5, Line 37: 2. Profile counters are also annotated with their significance to 
users.
: * STABLE_HIGH - High level and stable counters, always useful on 
measuring
: query performance and status. Counters that everyone is 
interested. should
: rarely change and if it does we will make some effort to notify 
users.
:
: * STABLE_LOW - Low level and stable counters. Interesting 
counters to monitor
:  and analyze by machine. It will probably be interesting under 
some
:  circumstance for users. Lots of developers are interested.
:
: * Unstable - Unstable but useful. Useful to understand query 
performance,
: but subject to change, particularly if the implementation changes.
: E.g. MaterializeTupleTimer
:
: * Debug -  Debugging counters. Generally not useful to users of 
Impala,
:  the main use case is low-level debugging. Can be hidden to 
reduce noise
:  for most consumers of profiles.
:
: 3. We have around 250 counters. This commit did the replacement in
: scan-node and hdfs-scan-node-base and coordinator.
:
> Please simplify this to what's in the code - I think the descriptions in Si
Done


http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@70
PS5, Line 70: across all Disk I/O threads in HDFS read operations.");
: PROFILE_DEFINE_TIMER(TotalRawHdfsOpenFileTime, STABLE_LOW, "The 
tota
> Instead of including an explanation in individual description (but not in e
Done


http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@79
PS5, Line 79: lectionItemsRead, STABLE_LOW, TUnit::UNIT,
> Is this included in profiles where it's irrelevant? If no, remove this to a
Not sure about it.
@Lars and @Tarmstrong maybe can help to take a look?


http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@85
PS5, Line 85: _DEFINE_SA
> Maybe '...remote data source...'? Disk queue is not a term users will be fa
@Lars and @Tarmstrong


http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@88
PS5, Line 88: ty of the sys
> Than what? - Is there a point of comparison we can provide? For example, 'v
Not sure... if someone have better knowledge than this might want to fix it...


http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@89
PS5, Line 89:  the s
> scan
Done


http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@90
PS5, Line 90: it::BYTES,
> I don't think we should speculate on root causes - there can be many reason
Done


http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@121
PS5, Line 121:  the scanner thread was ready to process "
 : "the data. High values show that scanner threa
> Isn't this the same as 'HDFS scans'?
IDK to be honest... @Tarmstrong and @Lars


http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@125
PS5, Line 125: pressedBytesReadPerColumn, STABLE_LOW,
 : TUnit::BYTES, "Stats a
> replace: Note that this includes the time when
Done


http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@130
PS5, Line 130: , "Stats about the number of compressed bytes read per column. "
 : "Each sample in the counter is the size of a single column 
that is scanned by the "
 : "scan node.");
> remove?
Done


http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@136
PS5, Line 136:  of data cache partially hit");
 : PROFILE_DEFINE_COUNTER(DataCacheMissCount, STABLE_HIGH, 
TUnit::UNIT,
 : "Total count of data
> remove?
Done


http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/scan-node.cc
File be/src/exec/scan-node.cc:

ht

[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-12-05 Thread Jiawei Wang (Code Review)
Hello Andrew Sherman, Lars Volker, David Rorke, Balazs Jeszenszky, Jiawei Wang, 
Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14776

to look at the new patch set (#6).

Change subject: IMPALA-7550: Add documentation to profile counters
..

IMPALA-7550: Add documentation to profile counters

This change changes the way developers define profile counters by
generating counters from a counter registry. All the profile counters
will be register there first and then used in the same way as before.
By doing so, we will be able to manage profile counters in a way that
we can define profile counters documentation. For example:

Declaration:
PROFILE_DEFINE_COUNTER(NumBackends, STABLE_HIGH, TUnit::UNIT,
"Number of backends running this query.");

Initialization:
COUNTER_SET(PROFILE_NumBackends.Instantiate(query_profile_), num_backends);

This shall be how we define a NumBackends counter. It follows with
its significance, type, description in the declaration part.

Users now will be able to view profile counters documentation under
query_profile page, there is a Profile Documentation button which
leads to /profile_docs.

More details:
This commit did the following refactors on profile counters.
1. Add a singleton registry for runtime profile counters prototypes,
similiar to what Kudu does for metrics. This allows us to generate
profile documentation for all counters from the code. We add
/profile_docs and a correspoding UI for the documentation of profile
counters.

2. Profile counters are also annotated with their significance to users.
* STABLE_HIGH - High level and stable counters, always useful on measuring
query performance and status. Counters that everyone is interested. should
rarely change and if it does we will make some effort to notify users.

* STABLE_LOW - Low level and stable counters. Interesting counters to monitor
 and analyze by machine. It will probably be interesting under some
 circumstance for users. Lots of developers are interested.

* Unstable - Unstable but useful. Useful to understand query performance,
but subject to change, particularly if the implementation changes.
E.g. MaterializeTupleTimer

* Debug -  Debugging counters. Generally not useful to users of Impala,
 the main use case is low-level debugging. Can be hidden to reduce noise
 for most consumers of profiles.

3. We have around 250 counters. This commit did the replacement in
scan-node and hdfs-scan-node-base and coordinator.

Concers:
The downside is that we will have duplicate comments of query profiles
both in the header file and the .cc file.
Additionally a (arguably good) limitation is that profile counter names
need to be unique.

Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
---
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator.cc
D be/src/util/debug-counters.h
M be/src/util/default-path-handlers.cc
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
A www/profile_docs.tmpl
M www/queries.tmpl
M www/query_profile.tmpl
16 files changed, 605 insertions(+), 273 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/14776/6
--
To view, visit http://gerrit.cloudera.org:8080/14776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
Gerrit-Change-Number: 14776
Gerrit-PatchSet: 6
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-12-04 Thread Jiawei Wang (Code Review)
Hello Andrew Sherman, Lars Volker, David Rorke, Balazs Jeszenszky, Jiawei Wang, 
Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14776

to look at the new patch set (#5).

Change subject: IMPALA-7550: Add documentation to profile counters
..

IMPALA-7550: Add documentation to profile counters

This change changes the way developers define profile counters by
generating counters from a counter registry. All the profile counters
will be register there first and then used in the same way as before.
By doing so, we will be able to manage profile counters in a way that
we can define profile counters documentation. For example:

Declaration:
PROFILE_DEFINE_COUNTER(NumBackends, STABLE_HIGH, TUnit::UNIT,
"Number of backends running this query.");

Initialization:
COUNTER_SET(PROFILE_NumBackends.Instantiate(query_profile_), num_backends);

This shall be how we define a NumBackends counter. It follows with
its significance, type, description in the declaration part.

Users now will be able to view profile counters documentation under
query_profile page, there is a Profile Documentation button which
leads to /profile_docs.

More details:
This commit did the following refactors on profile counters.
1. Add a singleton registry for runtime profile counters prototypes,
similiar to what Kudu does for metrics. This allows us to generate
profile documentation for all counters from the code. We add
/profile_docs and a correspoding UI for the documentation of profile
counters.

2. Profile counters are annotated with their stability:
* Stable counters - generally useful to understand query performance,
should only change rarely and if it does we'll make some effort to
notify users. E.g. BytesRead.
* Unstable but useful - useful to understand query performance, but
subject to change, particularly if the implementation changes. E.g.
RowBatchQueuePutWaitTime, MaterializeTupleTimer
* Debugging counters - generally not useful to users of Impala, the main
use case is low-level debugging. Can be hidden to reduce noise for most
consumers of profiles.

3. Profile counters are also annotated with their significance to users.
* Critical level counters - always useful on measuring query performance and 
status.
Counters that everyone are interested.
* High level counters - generally interesting counters. Most of the users will 
be
interested and all the developers are very interested.
* Medium level counters - somehow interesting counters to monitor. It will 
probably be
interesting under some circumstance. Lot of developers are interested.
* Low level counters - not interesting to users. Should be useful for developers
to debug only.

4. We have around 250 counters. This commit did the replacement in
scan-node and hdfs-scan-node-base and coordinator.

Concers:
The downside is that we will have duplicate comments of query profiles
both in the header file and the .cc file.
Additionally a (arguably good) limitation is that profile counter names
need to be unique.

Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
---
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator.cc
D be/src/util/debug-counters.h
M be/src/util/default-path-handlers.cc
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
A www/profile_docs.tmpl
M www/query_profile.tmpl
15 files changed, 616 insertions(+), 273 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/14776/5
--
To view, visit http://gerrit.cloudera.org:8080/14776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
Gerrit-Change-Number: 14776
Gerrit-PatchSet: 5
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-12-04 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14776 )

Change subject: IMPALA-7550: Add documentation to profile counters
..


Patch Set 4:

(1 comment)

Thanks for the advice!

http://gerrit.cloudera.org:8080/#/c/14776/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14776/3//COMMIT_MSG@10
PS3, Line 10: generating counters from a counter registry. All the profile 
counters
> Sorry I wrote comment on wrong review...
Thanks for the suggestions. I edit the commit message. Hopefully it will be 
more clear to you... I will keep editing it in the following commits.



--
To view, visit http://gerrit.cloudera.org:8080/14776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
Gerrit-Change-Number: 14776
Gerrit-PatchSet: 4
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 04 Dec 2019 22:58:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-12-04 Thread Jiawei Wang (Code Review)
Hello Andrew Sherman, Lars Volker, David Rorke, Balazs Jeszenszky, Jiawei Wang, 
Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14776

to look at the new patch set (#4).

Change subject: IMPALA-7550: Add documentation to profile counters
..

IMPALA-7550: Add documentation to profile counters

This change changes the way developers define profile counters by
generating counters from a counter registry. All the profile counters
will be register there first and then used in the same way as before.
By doing so, we will be able to manage profile counters in a way that
we can define profile counters documentation. For example:

Declaration:
PROFILE_DEFINE_COUNTER(NumBackends, STABLE_HIGH, TUnit::UNIT,
"Number of backends running this query.");

Initialization:
COUNTER_SET(PROFILE_NumBackends.Instantiate(query_profile_), num_backends);

This shall be how we define a NumBackends counter. It follows with
its significance, type, description in the declaration part.

Users now will be able to view profile counters documentation under
query_profile page, there is a Profile Documentation button which
leads to /profile_docs.

More details:
This commit did the following refactors on profile counters.
1. Add a singleton registry for runtime profile counters prototypes,
similiar to what Kudu does for metrics. This allows us to generate
profile documentation for all counters from the code. We add
/profile_docs and a correspoding UI for the documentation of profile
counters.

2. Profile counters are annotated with their stability:
* Stable counters - generally useful to understand query performance,
should only change rarely and if it does we'll make some effort to
notify users. E.g. BytesRead.
* Unstable but useful - useful to understand query performance, but
subject to change, particularly if the implementation changes. E.g.
RowBatchQueuePutWaitTime, MaterializeTupleTimer
* Debugging counters - generally not useful to users of Impala, the main
use case is low-level debugging. Can be hidden to reduce noise for most
consumers of profiles.

3. Profile counters are also annotated with their significance to users.
* Critical level counters - always useful on measuring query performance and 
status.
Counters that everyone are interested.
* High level counters - generally interesting counters. Most of the users will 
be
interested and all the developers are very interested.
* Medium level counters - somehow interesting counters to monitor. It will 
probably be
interesting under some circumstance. Lot of developers are interested.
* Low level counters - not interesting to users. Should be useful for developers
to debug only.

4. We have around 250 counters. This commit did the replacement in
scan-node and hdfs-scan-node-base and coordinator.

Concers:
The downside is that we will have duplicate comments of query profiles
both in the header file and the .cc file.
Additionally a (arguably good) limitation is that profile counter names
need to be unique.

Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
---
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator.cc
D be/src/util/debug-counters.h
M be/src/util/default-path-handlers.cc
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
A www/profile_docs.tmpl
M www/query_profile.tmpl
15 files changed, 619 insertions(+), 273 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/14776/4
--
To view, visit http://gerrit.cloudera.org:8080/14776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
Gerrit-Change-Number: 14776
Gerrit-PatchSet: 4
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-12-04 Thread Jiawei Wang (Code Review)
Jiawei Wang has abandoned this change. ( http://gerrit.cloudera.org:8080/14837 )

Change subject: IMPALA-7550: Add documentation to profile counters
..


Abandoned
--
To view, visit http://gerrit.cloudera.org:8080/14837
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I72d2380eff95b8d982ce23b9f0252810b495f355
Gerrit-Change-Number: 14837
Gerrit-PatchSet: 1
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 


[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-12-04 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14776 )

Change subject: IMPALA-7550: Add documentation to profile counters
..


Patch Set 2:

(10 comments)

Hi Tim, thanks for the advice.

I recently updated my Github Username and that lead the Gerrit not able to 
recognize my email and I dont find a way to correct that... So I have to create 
a new PR using another email.

https://gerrit.cloudera.org/#/c/14837/

The code is here... Sorry for the inconvenience.

http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/exec/hbase-scan-node.cc
File be/src/exec/hbase-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/exec/hbase-scan-node.cc@39
PS2, Line 39: PROFILE_DEFINE_TIMER(TotalRawHBaseReadTime, DEBUG,
> I think this could be a stable counter (I can see either low or high level)
Done


http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/exec/scan-node.h
File be/src/exec/scan-node.h:

http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/exec/scan-node.h@35
PS2, Line 35: /// Includes ScanNode common counters:
> I think it's probably useful to keep these comments around for the time bei
Done.
I am adding all the comments back.


http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/exec/scan-node.cc
File be/src/exec/scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/exec/scan-node.cc@56
PS2, Line 56: STABLE_LOW
> I think this should be STABLE_HIGH, I think the number of bytes read is gen
Done


http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/exec/scan-node.cc@72
PS2, Line 72: PROFILE_DEFINE_COUNTER(NumScannerThreadsStarted, STABLE_LOW, 
TUnit::UNIT,
> I would mark this as UNSTABLE or DEBUG - it's a pretty useless counter in p
Done


http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/exec/scan-node.cc@92
PS2, Line 92: DEBUG
> STABLE_LOW. This and PeakScannerThreadConcurrency are pretty useful for und
Done


http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/exec/scan-node.cc@95
PS2, Line 95: DEBUG
> STABLE_LOW
Done


http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/runtime/coordinator-backend-state.cc@61
PS2, Line 61: // TODO: the counters are inside the impala namespace, so we need 
to declare them there,
> Need to remove TODO before merging. I think it makes sense just to switch f
Sure, I will remove it. I was just thinking leave it so that the following up 
tasks can be done in the same way.


http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/runtime/coordinator.cc@80
PS2, Line 80: STABLE_HIGH
> Probably STABLE_LOW, this is a little tricky to interpret
Done


http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/util/default-path-handlers.cc
File be/src/util/default-path-handlers.cc:

http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/util/default-path-handlers.cc@130
PS2, Line 130: // name, description, unit
> Comment needs updating?
Done


http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/util/runtime-profile.cc@68
PS2, Line 68:   boost::lock_guard l(lock_);
> nit: don't need to use boost:: and std:: prefixes for lock_guard and vector
Done



--
To view, visit http://gerrit.cloudera.org:8080/14776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
Gerrit-Change-Number: 14776
Gerrit-PatchSet: 2
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 04 Dec 2019 22:22:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-12-04 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14837


Change subject: IMPALA-7550: Add documentation to profile counters
..

IMPALA-7550: Add documentation to profile counters

This is the work based on Lars' experiment on
https://gerrit.cloudera.org/#/c/12116/

This commit did the following refactors on profile counters.
1. Add a singleton registry for runtime profile counters prototypes,
similiar to what Kudu does for metrics. This allows us to generate
profile documentation for all counters from the code. We add
/profile_docs and a correspoding UI for the documentation of profile
counters.

2. Profile counters are annotated with their stability:
* Stable_HIGH counters - High level and stable counters. Always useful
on measuring query performance and status. Counters that everyone is
interested. Should rarely change and if it does we will make some
effort to notify users.

* Stable_LOW - Low level and sable counters - interesting counters to
monitor and analyze by machine. It will probably be interesting
under some circumstance for users. Lots of developers are interested.

* Unstable but useful - useful to understand query performance, but
subject to change, particularly if the implementation changes. E.g.
RowBatchQueuePutWaitTime, MaterializeTupleTimer

* Debugging counters - generally not useful to users of Impala, the main
use case is low-level debugging. Can be hidden to reduce noise for most
consumers of profiles.

3. We have around 250 counters. This commit did the replacement in
scan-node and hdfs-scan-node-base and coordinator.

The downside is that we'd reduce the comments that currently explain
some of the counters in header files by moving them to the .cc files.
Additionally a (arguably good) limitation is that profile counter names
need to be unique.

Change-Id: I72d2380eff95b8d982ce23b9f0252810b495f355
---
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator.cc
D be/src/util/debug-counters.h
M be/src/util/default-path-handlers.cc
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
A www/profile_docs.tmpl
M www/query_profile.tmpl
15 files changed, 619 insertions(+), 273 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/14837/1
--
To view, visit http://gerrit.cloudera.org:8080/14837
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I72d2380eff95b8d982ce23b9f0252810b495f355
Gerrit-Change-Number: 14837
Gerrit-PatchSet: 1
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Jiawei Wang 


[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-11-27 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14776 )

Change subject: IMPALA-7550: Add documentation to profile counters
..


Patch Set 2:

(9 comments)

Hi Tim,

Thanks for the advice. The second round combined stability with significance. 
Now counters have 4 types which are STABLE_HIGH, STABLE_LOW, UNSTABLE, DEBUG

Also, the second round I replaced all counters in scan-node.cc

One thing I am not sure is that whether we should remove the previous comments 
on all the counters?

Please let me know if you have other thoughts.

http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.h@525
PS1, Line 525:   /// Disk accessed bitmap
> These are weird cause they're not actually included in the profile directly
Done


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.cc@127
PS1, Line 127: 
PROFILE_DEFINE_SUMMARY_STATS_COUNTER(ParquetUncompressedBytesReadPerColumn, 
STABLE_LOW,
> I'm OK with the redundant text, but it might be reasonable to use a templat
hmm, I will just leave it here then because we only have two counters and the 
contents are subject to change.


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/scan-node.cc
File be/src/exec/scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/scan-node.cc@82
PS1, Line 82: PROFILE_DEFINE_TIMER(RowBatchQueueGetWaitTime, UNSTABLE, "Wall 
clock time that the "
> We could probably define an enum or something like that. I'd be ok with jus
I will just leave it in the description.


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/scan-node.cc@95
PS1, Line 95: 
PROFILE_DEFINE_HIGH_WATER_MARK_COUNTER(PeakScannerThreadConcurrency, DEBUG,
> I think doing this makes sense - that would force all counters to be define
Done


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/runtime/coordinator-backend-state.cc@60
PS1, Line 60:
> This approach of converting the namespaces where needed seems fine to me. T
Okay, will adopt the second approach whenever needed.


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h
File be/src/util/runtime-profile-counters.h:

http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h@154
PS1, Line 154: // Debugging counters - generally not useful to users of 
Impala, the main use case is
> nit: formatting is weird, I'd expect the enum values to only be indented tw
Done


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h@164
PS1, Line 164:   const char* name() const { return name_; }
> I think we should probably combine them, because they're pretty correlated
Okay, I will split stable into high_stable and low_stable and keep unstable and 
debug the same.


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h@189
PS1, Line 189: /// prototypes will register with the singleton instance of this 
class. Then, this class
> Are you planning to do this?
I dont think so, unless you feel like we will need this. Otherwise I will just 
delete this


http://gerrit.cloudera.org:8080/#/c/14776/1/www/profile_docs.tmpl
File www/profile_docs.tmpl:

http://gerrit.cloudera.org:8080/#/c/14776/1/www/profile_docs.tmpl@1
PS1, Line 1: 

[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-11-27 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/14776 )

Change subject: IMPALA-7550: Add documentation to profile counters
..

IMPALA-7550: Add documentation to profile counters

This is the work based on Lars' experiment on
https://gerrit.cloudera.org/#/c/12116/

This commit did the following refactors on profile counters.
1. Add a singleton registry for runtime profile counters prototypes,
similiar to what Kudu does for metrics. This allows us to generate
profile documentation for all counters from the code. We add
/profile_docs and a correspoding UI for the documentation of profile
counters.

2. Profile counters are annotated with their stability:
* Stable_HIGH counters - High level and stable counters. Always useful
on measuring query performance and status. Counters that everyone is
interested. Should rarely change and if it does we will make some
effort to notify users.

* Stable_LOW - Low level and sable counters - interesting counters to
monitor and analyze by machine. It will probably be interesting
under some circumstance for users. Lots of developers are interested.

* Unstable but useful - useful to understand query performance, but
subject to change, particularly if the implementation changes. E.g.
RowBatchQueuePutWaitTime, MaterializeTupleTimer

* Debugging counters - generally not useful to users of Impala, the main
use case is low-level debugging. Can be hidden to reduce noise for most
consumers of profiles.

3. We have around 250 counters. This commit did the replacement in
scan-node and hdfs-scan-node-base and coordinator.

This is still a WIP work and all the advices are welcomed!

The downside is that we'd reduce the comments that currently explain
some of the counters in header files by moving them to the .cc files.
Additionally a (arguably good) limitation is that profile counter names
need to be unique.

Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
---
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator.cc
D be/src/util/debug-counters.h
M be/src/util/default-path-handlers.cc
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
A www/profile_docs.tmpl
M www/query_profile.tmpl
15 files changed, 620 insertions(+), 292 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/14776/2
--
To view, visit http://gerrit.cloudera.org:8080/14776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
Gerrit-Change-Number: 14776
Gerrit-PatchSet: 2
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable

2019-11-21 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14611 )

Change subject: IMPALA-9110: Add table loading time break-down metrics for 
HdfsTable
..


Patch Set 9:

rebase the commit to solve merge conflict.


--
To view, visit http://gerrit.cloudera.org:8080/14611
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
Gerrit-Change-Number: 14611
Gerrit-PatchSet: 9
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 22 Nov 2019 05:37:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable

2019-11-21 Thread Jiawei Wang (Code Review)
Hello Quanlong Huang, Yongzhi Chen, Vihang Karajgaonkar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14611

to look at the new patch set (#9).

Change subject: IMPALA-9110: Add table loading time break-down metrics for 
HdfsTable
..

IMPALA-9110: Add table loading time break-down metrics for HdfsTable

A. Problem:
Catalog table loading currently only records the total loading
time. We will need some break-down times, i.e. more detailed
time recording on each loading function. Also, the table schema
loading is not taken into account for load-duration. We will need
to add some more metrics for that.

B. Solution:
- We added "hms-load-tbl-schema", "load-duration.all-column-stats",
"load-duration.all-partitions.total-time",
"load-duration.all-partitions.file-metadata".
Also, we logged the loadValidWriteIdList() time. So now we have
a more detailed breakdown time for table loading info.

The table loading time metrics for HDFS tables are in the following hierarchy:
- Table Schema Loading
- Table Metadata Loading - total time
- all column stats loading time
- ValidWriteIds loading time
- all partitions loading time - total time
- file metadata loading time
- storage-metadata-loading-time(standalone metric)

1. Table Schema Loading:
* Meaning: The time for HMS to fetch table object and the real schema loading 
time.
Normally, the code path is "msClient.getHiveClient().getTable(dbName, tblName)"
* Metric : hms-load-tbl-schema

2. Table Metadata Loading -- total time
* Meaning: The time to load all the table metadata.
The code path is load() function in HdfsTable.load() function.
* Metric: load-duration.total-time

2.1 Table Metadata Loading -- all column stats
* Meaning: load all column stats, this is part of table metadata loading
The code path is HdfsTable.loadAllColumnStats()
* Metric: load-duration.all-column-stats

2.2 Table Metadata Loading -- loadValidWriteIdList
* Meaning: fetch ValidWriteIds from HMS
The code path is HdfsTable.loadValidWriteIdList()
* Metric: no metric recorded for this one. Instead, a debug log is
generated.

2.3 Table Metadata Loading -- storage metadata loading(standalone metric)
* Meaning: Storage related to file system operations during metadata
loading.(The amount of time spent loading metadata from the underlying storage 
layer.)
* Metric: we rename it to load-duration.storage-metadata. This is a metric 
introduced by
IMPALA-7322

2.4 Table Metadata Loading -- load all partitions
* Meaning: Load all partitions time, including fetching all partitions
from HMS and loading all partitions. The code path is
MetaStoreUtil.fetchAllPartitions() and HdfsTable.loadAllPartitions()
* Metric: load-duration.all-partitions

2.4.1 Table Metadata Loading -- load all partitions -- load file metadata
* Meaning: The file metadata loading for all all partitions. (This is
part of 2.4). Code path: loadFileMetadataForPartitions() inside
loadAllPartitions()
* Metric: load-duration.all-partitions.file-metadata

C. Extra thing in this commit:
1. Add PrintUtils.printTimeNs for PrettyPrint time in FrontEnd
2. Add explanation for table loading manager

D. Test:
1. Add Unit tests for PrintUtils.printTime() function
2. Manual describe table and verify the table loading metrics are
correct.

Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/common/PrintUtils.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/util/PrintUtilsTest.java
10 files changed, 186 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14611/9
--
To view, visit http://gerrit.cloudera.org:8080/14611
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
Gerrit-Change-Number: 14611
Gerrit-PatchSet: 9
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-11-21 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14776 )

Change subject: IMPALA-7550: Add documentation to profile counters
..


Patch Set 1:

(6 comments)

Hi Tim, Lars, other Impala team members

This is the first edition of impala profile counters that I had based on Lars' 
previous work on https://gerrit.cloudera.org/#/c/12116/
Thanks so much for Lars building the basis.

I had several comments on the commit itself. Any suggestions on how to improve 
it is highly welcomed!

Thanks so much!

http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.h@525
PS1, Line 525:   /// Disk accessed bitmap
I am not too sure about the following two counters. Wondering if we should just 
leave it like it is.


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.cc@127
PS1, Line 127: 
PROFILE_DEFINE_SUMMARY_STATS_COUNTER(ParquetUncompressedBytesReadPerColumn, 
STABLE,
This seems redundant, any advice to modify it will be appreciated.


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.cc@139
PS1, Line 139: PROFILE_DEFINE_COUNTER(DataCacheHitCount, STABLE, HIGH, 
TUnit::UNIT,
Did not find the explanation for the following data cache counters. Might need 
some other better explanation.


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/scan-node.cc
File be/src/exec/scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/scan-node.cc@82
PS1, Line 82: PROFILE_DEFINE_TIMER(RowBatchQueueGetWaitTime, UNSTABLE, LOW, 
"Wall clock time that the "
Hi, Tim
I believe these are the Timer you mentioned in the threads. Any suggestions on 
how should we handle it other than put it in the descriptions?


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h
File be/src/util/runtime-profile-counters.h:

http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h@164
PS1, Line 164:   enum class Significance {
I add this Significance to measure if the counter is important to a user. Maybe 
we can also combine this with stability or redefine the stability? I am not too 
sure about this.


http://gerrit.cloudera.org:8080/#/c/14776/1/www/profile_docs.tmpl
File www/profile_docs.tmpl:

http://gerrit.cloudera.org:8080/#/c/14776/1/www/profile_docs.tmpl@1
PS1, Line 1: 

[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters

2019-11-21 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14776


Change subject: IMPALA-7550: Add documentation to profile counters
..

IMPALA-7550: Add documentation to profile counters

This is the work based on Lars' experiment on
https://gerrit.cloudera.org/#/c/12116/

This commit did the following refactors on profile counters.
1. Add a singleton registry for runtime profile counters prototypes,
similiar to what Kudu does for metrics. This allows us to generate
profile documentation for all counters from the code. We add
/profile_docs and a correspoding UI for the documentation of profile
counters.

2. Profile counters are annotated with their stability:
* Stable counters - generally useful to understand query performance,
should only change rarely and if it does we'll make some effort to
notify users. E.g. BytesRead.
* Unstable but useful - useful to understand query performance, but
subject to change, particularly if the implementation changes. E.g.
RowBatchQueuePutWaitTime, MaterializeTupleTimer
* Debugging counters - generally not useful to users of Impala, the main
use case is low-level debugging. Can be hidden to reduce noise for most
consumers of profiles.

3. Profile counters are also annotated with their significance to users.
* Critical level counters - always useful on measuring query performance and 
status.
Counters that everyone are interested.
* High level counters - generally interesting counters. Most of the users will 
be
interested and all the developers are very interested.
* Medium level counters - somehow interesting counters to monitor. It will 
probably be
interesting under some circumstance. Lot of developers are interested.
* Low level counters - not interesting to users. Should be useful for developers
to debug only.

4. We have around 250 counters. This commit did the replacement in
scan-node and hdfs-scan-node-base and coordinator.

This is still a WIP work and all the advices are welcomed!

The downside is that we'd reduce the comments that currently explain
some of the counters in header files by moving them to the .cc files.
Additionally a (arguably good) limitation is that profile counter names
need to be unique.

Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
D be/src/util/debug-counters.h
M be/src/util/default-path-handlers.cc
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
A www/profile_docs.tmpl
12 files changed, 671 insertions(+), 296 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/14776/1
--
To view, visit http://gerrit.cloudera.org:8080/14776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
Gerrit-Change-Number: 14776
Gerrit-PatchSet: 1
Gerrit-Owner: Jiawei Wang 


[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable

2019-11-21 Thread Jiawei Wang (Code Review)
Hello Quanlong Huang, Yongzhi Chen, Vihang Karajgaonkar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14611

to look at the new patch set (#7).

Change subject: IMPALA-9110: Add table loading time break-down metrics for 
HdfsTable
..

IMPALA-9110: Add table loading time break-down metrics for HdfsTable

A. Problem:
Catalog table loading currently only records the total loading
time. We will need some break-down times, i.e. more detailed
time recording on each loading function. Also, the table schema
loading is not taken into account for load-duration. We will need
to add some more metrics for that.

B. Solution:
- We added "hms-load-tbl-schema", "load-duration.all-column-stats",
"load-duration.all-partitions.total-time",
"load-duration.all-partitions.file-metadata".
Also, we logged the loadValidWriteIdList() time. So now we have
a more detailed breakdown time for table loading info.

The table loading time metrics for HDFS tables are in the following hierarchy:
- Table Schema Loading
- Table Metadata Loading - total time
- all column stats loading time
- ValidWriteIds loading time
- all partitions loading time - total time
- file metadata loading time
- storage-metadata-loading-time(standalone metric)

1. Table Schema Loading:
* Meaning: The time for HMS to fetch table object and the real schema loading 
time.
Normally, the code path is "msClient.getHiveClient().getTable(dbName, tblName)"
* Metric : hms-load-tbl-schema

2. Table Metadata Loading -- total time
* Meaning: The time to load all the table metadata.
The code path is load() function in HdfsTable.load() function.
* Metric: load-duration.total-time

2.1 Table Metadata Loading -- all column stats
* Meaning: load all column stats, this is part of table metadata loading
The code path is HdfsTable.loadAllColumnStats()
* Metric: load-duration.all-column-stats

2.2 Table Metadata Loading -- loadValidWriteIdList
* Meaning: fetch ValidWriteIds from HMS
The code path is HdfsTable.loadValidWriteIdList()
* Metric: no metric recorded for this one. Instead, a debug log is
generated.

2.3 Table Metadata Loading -- storage metadata loading(standalone metric)
* Meaning: Storage related to file system operations during metadata
loading.(The amount of time spent loading metadata from the underlying storage 
layer.)
* Metric: we rename it to load-duration.storage-metadata. This is a metric 
introduced by
IMPALA-7322

2.4 Table Metadata Loading -- load all partitions
* Meaning: Load all partitions time, including fetching all partitions
from HMS and loading all partitions. The code path is
MetaStoreUtil.fetchAllPartitions() and HdfsTable.loadAllPartitions()
* Metric: load-duration.all-partitions

2.4.1 Table Metadata Loading -- load all partitions -- load file metadata
* Meaning: The file metadata loading for all all partitions. (This is
part of 2.4). Code path: loadFileMetadataForPartitions() inside
loadAllPartitions()
* Metric: load-duration.all-partitions.file-metadata

C. Extra thing in this commit:
1. Add PrintUtils.printTimeNs for PrettyPrint time in FrontEnd
2. Add explanation for table loading manager

D. Test:
1. Add Unit tests for PrintUtils.printTime() function
2. Manual describe table and verify the table loading metrics are
correct.

Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/common/PrintUtils.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/util/PrintUtilsTest.java
10 files changed, 186 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14611/7
--
To view, visit http://gerrit.cloudera.org:8080/14611
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
Gerrit-Change-Number: 14611
Gerrit-PatchSet: 7
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable

2019-11-21 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14611 )

Change subject: IMPALA-9110: Add table loading time break-down metrics for 
HdfsTable
..


Patch Set 7:

(3 comments)

Thanks for the review. Let me know if we need add anything else in this commit.

http://gerrit.cloudera.org:8080/#/c/14611/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/14611/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@163
PS6, Line 163: load-duration.all-partitions";
> I understand that the naming is used to represent nesting of the metrics. S
Done


http://gerrit.cloudera.org:8080/#/c/14611/6/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/14611/6/fe/src/main/java/org/apache/impala/catalog/Table.java@149
PS6, Line 149: load-duration";
> Since this is the top level load-duration adding a specific --total-time is
Done


http://gerrit.cloudera.org:8080/#/c/14611/6/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
File fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java:

http://gerrit.cloudera.org:8080/#/c/14611/6/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java@262
PS6, Line 262: There is a discussion here: https://issues.apache.org/jira/brows
> May be just say (See discussion in ..) :)
Done



--
To view, visit http://gerrit.cloudera.org:8080/14611
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
Gerrit-Change-Number: 14611
Gerrit-PatchSet: 7
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 21 Nov 2019 20:28:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable

2019-11-19 Thread Jiawei Wang (Code Review)
Hello Quanlong Huang, Yongzhi Chen, Vihang Karajgaonkar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14611

to look at the new patch set (#6).

Change subject: IMPALA-9110: Add table loading time break-down metrics for 
HdfsTable
..

IMPALA-9110: Add table loading time break-down metrics for HdfsTable

A. Problem:
Catalog table loading currently only records the total loading
time. We will need some break-down times, i.e. more detailed
time recording on each loading function. Also, the table schema
loading is not taken into account for load-duration. We will need
to add some more metrics for that.

B. Solution:
- We added "hms-load-tbl-schema", "load-duration.all-column-stats",
"load-duration.all-partitions.total-time",
"load-duration.all-partitions.file-metadata".
Also, we logged the loadValidWriteIdList() time. So now we have
a more detailed breakdown time for table loading info.

The table loading time metrics for HDFS tables are in the following hierarchy:
- Table Schema Loading
- Table Metadata Loading - total time
- all column stats loading time
- ValidWriteIds loading time
- all partitions loading time - total time
- file metadata loading time
- storage-metadata-loading-time(standalone metric)

1. Table Schema Loading:
* Meaning: The time for HMS to fetch table object and the real schema loading 
time.
Normally, the code path is "msClient.getHiveClient().getTable(dbName, tblName)"
* Metric : hms-load-tbl-schema

2. Table Metadata Loading -- total time
* Meaning: The time to load all the table metadata.
The code path is load() function in HdfsTable.load() function.
* Metric: load-duration.total-time

2.1 Table Metadata Loading -- all column stats
* Meaning: load all column stats, this is part of table metadata loading
The code path is HdfsTable.loadAllColumnStats()
* Metric: load-duration.all-column-stats

2.2 Table Metadata Loading -- loadValidWriteIdList
* Meaning: fetch ValidWriteIds from HMS
The code path is HdfsTable.loadValidWriteIdList()
* Metric: no metric recorded for this one. Instead, a debug log is
generated.

2.3 Table Metadata Loading -- storage metadata loading(standalone metric)
* Meaning: Storage related to file system operations during metadata
loading.(The amount of time spent loading metadata from the underlying storage 
layer.)
* Metric: we rename it to load-duration.storage-metadata. This is a metric 
introduced by
IMPALA-7322

2.4 Table Metadata Loading -- load all partitions
* Meaning: Load all partitions time, including fetching all partitions
from HMS and loading all partitions. The code path is
MetaStoreUtil.fetchAllPartitions() and HdfsTable.loadAllPartitions()
* Metric: load-duration.all-partitions

2.4.1 Table Metadata Loading -- load all partitions -- load file metadata
* Meaning: The file metadata loading for all all partitions. (This is
part of 2.4). Code path: loadFileMetadataForPartitions() inside
loadAllPartitions()
* Metric: load-duration.all-partitions.file-metadata

C. Extra thing in this commit:
1. Add PrintUtils.printTimeNs for PrettyPrint time in FrontEnd
2. Add explanation for table loading manager

D. Test:
1. Add Unit tests for PrintUtils.printTime() function
2. Manual describe table and verify the table loading metrics are
correct.

Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/common/PrintUtils.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/util/PrintUtilsTest.java
10 files changed, 188 insertions(+), 36 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14611/6
--
To view, visit http://gerrit.cloudera.org:8080/14611
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
Gerrit-Change-Number: 14611
Gerrit-PatchSet: 6
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable

2019-11-10 Thread Jiawei Wang (Code Review)
Hello Quanlong Huang, Yongzhi Chen, Vihang Karajgaonkar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14611

to look at the new patch set (#5).

Change subject: IMPALA-9110: Add table loading time break-down metrics for 
HdfsTable
..

IMPALA-9110: Add table loading time break-down metrics for HdfsTable

Problem:
Catalog table loading currently only records the total loading
time. We will need some break-down times, i.e. more detailed
time recording on each loading function. Also, the table schema
loading is not taken into account for load-duration. We will need
to add some more metrics for that.

Solution:
- We added "hms-load-tbl-schema", "load-duration-all-column-stats",
"all-partitions-load-duration",
"file-metadata-all-partitions-load-duration".
Also, we logged the loadValidWriteIdList() time. So now we have
a more detailed breakdown time for table loading info.

The table loading time metrics for HDFS tables are in the following hierarchy:
- Table Schema Loading
- Table Metadata Loading - total time
- all column stats loading time
- ValidWriteIds loading time
- all partitions loading time - total time
- file metadata loading time
- storage-metadata-loading-time(standalone metric)

1. Table Schema Loading:
* Meaning: The time for HMS to fetch table object and the real schema loading 
time.
Normally, the code path is "msClient.getHiveClient().getTable(dbName, tblName)"
* Metric : hms-load-tbl-schema

2. Table Metadata Loading -- total time
* Meaning: The time to load all the table metadata.
The code path is load() function in HdfsTable.load() function.
* Metric: load-duration

2.1 Table Metadata Loading -- all column stats
* Meaning: load all column stats, this is part of table metadata loading
The code path is HdfsTable.loadAllColumnStats()
* Metric: all-column-stats-load-duration

2.2 Table Metadata Loading -- loadValidWriteIdList
* Meaning: fetch ValidWriteIds from HMS
The code path is HdfsTable.loadValidWriteIdList()
* Metric: no metric recorded for this one. Instead, a debug log is
generated.

2.3 Table Metadata Loading -- storage metadata loading(standalone metric)
* Meaning: Storage related to file system operations during metadata
loading.(The amount of time spent loading metadata from the underlying storage 
layer.)
* Metric: storage-metadata-load-duration. This is a metric introduced by
IMPALA-7322

2.4 Table Metadata Loading -- load all partitions
* Meaning: Load all partitions time, including fetching all partitions
from HMS and loading all partitions. The code path is
MetaStoreUtil.fetchAllPartitions() and HdfsTable.loadAllPartitions()
* Metric: all-partitions-load-duration

2.4.1 Table Metadata Loading -- load all partitions -- load file metadata
* Meaning: The file metadata loading for all all partitions. (This is
part of 2.4). Code path: loadFileMetadataForPartitions() inside
loadAllPartitions()
* Metric: file-metadata-all-partitions-load-duration

- Extra function: Add PrintUtils.printTimeNs for PrettyPrint time in FrontEnd

Test:
1. Will add metrics test after the few rounds of code review.
2. Add Unit tests for PrintUtils.printTime() function

Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/common/PrintUtils.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/util/PrintUtilsTest.java
8 files changed, 153 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14611/5
--
To view, visit http://gerrit.cloudera.org:8080/14611
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
Gerrit-Change-Number: 14611
Gerrit-PatchSet: 5
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable

2019-11-10 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/14611 )

Change subject: IMPALA-9110: Add table loading time break-down metrics for 
HdfsTable
..

IMPALA-9110: Add table loading time break-down metrics for HdfsTable

Problem:
Catalog table loading currently only records the total loading
time. We will need some break-down times, i.e. more detailed
time recording on each loading function. Also, the table schema
loading is not taken into account for load-duration. We will need
to add some more metrics for that.

Solution:
- We added "hms-load-tbl-schema", "load-duration-all-column-stats",
"all-partitions-load-duration",
"file-metadata-all-partitions-load-duration".
Also, we logged the loadValidWriteIdList() time. So now we have
a more detailed breakdown time for table loading info.

The table loading time metrics for HDFS tables are in the following hierarchy:
- Table Schema Loading
- Table Metadata Loading - total time
- all column stats loading time
- ValidWriteIds loading time
- all partitions loading time - total time
- file metadata loading time
- storage-metadata-loading-time(standalone metric)

1. Table Schema Loading:
* Meaning: The time for HMS to fetch table object and the real schema loading 
time.
Normally, the code path is "msClient.getHiveClient().getTable(dbName, tblName)"
* Metric : hms-load-tbl-schema

2. Table Metadata Loading -- total time
* Meaning: The time to load all the table metadata.
The code path is load() function in HdfsTable.load() function.
* Metric: load-duration

2.1 Table Metadata Loading -- all column stats
* Meaning: load all column stats, this is part of table metadata loading
The code path is HdfsTable.loadAllColumnStats()
* Metric: all-column-stats-load-duration

2.2 Table Metadata Loading -- loadValidWriteIdList
* Meaning: fetch ValidWriteIds from HMS
The code path is HdfsTable.loadValidWriteIdList()
* Metric: no metric recorded for this one. Instead, a debug log is
generated.

2.3 Table Metadata Loading -- storage metadata loading(standalone metric)
* Meaning: Storage related to file system operations during metadata
loading.(The amount of time spent loading metadata from the underlying storage 
layer.)
* Metric: storage-metadata-load-duration. This is a metric introduced by
IMPALA-7322

2.4 Table Metadata Loading -- load all partitions
* Meaning: Load all partitions time, including fetching all partitions
from HMS and loading all partitions. The code path is
MetaStoreUtil.fetchAllPartitions() and HdfsTable.loadAllPartitions()
* Metric: all-partitions-load-duration

2.4.1 Table Metadata Loading -- load all partitions -- load file metadata
* Meaning: The file metadata loading for all all partitions. (This is
part of 2.4). Code path: loadFileMetadataForPartitions() inside
loadAllPartitions()
* Metric: file-metadata-all-partitions-load-duration

- Extra function: Add PrintUtils.printTimeNs for PrettyPrint time in FrontEnd

Test:
1. Will add metrics test after the few rounds of code review.
2. Add Unit tests for PrintUtils.printTime() function

Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/common/PrintUtils.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/util/PrintUtilsTest.java
8 files changed, 156 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14611/4
--
To view, visit http://gerrit.cloudera.org:8080/14611
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
Gerrit-Change-Number: 14611
Gerrit-PatchSet: 4
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable

2019-11-10 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/14611 )

Change subject: IMPALA-9110: Add table loading time break-down metrics for 
HdfsTable
..

IMPALA-9110: Add table loading time break-down metrics for HdfsTable

Problem:
Catalog table loading currently only records the total loading
time. We will need some break-down times, i.e. more detailed
time recording on each loading function. Also, the table schema
loading is not taken into account for load-duration. We will need
to add some more metrics for that.

Solution:
- We added "hms-load-tbl-schema", "load-duration-all-column-stats",
"all-partitions-load-duration",
"file-metadata-all-partitions-load-duration".
Also, we logged the loadValidWriteIdList() time. So now we have
a more detailed breakdown time for table loading info.

The table loading time metrics for HDFS tables are in the following hierarchy:
- Table Schema Loading
- Table Metadata Loading - total time
- all column stats loading time
- ValidWriteIds loading time
- all partitions loading time - total time
- file metadata loading time
- storage-metadata-loading-time(standalone metric)

1. Table Schema Loading:
* Meaning: The time for HMS to fetch table object and the real schema loading 
time.
Normally, the code path is "msClient.getHiveClient().getTable(dbName, tblName)"
* Metric : hms-load-tbl-schema

2. Table Metadata Loading -- total time
* Meaning: The time to load all the table metadata.
The code path is load() function in HdfsTable.load() function.
* Metric: load-duration

2.1 Table Metadata Loading -- all column stats
* Meaning: load all column stats, this is part of table metadata loading
The code path is HdfsTable.loadAllColumnStats()
* Metric: all-column-stats-load-duration

2.2 Table Metadata Loading -- loadValidWriteIdList
* Meaning: fetch ValidWriteIds from HMS
The code path is HdfsTable.loadValidWriteIdList()
* Metric: no metric recorded for this one. Instead, a debug log is
generated.

2.3 Table Metadata Loading -- storage metadata loading(standalone metric)
* Meaning: Storage related to file system operations during metadata
loading.(The amount of time spent loading metadata from the underlying storage 
layer.)
* Metric: storage-metadata-load-duration. This is a metric introduced by
IMPALA-7322

2.4 Table Metadata Loading -- load all partitions
* Meaning: Load all partitions time, including fetching all partitions
from HMS and loading all partitions. The code path is
MetaStoreUtil.fetchAllPartitions() and HdfsTable.loadAllPartitions()
* Metric: all-partitions-load-duration

2.4.1 Table Metadata Loading -- load all partitions -- load file metadata
* Meaning: The file metadata loading for all all partitions. (This is
part of 2.4). Code path: loadFileMetadataForPartitions() inside
loadAllPartitions()
* Metric: file-metadata-all-partitions-load-duration

- Extra function: Add PrintUtils.printTimeNs for PrettyPrint time in FrontEnd

Test:
1. Will add metrics test after the few rounds of code review.
2. Add Unit tests for PrintUtils.printTime() function

Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/common/PrintUtils.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/util/PrintUtilsTest.java
8 files changed, 156 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14611/3
--
To view, visit http://gerrit.cloudera.org:8080/14611
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
Gerrit-Change-Number: 14611
Gerrit-PatchSet: 3
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

2019-11-07 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14600 )

Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
..


Patch Set 11:

(2 comments)

Hi Vihang, thanks for the feedback.

Let me know if you need more clarification.

http://gerrit.cloudera.org:8080/#/c/14600/11/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/14600/11/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@993
PS11, Line 993: "warning threshold. Time: " + load_time_duration + 
" ns");
I implemented a PrettyPrint() function on front-end in IMPALA-9110. Will 
refactor the time printing in that ticket.


http://gerrit.cloudera.org:8080/#/c/14600/10/www/scripts/util.js
File www/scripts/util.js:

http://gerrit.cloudera.org:8080/#/c/14600/10/www/scripts/util.js@52
PS10, Line 52: // if hour is true, the time is large enough and we should
 : // ignore the remaining time on second level
 : if (!hour && value >= 1000) {
 : re += (Math.floor(value / 1000) + "s");
 :
> More comments will be helpful to understand this function better. Why do we
The output for this will be 10h. I am using the same logic as 
https://github.com/apache/impala/blob/288abf10f6142ae6cb02329604805a9a1dcc804f/be/src/util/pretty-printer.h#L249

I think if "hour" is true, then the time will be large enough and we should 
ignore the seconds? That should be the original idea.



--
To view, visit http://gerrit.cloudera.org:8080/14600
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
Gerrit-Change-Number: 14600
Gerrit-PatchSet: 11
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Xiaomeng Zhang 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 08 Nov 2019 05:58:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

2019-11-07 Thread Jiawei Wang (Code Review)
Hello Quanlong Huang, Yongzhi Chen, Vihang Karajgaonkar, Xiaomeng Zhang, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14600

to look at the new patch set (#11).

Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
..

IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

Add functions in CatalogUsageMonitor to monitor and report the
catalog usage of the tables have the longest metadata loading
time(Including maximum, median, 75th-ile, 95th-ile, 99th-ile time).
Set default tables loading metrics capacity to 100.

However, there might be a problem here because we only keep the
capacity size to 100. For example, there might be case like a
table has higher median loading time but has lower Maximum
loading time which cannot make itself to the Top-100. For now,
we will ignore case like that because we are aiming to find
the tables with maximum longest loading time.

Add the sorted table in Catalog server web-ui. The loading
time is sorted by the maximum from load_duration metrics. But
users can sort by other metrics in catalogd debug UI.

Testing:
- Add end-to-end test for webpage to verify the label and text
exist in catalog debug page. Verify all fields are in JSON response
- Launch Impala and activate some tables to see the table loading
time shown successfully on the catalog debug UI page.

Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/webserver/test_web_pages.py
M www/catalog.tmpl
M www/scripts/util.js
10 files changed, 307 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/11
--
To view, visit http://gerrit.cloudera.org:8080/14600
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
Gerrit-Change-Number: 14600
Gerrit-PatchSet: 11
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Xiaomeng Zhang 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable

2019-11-05 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14611 )

Change subject: IMPALA-9110: Add table loading time break-down metrics for 
HdfsTable
..


Patch Set 2:

(1 comment)

Also, I renamed the startTableLoadingThreads to 
startTableLoadingSubmitterThreads. The current code is kind of confusing to 
understand...

http://gerrit.cloudera.org:8080/#/c/14611/2/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
File fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java:

http://gerrit.cloudera.org:8080/#/c/14611/2/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java@244
PS2, Line 244:   //LOG.info();
nit: remove this



--
To view, visit http://gerrit.cloudera.org:8080/14611
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
Gerrit-Change-Number: 14611
Gerrit-PatchSet: 2
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 06 Nov 2019 06:04:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable

2019-11-05 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14611 )

Change subject: IMPALA-9110: Add table loading time break-down metrics for 
HdfsTable
..


Patch Set 2:

(4 comments)

Thanks for the feedback! I add the time for measuring HMS loading table schema.

For the table loading queue time, Quanlong suggest we should do the following:

"Table queue time is not related to the table itself. But just like query queue 
time, it's an important metric to understand why the loading is slow.

I think we can track it as a global metric that not bind to any tables. Showing 
p90th table loading queue time in recent time window (e.g. 1h) can be helpful."

It makes a lot of sense to me. But I am wondering how will we approve this. 
Might need some guidance on getting the queue time done.

http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@962
PS1, Line 962:
> Does this also account for the time taken for loading file metadata? If yes
I am not sure if I understand the question here.

The time taken for updatePartitionsFromHms() will also contribute to 
storageMetadataLoadTime_ (which is time spent in the source systems 
loading/reloading the fs metadata for the table)

And we have this metrics "storage-metadata-load-duration" to keep track of the 
storageMetadataLoadTime_.


http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@975
PS1, Line 975: _ = load
> I would recommend something like Time Taken:  ns. here. Also, loggin
Adding a PrettyPrint function for Time in frontend. Done.


http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1229
PS1, Line 1229:   /**
> I think this function does not invoke any HMS RPCs so the time will be very
Done


http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/Table.java@334
PS1, Line 334:   }
> I don't think there is much value in having this since we don't expect this
Done



--
To view, visit http://gerrit.cloudera.org:8080/14611
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
Gerrit-Change-Number: 14611
Gerrit-PatchSet: 2
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 06 Nov 2019 06:00:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable

2019-11-05 Thread Jiawei Wang (Code Review)
Hello Quanlong Huang, Vihang Karajgaonkar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14611

to look at the new patch set (#2).

Change subject: IMPALA-9110: Add table loading time break-down metrics for 
HdfsTable
..

IMPALA-9110: Add table loading time break-down metrics for HdfsTable

Problem:
Catalog table loading currently only records the total loading
time. We will need some break-down times, i.e. more detailed
time recording on each loading function. Also, the table schema
loading is not taken into account for load-duration. We will need
to add some more metrics for that.

Solution:
- We added "hms-load-tbl-schema", "load-duration-all-column-stats".
Also, we logged the loadValidWriteIdList() time. And we found
that "storage-metadata-load-duration" has already represent
partition and file metadata loading. So now we have a more
detailed breakdown time for table loading info.
- Add PrintUtils.printTimeNs for PrettyPrint time in FrontEnd

Test:
1. Will add metrics test after the few rounds of code review.
2. Add Unit tests for PrintUtils.printTime() function

Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/common/PrintUtils.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/util/PrintUtilsTest.java
8 files changed, 148 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14611/2
--
To view, visit http://gerrit.cloudera.org:8080/14611
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
Gerrit-Change-Number: 14611
Gerrit-PatchSet: 2
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

2019-11-05 Thread Jiawei Wang (Code Review)
Hello Quanlong Huang, Yongzhi Chen, Vihang Karajgaonkar, Xiaomeng Zhang, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14600

to look at the new patch set (#9).

Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
..

IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

Add functions in CatalogUsageMonitor to monitor and report the
catalog usage of the tables have the longest metadata loading
time(Including maximum, median, 75th-ile, 95th-ile, 99th-ile time).
Set default tables loading metrics capacity to 100.

However, there might be a problem here because we only keep the
capacity size to 100. For example, there might be case like a
table has higher median loading time but has lower Maximum
loading time which cannot make itself to the Top-100. For now,
we will ignore case like that because we are aiming to find
the tables with maximum longest loading time.

Add the sorted table in Catalog server web-ui. The loading
time is sorted by the maximum from load_duration metrics. But
users can sort by other metrics in catalogd debug UI.

Testing:
- Add end-to-end test for webpage to verify the label and text
exist in catalog debug page. Verify all fields are in JSON response
- Launch Impala and activate some tables to see the table loading
time shown successfully on the catalog debug UI page.

Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/webserver/test_web_pages.py
M www/catalog.tmpl
M www/scripts/util.js
10 files changed, 297 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/9
--
To view, visit http://gerrit.cloudera.org:8080/14600
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
Gerrit-Change-Number: 14600
Gerrit-PatchSet: 9
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Xiaomeng Zhang 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable

2019-11-05 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14611 )

Change subject: IMPALA-9110: Add table loading time break-down metrics for 
HdfsTable
..


Patch Set 1:

(1 comment)

The table loading process is kind of confusing to anyone reading it for the 
first time. Here are some good explanation from Quanlong:

" There are two kinds of table loading requests:
1) Async: load requests from invalidate metadata when 
--load_table_in_background=true. PrioritizedLoad requests from coordinators 
when missing table meta in analysis.
2) Sync: load requests from DDLs and some RPCs like getPartitionStats and 
getPartialCatalogObject.

All async requests are put in tableLoadingDeque_ and deduplicated by 
tableLoadingBarrier_. tableLoadingDeque_ is consumed by threads launched in 
startTableLoadingThreads(). These threads call 
CatalogServiceCatalog.getOrLoadTable() finally to load the table. Note that 
getOrLoadTable() is sync. It only returns when table already loaded or loading 
finish.
All sync requests will also go to CatalogServiceCatalog.getOrLoadTable(). All 
loading threads are running in tblLoadingPool_.

So threads in startTableLoadingThreads() are just submitters for background 
load requests. All table loading threads are in tblLoadingPool"

http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/Table.java@30
PS1, Line 30: import com.codahale.metrics.Timer;
> nit: looks like we import this in below section just before "import com.goo
Done



--
To view, visit http://gerrit.cloudera.org:8080/14611
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
Gerrit-Change-Number: 14611
Gerrit-PatchSet: 1
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 05 Nov 2019 18:22:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

2019-11-04 Thread Jiawei Wang (Code Review)
Hello Quanlong Huang, Yongzhi Chen, Xiaomeng Zhang, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14600

to look at the new patch set (#8).

Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
..

IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

Add functions in CatalogUsageMonitor to monitor and report the
catalog usage of the tables have the longest metadata loading
time(Including maximum, median, 75th-ile, 95th-ile, 99th-ile time).
Set default tables loading metrics capacity to 100.

However, there might be a problem here because we only keep the
capacity size to 100. For example, there might be case like a
table has higher median loading time but has lower Maximum
loading time which cannot make itself to the Top-100. For now,
we will ignore case like that because we are aiming to find
the tables with maximum longest loading time.

Add the sorted table in Catalog server web-ui. The loading
time is sorted by the maximum from load_duration metrics. But
users can sort by other metrics in catalogd debug UI.

Testing:
- Add end-to-end test for webpage to verify the label and text
exist in catalog debug page. Verify all fields are in JSON response
- Launch Impala and activate some tables to see the table loading
time shown successfully on the catalog debug UI page.

Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/webserver/test_web_pages.py
M www/catalog.tmpl
M www/scripts/util.js
10 files changed, 297 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/8
--
To view, visit http://gerrit.cloudera.org:8080/14600
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
Gerrit-Change-Number: 14600
Gerrit-PatchSet: 8
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xiaomeng Zhang 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

2019-11-04 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/14600 )

Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
..

IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

Add functions in CatalogUsageMonitor to monitor and report the
catalog usage of the tables have the longest metadata loading
time(Including maximum, median, 75th-ile, 95th-ile, 99th-ile time).
Set default tables loading metrics capacity to 100.

However, there might be a problem here because we only keep the
capacity size to 100. For example, there might be case like a
table has higher median loading time but has lower Maximum
loading time which cannot make itself to the Top-100. For now,
we will ignore case like that because we are aiming to find
the tables with maximum longest loading time.

Add the sorted table in Catalog server web-ui. The loading
time is sorted by the maximum from load_duration metrics. But
users can sort by other metrics in catalogd debug UI.

Testing:
- Add end-to-end test for webpage to verify the label and text
exist in catalog debug page. Verify all fields are in JSON response
- Launch Impala and activate some tables to see the table loading
time shown successfully on the catalog debug UI page.

Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/webserver/test_web_pages.py
M www/catalog.tmpl
M www/scripts/util.js
10 files changed, 297 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/7
--
To view, visit http://gerrit.cloudera.org:8080/14600
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
Gerrit-Change-Number: 14600
Gerrit-PatchSet: 7
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xiaomeng Zhang 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

2019-11-04 Thread Jiawei Wang (Code Review)
Hello Quanlong Huang, Yongzhi Chen, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14600

to look at the new patch set (#6).

Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
..

IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

Add functions in CatalogUsageMonitor to monitor and report the
catalog usage of the tables have the longest metadata loading
time(Including maximum, median, 75th-ile, 95th-ile, 99th-ile time).
Set default tables loading metrics capacity to 100.

However, there might be a problem here because we only keep the
capacity size to 100. For example, there might be case like a
table has higher median loading time but has lower Maximum
loading time which cannot make itself to the Top-100. For now,
we will ignore case like that because we are aiming to find
the tables with maximum longest loading time.

Add the sorted table in Catalog server web-ui. The loading
time is sorted by the maximum from load_duration metrics. But
users can sort by other metrics in catalogd debug UI.

Testing:
- Add end-to-end test for webpage to verify the label and text
exist in catalog debug page. Verify all fields are in JSON response
- Launch Impala and activate some tables to see the table loading
time shown successfully on the catalog debug UI page.

Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/webserver/test_web_pages.py
M www/catalog.tmpl
M www/scripts/util.js
10 files changed, 269 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/6
--
To view, visit http://gerrit.cloudera.org:8080/14600
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
Gerrit-Change-Number: 14600
Gerrit-PatchSet: 6
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

2019-11-04 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14600 )

Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
..


Patch Set 5:

(3 comments)

Thanks for the valid feedback!

I added max, 75, 95, 99, count in the new loading time ranking table. The 
ranking is sorted by maximum loading time by default. However, there might be a 
problem here because we only keep the capacity size to 100. For example, there 
might be case like a table has higher median loading time but has lower Maximum 
loading time which cannot make itself to the Top-100.

Users can sort them by other metrics in UI. Also, the more detailed metrics can 
be found in the metrics column. So I don't feel like we need to expose all of 
them.

http://gerrit.cloudera.org:8080/#/c/14600/4/common/thrift/JniCatalog.thrift
File common/thrift/JniCatalog.thrift:

http://gerrit.cloudera.org:8080/#/c/14600/4/common/thrift/JniCatalog.thrift@724
PS4, Line 724:   5: optional i64 median_table_loading_ns
> It'd be better to not just expose the median loading time. Since this is sh
Done


http://gerrit.cloudera.org:8080/#/c/14600/4/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/14600/4/fe/src/main/java/org/apache/impala/catalog/Table.java@203
PS4, Line 203:
> Can we add more metrics so in the future we don't need to touch this part a
Done


http://gerrit.cloudera.org:8080/#/c/14600/4/www/catalog.tmpl
File www/catalog.tmpl:

http://gerrit.cloudera.org:8080/#/c/14600/4/www/catalog.tmpl@169
PS4, Line 169: Median Loading Time
> We can show human readable time in this column, i.e. in the forms of 11m25s
Done



--
To view, visit http://gerrit.cloudera.org:8080/14600
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
Gerrit-Change-Number: 14600
Gerrit-PatchSet: 5
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 04 Nov 2019 22:42:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

2019-11-04 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/14600 )

Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
..

IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

Add functions in CatalogUsageMonitor to monitor and report the
catalog usage of the tables have the longest metadata loading
time(Including maximum, median, 75th-ile, 95th-ile, 99th-ile time).
Set default tables loading metrics capacity to 100.

Add the sorted table in Catalog server web-ui. The loading
time is sorted by the maximum from load_duration metrics. And
users can sort by other metrics in catalogd debug UI.

Testing:
- Add end-to-end test for webpage to verify the label and text
exist in catalog debug page. Verify all fields are in JSON response
- Launch Impala and activate some tables to see the table loading
time shown successfully on the catalog debug UI page.

Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/webserver/test_web_pages.py
M www/catalog.tmpl
M www/scripts/util.js
10 files changed, 266 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/5
--
To view, visit http://gerrit.cloudera.org:8080/14600
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
Gerrit-Change-Number: 14600
Gerrit-PatchSet: 5
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

2019-11-01 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14600 )

Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
..


Patch Set 4:

(1 comment)

Thanks for the feedbacks!

http://gerrit.cloudera.org:8080/#/c/14600/3/tests/webserver/test_web_pages.py
File tests/webserver/test_web_pages.py:

http://gerrit.cloudera.org:8080/#/c/14600/3/tests/webserver/test_web_pages.py@378
PS3, Line 378: loading_tables = response_json["longest_loading_tables"]
> nit: Could you change the variable name?
Done



--
To view, visit http://gerrit.cloudera.org:8080/14600
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
Gerrit-Change-Number: 14600
Gerrit-PatchSet: 4
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 01 Nov 2019 18:15:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

2019-11-01 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/14600 )

Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
..

IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

Add functions in CatalogUsageMonitor to monitor and report the
catalog usage of the tables have the longest metadata loading
time. Add the sorted table in Catalog server web-ui. The loading
time is cacualted by the median from load_duration metrics.

Testing:
- Add end-to-end test for webpage to verify the label and text
exisit in catalog debug page.
- Launch Impala and activate some tables to see the table loading
time showed successfully on the catalog debug UI page.

Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/webserver/test_web_pages.py
M www/catalog.tmpl
9 files changed, 145 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/4
--
To view, visit http://gerrit.cloudera.org:8080/14600
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
Gerrit-Change-Number: 14600
Gerrit-PatchSet: 4
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable

2019-10-31 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14611


Change subject: IMPALA-9110: Add table loading time break-down metrics for 
HdfsTable
..

IMPALA-9110: Add table loading time break-down metrics for HdfsTable

Catalog table loading currently only records the total loading
time. We will need some break-down times, i.e. more detailed
time recording on each loading function.

We added "table-schema-load-duration", "all-column-stats-load
-duration", "all-column-stats-load-duration". And we found that
"storage-metadata-load-duration" has already represent partition
and file metadata loading. So now we have a more detailed
breakdown time for table loading info.

Test:
Will add test after the first round review.

Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
2 files changed, 60 insertions(+), 40 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14611/1
--
To view, visit http://gerrit.cloudera.org:8080/14611
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
Gerrit-Change-Number: 14611
Gerrit-PatchSet: 1
Gerrit-Owner: Jiawei Wang 


[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

2019-10-31 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/14600 )

Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
..

IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

Add functions in CatalogUsageMonitor to monitor and report the
catalog usage of the tables have the longest metadata loading
time. Add the sorted table in Catalog server web-ui. The loading
time is cacualted by the median from load_duration metrics.

Testing:
- Add end-to-end test for webpage to verify the label and text
exisit in catalog debug page.
- Launch Impala and activate some tables to see the table loading
time showed successfully on the catalog debug UI page.

Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/webserver/test_web_pages.py
M www/catalog.tmpl
9 files changed, 145 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/3
--
To view, visit http://gerrit.cloudera.org:8080/14600
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
Gerrit-Change-Number: 14600
Gerrit-PatchSet: 3
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-9105: Catalog debug page top-n table has a URL generation issue

2019-10-31 Thread Jiawei Wang (Code Review)
Hello Quanlong Huang, Vihang Karajgaonkar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14589

to look at the new patch set (#3).

Change subject: IMPALA-9105: Catalog debug page top-n table has a URL 
generation issue
..

IMPALA-9105: Catalog debug page top-n table has a URL generation issue

What's the problem:
Previously, catalogd debug page does not have 'fqtn' property. Because
CatalogServer::GetCatalogUsage() does not pass fqtn to the template.
Thus, the generated page table detail page cannot be opened. Also,
Someone might accedentially deleted  which makes the catalogd UI
top-K page unorganized.

Fix:
Correct the URL generation part.

Test:
Open the page and all the tables object can be correctly opened.

Change-Id: If7416c79baf2e78d6790995e97d9802ec7a8cc37
---
M www/catalog.tmpl
1 file changed, 4 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/14589/3
--
To view, visit http://gerrit.cloudera.org:8080/14589
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If7416c79baf2e78d6790995e97d9802ec7a8cc37
Gerrit-Change-Number: 14589
Gerrit-PatchSet: 3
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

2019-10-30 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14600


Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
..

IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

Add functions in CatalogUsageMonitor to monitor and report the
catalog usage of the tables have the longest metadata loading
time. Add the sorted table in Catalog server web-ui. The loading
time is cacualted by the median from load_duration metrics.

Testing:
Launch Impala and activate some tables to see the table loading
time showed successfully on the catalog debug UI page.

Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M www/catalog.tmpl
8 files changed, 132 insertions(+), 13 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/2
--
To view, visit http://gerrit.cloudera.org:8080/14600
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
Gerrit-Change-Number: 14600
Gerrit-PatchSet: 2
Gerrit-Owner: Jiawei Wang 


[Impala-ASF-CR] IMPALA-9105: Catalog debug page top-n table has a URL generation issue

2019-10-30 Thread Jiawei Wang (Code Review)
Hello Quanlong Huang, Vihang Karajgaonkar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14589

to look at the new patch set (#2).

Change subject: IMPALA-9105: Catalog debug page top-n table has a URL 
generation issue
..

IMPALA-9105: Catalog debug page top-n table has a URL generation issue

What's the problem:
Previously, catalogd debug page does not have 'fqtn' property. Because
CatalogServer::GetCatalogUsage() does not pass fqtn to the template.
Thus, the generated page table detail page cannot be opened. Also,
Someone might accedentially deleted  which makes the catalogd UI
top-K page unorganized.

Fix:
Correct the URL generation part.

Test:
Open the page and all the tables object can be correctly opened.

Change-Id: If7416c79baf2e78d6790995e97d9802ec7a8cc37
---
M www/catalog.tmpl
1 file changed, 4 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/14589/2
--
To view, visit http://gerrit.cloudera.org:8080/14589
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If7416c79baf2e78d6790995e97d9802ec7a8cc37
Gerrit-Change-Number: 14589
Gerrit-PatchSet: 2
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9105: Catalog debug page top-n table has a URL generation issue

2019-10-30 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14589 )

Change subject: IMPALA-9105: Catalog debug page top-n table has a URL 
generation issue
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14589/1/www/catalog.tmpl
File www/catalog.tmpl:

http://gerrit.cloudera.org:8080/#/c/14589/1/www/catalog.tmpl@42
PS1, Line 42:   
> Oops!
Yeah, I think someone accidentally deleted this.


http://gerrit.cloudera.org:8080/#/c/14589/1/www/catalog.tmpl@47
PS1, Line 47:{{name}}
> Is {{fqtn}} empty here causing the problem? Looks like we still use {{fqtn}
Yes, fqtn empty caused the issue. The problem is that these "Top-n" tables are 
passed in a different function in CatalogServer::GetCatalogUsage. And we did 
not pass fqtn there.



--
To view, visit http://gerrit.cloudera.org:8080/14589
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If7416c79baf2e78d6790995e97d9802ec7a8cc37
Gerrit-Change-Number: 14589
Gerrit-PatchSet: 1
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 31 Oct 2019 03:34:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9105: Catalog debug page top-n table has a URL generation issue

2019-10-30 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14589


Change subject: IMPALA-9105: Catalog debug page top-n table has a URL 
generation issue
..

IMPALA-9105: Catalog debug page top-n table has a URL generation issue

What's the problem:
Previously, catalogd debug page does not have 'fqtn' property.
Thus, the generated page table detail page cannot be opened.

Fix:
Correct the URL generation part.

Test:
Open the page and all the tables object can be correctly opened.

Change-Id: If7416c79baf2e78d6790995e97d9802ec7a8cc37
---
M www/catalog.tmpl
1 file changed, 4 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/14589/1
--
To view, visit http://gerrit.cloudera.org:8080/14589
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If7416c79baf2e78d6790995e97d9802ec7a8cc37
Gerrit-Change-Number: 14589
Gerrit-PatchSet: 1
Gerrit-Owner: Jiawei Wang 


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-08-06 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13801 )

Change subject: IMPALA-5149: Provide query profile in JSON format
..


Patch Set 14:

(14 comments)

> Uploaded patch set 14.

This has been a long process, thanks for your patience on this!

http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/service/impala-server.cc@848
PS13, Line 848: Of
> offset: $1
Done


http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/service/impala-server.cc@2026
PS13, Line 2026: json_profile(rapidj
> json_profile. No need for "_rapid".
Done


http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/util/runtime-profile-test.cc
File be/src/util/runtime-profile-test.cc:

http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/util/runtime-profile-test.cc@1166
PS13, Line 1166:   RuntimeProfile* profile_ab = RuntimeProfile::Create(&pool, 
"ProfileAb");
> The name of the profile should probably match the variable name.
Done


http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/util/runtime-profile-test.cc@1204
PS13, Line 1204:   for (auto& itr : content["counters"].GetArray()) {
   : // check normal Counter
> Use iterator style like the following. Same at other places.
Done


http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/util/runtime-profile-test.cc@1218
PS13, Line 1218:   }
> Do we want to check there are no other counter types in content ?
Done


http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/util/runtime-profile.h
File be/src/util/runtime-profile.h:

http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/util/runtime-profile.h@622
PS13, Line 622: A map of counters name to counter
> A map of counter's name to counter
Done


http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/util/runtime-profile.h@625
PS13, Line 625: ame, const Counte
> This needs to be documented too.
Done


http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/util/runtime-profile.cc@704
PS13, Line 704: e information
> What does all the other information mean ?
Done


http://gerrit.cloudera.org:8080/#/c/13801/13/tests/webserver/test_web_pages.py
File tests/webserver/test_web_pages.py:

http://gerrit.cloudera.org:8080/#/c/13801/13/tests/webserver/test_web_pages.py@571
PS13, Line 571:download_link = 
"query_profile_plain_text?query_id={0}".format(query_id)
  : assert down
> nit: one line
Done


http://gerrit.cloudera.org:8080/#/c/13801/13/tests/webserver/test_web_pages.py@587
PS13, Line 587:
> nit: indent 4 for line continuation
Done


http://gerrit.cloudera.org:8080/#/c/13801/13/tests/webserver/test_web_pages.py@594
PS13, Line 594: le:{0}".
> Downloaded ?
Done


http://gerrit.cloudera.org:8080/#/c/13801/13/tests/webserver/test_web_pages.py@595
PS13, Line 595: the query id
> Json pofile
Done


http://gerrit.cloudera.org:8080/#/c/13801/13/tests/webserver/test_web_pages.py@597
PS13, Line 597:
> assert query_id in json_res["contents"]["profile_name"], json_res to help d
Done


http://gerrit.cloudera.org:8080/#/c/13801/13/www/query_profile.tmpl
File www/query_profile.tmpl:

http://gerrit.cloudera.org:8080/#/c/13801/13/www/query_profile.tmpl@28
PS13, Line 28: (Available Form
> (available formats):
Done



--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 14
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Tue, 06 Aug 2019 20:12:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-08-06 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#14). ( 
http://gerrit.cloudera.org:8080/13801 )

Change subject: IMPALA-5149: Provide query profile in JSON format
..

IMPALA-5149: Provide query profile in JSON format

Description:
Today there is a text and Thrift version of the query profile, but
it would be useful to have a JSON version for portability and machine
consumption. The ultimate goal is to have a Download JSON format
profile link along with the other two formats.

Modification:
1.Add Json format download option in impalad profile page
2.Add ToJson() function for RuntimeProfile, Counters, EventSequence
3.Add JSON format into QueryStateRecord
4.Add tests for E2E test, unit test, hs2 test
5.Modify query profile page to different download option
6.Modify HS2 server to support get JSON format profile

Future compatibility:
The schema of the JSON format can be changed in the future with
the standardization of Profile and Counter structure.

Tests:
E2E tests:
tests/webserver/test_web_pages.py - test_download_profile
 * merge text and json format download together
HS2 tests:
tests/hs2/test_hs2.py - test_get_profile
 * add json format test
BE Unit tests:
be/src/util/runtime-profile-test.cc
 * ToJson.RuntimeProfileToJsonTest
 * ToJson.EmptyTest
 * ToJson.EventSequenceToJsonTest
 * ToJson.TimeSeriesCounterToJsonTest

Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
---
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-http-handler.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/CMakeLists.txt
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/ImpalaService.thrift
M common/thrift/RuntimeProfile.thrift
M tests/hs2/test_hs2.py
M tests/webserver/test_web_pages.py
M www/query_profile.tmpl
16 files changed, 646 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/14
--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 14
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-08-05 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#13). ( 
http://gerrit.cloudera.org:8080/13801 )

Change subject: IMPALA-5149: Provide query profile in JSON format
..

IMPALA-5149: Provide query profile in JSON format

Description:
Today there is a text and Thrift version of the query profile, but
it would be useful to have a JSON version for portability and machine
consumption. The ultimate goal is to have a Download JSON format
profile link along with the other two formats.

Modification:
1.Add Json format download option in impalad profile page
2.Add ToJson() function for RuntimeProfile, Counters, EventSequence
3.Add JSON format into QueryStateRecord
4.Add tests for E2E test, unit test, hs2 test
5.Modify query profile page to different download option
6.Modify HS2 server to support get JSON format profile

Future compatibility:
The schema of the JSON format can be changed in the future with
the standardization of Profile and Counter structure.

Tests:
E2E tests:
tests/webserver/test_web_pages.py - test_download_profile
 * merge text and json format download together
HS2 tests:
tests/hs2/test_hs2.py - test_get_profile
 * add json format test
BE Unit tests:
be/src/util/runtime-profile-test.cc
 * ToJson.RuntimeProfileToJsonTest
 * ToJson.EmptyTest
 * ToJson.EventSequenceToJsonTest
 * ToJson.TimeSeriesCounterToJsonTest

Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
---
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-http-handler.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/CMakeLists.txt
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/ImpalaService.thrift
M common/thrift/RuntimeProfile.thrift
M tests/hs2/test_hs2.py
M tests/webserver/test_web_pages.py
M www/query_profile.tmpl
16 files changed, 649 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/13
--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 13
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-30 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#12). ( 
http://gerrit.cloudera.org:8080/13801 )

Change subject: IMPALA-5149: Provide query profile in JSON format
..

IMPALA-5149: Provide query profile in JSON format

Description:
Today there is a text and Thrift version of the query profile, but
it would be useful to have a JSON version for portability and machine
consumption. The ultimate goal is to have a Download JSON format
profile link along with the other two formats.

Modification:
1.Add Json format download option in impalad profile page
2.Add ToJson() function for RuntimeProfile, Counters, EventSequence
3.Add JSON format into QueryStateRecord
4.Add tests for E2E test, unit test, hs2 test
5.Modify query profile page to different download option
6.Modify HS2 server to support get JSON format profile

Future compatibility:
The schema of the JSON format can be changed in the future with
the standardization of Profile and Counter structure.

Tests:
E2E tests:
tests/webserver/test_web_pages.py - test_download_profile
 * merge text and json format download together
HS2 tests:
tests/hs2/test_hs2.py - test_get_profile
 * add json format test
BE Unit tests:
be/src/util/runtime-profile-test.cc
 * ToJson.RuntimeProfileToJsonTest
 * ToJson.EmptyTest
 * ToJson.EventSequenceToJsonTest
 * ToJson.TimeSeriesCounterToJsonTest

Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
---
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-http-handler.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/CMakeLists.txt
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/ImpalaService.thrift
M common/thrift/RuntimeProfile.thrift
M tests/hs2/test_hs2.py
M tests/webserver/test_web_pages.py
M www/query_profile.tmpl
16 files changed, 642 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/12
--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 12
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-26 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#11). ( 
http://gerrit.cloudera.org:8080/13801 )

Change subject: IMPALA-5149: Provide query profile in JSON format
..

IMPALA-5149: Provide query profile in JSON format

Description:
Today there is a text and Thrift version of the query profile, but
it would be useful to have a JSON version for portability and machine
consumption. The ultimate goal is to have a Download JSON format
profile link along with the other two formats.

Modification:
1.Add Json format download option in impalad profile page
2.Add ToJson() function for RuntimeProfile, Counters, EventSequence
3.Add JSON format into QueryStateRecord
4.Add tests for E2E test, unit test, hs2 test
5.Modify query profile page to different download option
6.Modify HS2 server to support get JSON format profile

Future compatibility:
The schema of the JSON format can be changed in the future with
the standardization of Profile and Counter structure.

Tests:
E2E tests:
tests/webserver/test_web_pages.py - test_download_profile
 * merge text and json format download together
HS2 tests:
tests/hs2/test_hs2.py - test_get_profile
 * add json format test
BE Unit tests:
be/src/util/runtime-profile-test.cc
 * ToJson.RuntimeProfileToJsonTest
 * ToJson.EmptyTest
 * ToJson.EventSequenceToJsonTest
 * ToJson.TimeSeriesCounterToJsonTest

Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
---
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-http-handler.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/CMakeLists.txt
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/ImpalaService.thrift
M common/thrift/RuntimeProfile.thrift
M tests/hs2/test_hs2.py
M tests/webserver/test_web_pages.py
M www/query_profile.tmpl
16 files changed, 642 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/11
--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 11
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-26 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13801 )

Change subject: IMPALA-5149: Provide query profile in JSON format
..


Patch Set 10:

(23 comments)

Thanks for the valuable feedback! Looking forward to the next step and the 
version control discussion.

http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/service/impala-hs2-server.cc
File be/src/service/impala-hs2-server.cc:

http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/service/impala-hs2-server.cc@988
PS9, Line 988:   } else {
> nit: missing blank space after }
Done


http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/service/impala-server.cc@101
PS9, Line 101: #include "common/names.h"
 :
 : using boost::adopt_lock_t;
 : using boost::algorithm::is_an
> Why not group these together with other rapidjson #include above ? There se
Done


http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/service/impala-server.cc@783
PS9, Line 783: turn Status::OK();
> DCHECK_EQ(format, TRuntimeProfileFormat::STRING);
Done


http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/service/impala-server.cc@818
PS9, Line 818:
> DCHECK_EQ(format, TRuntimeProfileFormat::STRING);
Done


http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h
File be/src/util/runtime-profile-counters.h:

http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@126
PS9, Line 126:   string CounterType() const override {
> void ToJson(rapidjson::Document& document, rapidjson::Value* val) const ove
Done


http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@131
PS9, Line 131:
> Actually, do you need to output this in json ? It seems sufficient to just
Done


http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@165
PS9, Line 165:  private:
 :   SampleFunction counter_fn_;
 : };
 :
 : /// An AveragedCounter maintains a set of counters and its value 
is the
 : /// average of the values in that set. The average is updated 
through calls
 : ///
> If you take the suggestion in Counter::ToJson(), you probably don't need th
Done


http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@219
PS9, Line 219:  private:
 :   /// Map from counters to their existing values. Modified via 
UpdateCounter().
 :   boost::unordered_map counter
> See comments in Counter::ToJson() on why we may not need this.
Done


http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@222
PS9, Line 222:
 :   /// Current sums of values from counter_value_map_. Only one 
of these is used,
 :   /// depending on the unit of the counter. current_double_sum_ 
is used for
 :   /// DOUBLE_VALUE, current_int_sum_ otherwise.
 :   dou
> Same question as above. Do you need to store the internal states (i.e. sum)
Done


http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@290
PS9, Line 290:   /// Summary statistics of values seen so far.
> virtual override
Done


http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@292
PS9, Line 292: t64_t max_;
> Why is "kind" removed ? May be worth adding a comment here.
Done


http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@406
PS9, Line 406: void SortEvents() {
> Please add a comment about the output format.
Done


http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@452
PS9, Line 452: ///  “unit” : xxx,
> It'd be nice to add a comment about the output format.
Done


http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@583
PS9, Line 583:
 :   void Add(int64_t delta) override {
 : DCHECK(false);
 :   }
 :
 :   string CounterType() const override {
 :
> Similar to comments above, this may not be needed since Counter::ToJson() m
Done


http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile.h
File be/src/util/runtime-profile.h:

http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile.h@123
PS9, Line 123: counter_name, value, kind, unit
> is this list actually accurate ? What about "kind"  and "unit" ?
Done


http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile.cc@709
PS9, Line 709: }
> nit: unnecessary blank line
Done


http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile.cc@797
PS9, Line 797:   Value event_sequences_rapid(kArrayType);
> nit: 

[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-26 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/13801 )

Change subject: IMPALA-5149: Provide query profile in JSON format
..

IMPALA-5149: Provide query profile in JSON format

Description:
Today there is a text and Thrift version of the query profile, but
it would be useful to have a JSON version for portability and machine
consumption. The ultimate goal is to have a Download JSON format
profile link along with the other two formats.

Modification:
1.Add Json format download option in impalad profile page
2.Add ToJson() function for RuntimeProfile, Counters, EventSequence
3.Add JSON format into QueryStateRecord
4.Add tests for E2E test, unit test, hs2 test
5.Modify query profile page to different download option
6.Modify HS2 server to support get JSON format profile

Future compatibility:
The schema of the JSON format can be changed in the future with
the standardization of Profile and Counter structure.

Tests:
E2E tests:
tests/webserver/test_web_pages.py - test_download_profile
 * merge text and json format download together
HS2 tests:
tests/hs2/test_hs2.py - test_get_profile
 * add json format test
BE Unit tests:
be/src/util/runtime-profile-test.cc
 * ToJson.RuntimeProfileToJsonTest
 * ToJson.EmptyTest
 * ToJson.EventSequenceToJsonTest
 * ToJson.TimeSeriesCounterToJsonTest

Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
---
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-http-handler.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/CMakeLists.txt
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/ImpalaService.thrift
M common/thrift/RuntimeProfile.thrift
M tests/hs2/test_hs2.py
M tests/webserver/test_web_pages.py
M www/query_profile.tmpl
16 files changed, 642 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/10
--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 10
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-24 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13801 )

Change subject: IMPALA-5149: Provide query profile in JSON format
..


Patch Set 9:

(4 comments)

Done, thanks for the feedback. I think we need more discussion on the version 
control thing as Andrew and Bharath mentioned.

http://gerrit.cloudera.org:8080/#/c/13801/8/be/src/util/runtime-profile-counters.h
File be/src/util/runtime-profile-counters.h:

http://gerrit.cloudera.org:8080/#/c/13801/8/be/src/util/runtime-profile-counters.h@292
PS8, Line 292: val->RemoveMember("kind");
> Should we not update the kind ? Why remove it?
SummaryStatsCounter and TimeSeriesCounter are treated separately with normal 
counters in our JSON output. So a field like "kind" would be redundant.

“counters”: [
{
“counter_name”: xxx,
“value”: xxx,
“kind”: xxx,
“unit”: xxx,
“child_counters”: [{...}]// same structure as counter
…  //type special values
},{...}
],
“summary_stats_counters” : [
{
“counter_name”: xxx,
“value”: xxx,
“unit”: xxx,
“min”: xxx,
“max”: xxx
“avg”: xxx
“num_of_samples”: xxx

},{...}
]

The difference as you can see from the schema, it's that we already know all 
the counters from summary_stats_counters belongs to SummaryStatsCounter. So it 
would not be necessary to keep the "kind" field again.


http://gerrit.cloudera.org:8080/#/c/13801/8/tests/hs2/test_hs2.py
File tests/hs2/test_hs2.py:

http://gerrit.cloudera.org:8080/#/c/13801/8/tests/hs2/test_hs2.py@668
PS8, Line 668:
> This looks flaky. Instead assert that they exist before you can loop throug
Done


http://gerrit.cloudera.org:8080/#/c/13801/8/tests/hs2/test_hs2.py@670
PS8, Line 670: if ("child_profiles" not in json_res["contents"]) or \
> dump the json_res if the assert fails, helps with debugging test failures.
Done


http://gerrit.cloudera.org:8080/#/c/13801/8/tests/webserver/test_web_pages.py
File tests/webserver/test_web_pages.py:

http://gerrit.cloudera.org:8080/#/c/13801/8/tests/webserver/test_web_pages.py@594
PS8, Line 594:   assert False, "Download JSON format query profile 
cannot be parsed. " \
> dump json
Done



--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 9
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 25 Jul 2019 02:20:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-24 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/13801 )

Change subject: IMPALA-5149: Provide query profile in JSON format
..

IMPALA-5149: Provide query profile in JSON format

Description:
Today there is a text and Thrift version of the query profile, but
it would be useful to have a JSON version for portability and machine
consumption. The ultimate goal is to have a Download JSON format
profile link along with the other two formats.

Modification:
1.Add Json format download option in impalad profile page
2.Add ToJson() function for RuntimeProfile, Counters, EventSequence
3.Add JSON format into QueryStateRecord
4.Add tests for E2E test, unit test, hs2 test
5.Modify query profile page to different download option
6.Modify HS2 server to support get JSON format profile

Tests:
E2E tests:
tests/webserver/test_web_pages.py - test_download_profile
 * merge text and json format download together
HS2 tests:
tests/hs2/test_hs2.py - test_get_profile
 * add json format test
BE Unit tests:
be/src/util/runtime-profile-test.cc
 * ToJson.RuntimeProfileToJsonTest
 * ToJson.EmptyTest
 * ToJson.EventSequenceToJsonTest
 * ToJson.TimeSeriesCounterToJsonTest

Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
---
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-http-handler.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/CMakeLists.txt
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/ImpalaService.thrift
M common/thrift/RuntimeProfile.thrift
M tests/hs2/test_hs2.py
M tests/webserver/test_web_pages.py
M www/query_profile.tmpl
16 files changed, 620 insertions(+), 45 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/9
--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 9
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-8772: Import Testcase failed for SQL without table refs

2019-07-24 Thread Jiawei Wang (Code Review)
Hello Bharath Vissapragada, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13893

to look at the new patch set (#4).

Change subject: IMPALA-8772: Import Testcase failed for SQL without table refs
..

IMPALA-8772: Import Testcase failed for SQL without table refs

Description:
Query like this: select 5 * 4; can generate valid testcase, but
cannot be loaded because it does not involve any table and view
references.

Fix:
Add null check for CatalogOpExecutor when doing testcase data load

Tests:
tests/metadata/test_testcase_builder

Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e
TODO: NEED TO DELETE GENERATED TESTCASE AFTER THIS IS DONE
---
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A tests/metadata/test_testcase_builder.py
2 files changed, 92 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/13893/4
--
To view, visit http://gerrit.cloudera.org:8080/13893
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e
Gerrit-Change-Number: 13893
Gerrit-PatchSet: 4
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 


[Impala-ASF-CR] IMPALA-8772: Import Testcase failed for SQL without table refs

2019-07-24 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13893 )

Change subject: IMPALA-8772: Import Testcase failed for SQL without table refs
..


Patch Set 3:

(1 comment)

Thanks for the feedback.

http://gerrit.cloudera.org:8080/#/c/13893/2/tests/metadata/test_testcase_builder.py
File tests/metadata/test_testcase_builder.py:

http://gerrit.cloudera.org:8080/#/c/13893/2/tests/metadata/test_testcase_builder.py@51
PS2, Line 51: try:
:   # Test load testcase works
> you can use delete_file_dir from hdfs_util.py . Also please put it in a fin
Done



--
To view, visit http://gerrit.cloudera.org:8080/13893
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e
Gerrit-Change-Number: 13893
Gerrit-PatchSet: 3
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Comment-Date: Thu, 25 Jul 2019 01:55:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8772: Import Testcase failed for SQL without table refs

2019-07-24 Thread Jiawei Wang (Code Review)
Hello Bharath Vissapragada, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13893

to look at the new patch set (#3).

Change subject: IMPALA-8772: Import Testcase failed for SQL without table refs
..

IMPALA-8772: Import Testcase failed for SQL without table refs

Description:
Query like this: select 5 * 4; can generate valid testcase, but
cannot be loaded because it does not involve any table and view
references.

Fix:
Add null check for CatalogOpExecutor when doing testcase data load

Tests:
tests/metadata/test_testcase_builder

Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e
TODO: NEED TO DELETE GENERATED TESTCASE AFTER THIS IS DONE
---
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A tests/metadata/test_testcase_builder.py
2 files changed, 93 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/13893/3
--
To view, visit http://gerrit.cloudera.org:8080/13893
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e
Gerrit-Change-Number: 13893
Gerrit-PatchSet: 3
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 


[Impala-ASF-CR] IMPALA-8772: Import Testcase failed for SQL without table refs

2019-07-23 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13893 )

Change subject: IMPALA-8772: Import Testcase failed for SQL without table refs
..


Patch Set 2:

(5 comments)

Thanks for your feedback!

http://gerrit.cloudera.org:8080/#/c/13893/1/tests/metadata/test_testcase_builder.py
File tests/metadata/test_testcase_builder.py:

http://gerrit.cloudera.org:8080/#/c/13893/1/tests/metadata/test_testcase_builder.py@41
PS1, Line 41: ate_query = "
> execute_query_expect_success
Done


http://gerrit.cloudera.org:8080/#/c/13893/1/tests/metadata/test_testcase_builder.py@43
PS1, Line 43:
> assert len(result.data) == 1 ?
Done


http://gerrit.cloudera.org:8080/#/c/13893/1/tests/metadata/test_testcase_builder.py@48
PS1, Line 48: self.execute_query_expect_success(self.client, 
testcase_load_query)
> Implement the TODO?
Done


http://gerrit.cloudera.org:8080/#/c/13893/1/tests/metadata/test_testcase_builder.py@49
PS1, Line 49:
> flake8: W391 blank line at end of file
Done


http://gerrit.cloudera.org:8080/#/c/13893/1/tests/metadata/test_testcase_builder.py@49
PS1, Line 49:
> Yep, remove these blank lines?
Done



--
To view, visit http://gerrit.cloudera.org:8080/13893
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e
Gerrit-Change-Number: 13893
Gerrit-PatchSet: 2
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Comment-Date: Tue, 23 Jul 2019 18:18:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8772: Import Testcase failed for SQL without table refs

2019-07-23 Thread Jiawei Wang (Code Review)
Hello Bharath Vissapragada, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13893

to look at the new patch set (#2).

Change subject: IMPALA-8772: Import Testcase failed for SQL without table refs
..

IMPALA-8772: Import Testcase failed for SQL without table refs

Description:
Query like this: select 5 * 4; can generate valid testcase, but
cannot be loaded because it does not involve any table and view
references.

Fix:
Add null check for CatalogOpExecutor when doing testcase data load

Tests:
tests/metadata/test_testcase_builder

Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e
TODO: NEED TO DELETE GENERATED TESTCASE AFTER THIS IS DONE
---
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A tests/metadata/test_testcase_builder.py
2 files changed, 87 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/13893/2
--
To view, visit http://gerrit.cloudera.org:8080/13893
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e
Gerrit-Change-Number: 13893
Gerrit-PatchSet: 2
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8772: Import Testcase failed for SQL without table refs

2019-07-22 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13893


Change subject: IMPALA-8772: Import Testcase failed for SQL without table refs
..

IMPALA-8772: Import Testcase failed for SQL without table refs

Description:
Query like this: select 5 * 4; can generate valid testcase, but
cannot be loaded because it does not involve any table and view
references.

Fix:
Add null check for CatalogOpExecutor when doing testcase data load

Tests:
tests/metadata/test_testcase_builder

Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e
TODO: NEED TO DELETE GENERATED TESTCASE AFTER THIS IS DONE
---
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A tests/metadata/test_testcase_builder.py
2 files changed, 84 insertions(+), 31 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/13893/1
--
To view, visit http://gerrit.cloudera.org:8080/13893
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e
Gerrit-Change-Number: 13893
Gerrit-PatchSet: 1
Gerrit-Owner: Jiawei Wang 


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-15 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13801 )

Change subject: IMPALA-5149: Provide query profile in JSON format
..


Patch Set 8:

(17 comments)

Thanks so much for the valid feedback. Have adopted the changes you suggested. 
Also add some more unit tests. Let me know if the code need more changes. 
Thanks!

http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/service/impala-http-handler.cc
File be/src/service/impala-http-handler.cc:

http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/service/impala-http-handler.cc@270
PS6, Line 270:   if (format != TRuntimeProfileFormat::JSON){
> nit: Add a quick comment why?
Done


http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-counters.h
File be/src/util/runtime-profile-counters.h:

http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-counters.h@126
PS6, Line 126:
> nit: space const { .
Done


http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-counters.h@128
PS6, Line 128: rapidjson::Value::MemberIterator type_itr = 
val->FindMember("kind");
> Add DHCECK (type_itr != val->MemberEnd())? (multiple places below too)
Done


http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-counters.h@164
PS6, Line 164:
> same (multiple places below)
Done


http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-counters.h@289
PS6, Line 289:
> why this?
I am moving this away because SummaryStatsCounter and TimeSeriesCounter are all 
separate in our output JSON schema.

“counters”: [
{
“counter_name”: xxx,
“value”: xxx,
“kind”: xxx,
“unit”: xxx,
“child_counters”: [{...}]// same structure as counter
…  //type special values
},{...}
],
“summary_stats_counters” : [
{
“counter_name”: xxx,
“value”: xxx,
“unit”: xxx,
“min”: xxx,
“max”: xxx
“avg”: xxx
“num_of_samples”: xxx

},{...}
]

I feel like it is a redundant content if we add its "kind".

What do you think? I can always add it back if you feel that's better.


http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-counters.h@403
PS6, Line 403:
> move to cc file?
Done


http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-counters.h@421
PS6, Line 421:   EventList events_;
> instantiate value = ...directly in L407?
I am not sure what to do with this... I am using the same schema as metric 
collection

https://github.com/apache/impala/blob/95a1da2d32ea7b28585ad574b22c2bb9dd921029/be/src/util/collection-metrics.cc#L28


http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-test.cc
File be/src/util/runtime-profile-test.cc:

http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-test.cc@1173
PS6, Line 1173:
> check other units too and assert the unit data?
Done


http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-test.cc@1187
PS6, Line 1187:   // Serialize to json
> some tests for info_strings and event_sequences too?
Done


http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-test.cc@1202
PS6, Line 1202:   // Check counter value matches
> check other TimeSeriesCounter implementations too?
Done


http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile.cc@705
PS6, Line 705: );
> remove rapidjson classifiers with using...
Done


http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile.cc@713
PS6, Line 713:  CounterM
> use d->GetAllocator() (avoid temp variable)
Done


http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile.cc@735
PS6, Line 735:   }
> formatting
Done


http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile.cc@1590
PS6, Line 1590:   lock_guard lock(lock_);
> formatting
Done


http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile.cc@1600
PS6, Line 1600:
> formatting
Done


http://gerrit.cloudera.org:8080/#/c/13801/6/tests/hs2/test_hs2.py
File tests/hs2/test_hs2.py:

http://gerrit.cloudera.org:8080/#/c/13801/6/tests/hs2/test_hs2.py@659
PS6, Line 659: get_profile_req.format = 3  # json format
> why not do this in the above test after L654?
Yeah, I can put this inside the above test.


http://gerrit.cloudera.org:8080/#/c/13801/6/tests/webserver/test_web_pages.py
File tests/webserver/test_web_pages.py:

http://gerrit.cloudera.org:8080/#/c/13801/6/tests/webserver/test_web_pages.py@585
PS6, Line 585: # the query is in the file.
> merge with test_download_text_profile? You can do something like
Done



--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id

[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-15 Thread Jiawei Wang (Code Review)
Hello Bharath Vissapragada, Greg Rahn, Quanlong Huang, David Rorke, David 
Knupp, Sahil Takiar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13801

to look at the new patch set (#8).

Change subject: IMPALA-5149: Provide query profile in JSON format
..

IMPALA-5149: Provide query profile in JSON format

Description:
Today there is a text and Thrift version of the query profile, but
it would be useful to have a JSON version for portability and machine
consumption. The ultimate goal is to have a Download JSON format
profile link along with the other two formats.

Modification:
1.Add Json format download option in impalad profile page
2.Add ToJson() function for RuntimeProfile, Counters, EventSequence
3.Add JSON format into QueryStateRecord
4.Add tests for E2E test, unit test, hs2 test
5.Modify query profile page to different download option
6.Modify HS2 server to support get JSON format profile

Tests:
E2E tests:
tests/webserver/test_web_pages.py - test_download_profile
 * merge text and json format download together
HS2 tests:
tests/hs2/test_hs2.py - test_get_profile
 * add json format test
BE Unit tests:
be/src/util/runtime-profile-test.cc
 * ToJson.RuntimeProfileToJsonTest
 * ToJson.EmptyTest
 * ToJson.EventSequenceToJsonTest
 * ToJson.TimeSeriesCounterToJsonTest

Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
---
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-http-handler.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/CMakeLists.txt
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/ImpalaService.thrift
M common/thrift/RuntimeProfile.thrift
M tests/hs2/test_hs2.py
M tests/webserver/test_web_pages.py
M www/query_profile.tmpl
16 files changed, 611 insertions(+), 45 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/8
--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 8
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-15 Thread Jiawei Wang (Code Review)
Hello Bharath Vissapragada, Greg Rahn, Quanlong Huang, David Rorke, David 
Knupp, Sahil Takiar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13801

to look at the new patch set (#7).

Change subject: IMPALA-5149: Provide query profile in JSON format
..

IMPALA-5149: Provide query profile in JSON format

Description:
Today there is a text and Thrift version of the query profile, but
it would be useful to have a JSON version for portability and machine
consumption. The ultimate goal is to have a Download JSON format
profile link along with the other two formats.

Modification:
1.Add Json format download option in impalad profile page
2.Add ToJson() function for RuntimeProfile, Counters, EventSequence
3.Add JSON format into QueryStateRecord
4.Add tests for E2E test, unit test, hs2 test
5.Modify query profile page to different download option
6.Modify HS2 server to support get JSON format profile

Tests:
E2E tests:
tests/webserver/test_web_pages.py - test_download_profile
 * merge text and json format download together
HS2 tests:
tests/hs2/test_hs2.py - test_get_profile
 * add json format test
BE Unit tests:
be/src/util/runtime-profile-test.cc
 * ToJson.RuntimeProfileToJsonTest
 * ToJson.EmptyTest
 * ToJson.EventSequenceToJsonTest
 * ToJson.TimeSeriesCounterToJsonTest

Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
---
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-http-handler.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/CMakeLists.txt
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/ImpalaService.thrift
M common/thrift/RuntimeProfile.thrift
M tests/hs2/test_hs2.py
M tests/webserver/test_web_pages.py
M www/query_profile.tmpl
16 files changed, 611 insertions(+), 45 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/7
--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 7
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-10 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13801 )

Change subject: IMPALA-5149: Provide query profile in JSON format
..


Patch Set 6:

(1 comment)

Thanks for your feedback! I also added some tests for this.

http://gerrit.cloudera.org:8080/#/c/13801/2/be/src/service/impala-hs2-server.cc
File be/src/service/impala-hs2-server.cc:

http://gerrit.cloudera.org:8080/#/c/13801/2/be/src/service/impala-hs2-server.cc@979
PS2, Line 979:   SQLSTATE_GENERAL_ERROR);
 :   if (request.format == TRuntimeProfileFormat::THRIFT) {
 : return_val.__set_thrift_profile(thrift_profile);
 :   } else if (request.format == TRuntimeProfileFormat::JSON) {
 : rapidjson::StringBuffer sb;
 : rapidjson::PrettyWriter writer(sb);
 :
> should there be an equivalent here for JSON?
Done. I am putting the output inside the profile, which is a string.

Also, do we need to adopt this change to impala-beeswax-server? I looked at the 
code there but find that the function there only support text format download.



--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 6
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 11 Jul 2019 00:05:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-10 Thread Jiawei Wang (Code Review)
Hello Bharath Vissapragada, Greg Rahn, David Rorke, David Knupp, Sahil Takiar, 
Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13801

to look at the new patch set (#6).

Change subject: IMPALA-5149: Provide query profile in JSON format
..

IMPALA-5149: Provide query profile in JSON format

Description:
Today there is a text and Thrift version of the query profile, but
it would be useful to have a JSON version for portability and machine
consumption. The ultimate goal is to have a Download JSON format
profile link along with the other two formats.

Modification:
1.Add Json format download option in impalad profile page
2.Add ToJson() function for RuntimeProfile, Counters, EventSequence
3.Add JSON format into QueryStateRecord
4.Add tests for E2E test, unit test, hs2 test
5.Modify query profile page to different download option
6.Modify HS2 server to support get JSON format profile

Tests:
E2E tests:
* tests/webserver/test_web_pages.py - test_download_json_profile
HS2 tests:
* tests/hs2/test_hs2.py - test_get_profile_json
BE Unit tests:
* be/src/util/runtime-profile-test.cc
ToJson.RuntimeProfileToJsonTest
ToJson.TimeSeriesCounterToJsonTest

Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
---
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-http-handler.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/CMakeLists.txt
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/ImpalaService.thrift
M common/thrift/RuntimeProfile.thrift
M tests/hs2/test_hs2.py
M tests/webserver/test_web_pages.py
M www/query_profile.tmpl
16 files changed, 525 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/6
--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 6
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-09 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/13801 )

Change subject: IMPALA-5149: Provide query profile in JSON format
..

IMPALA-5149: Provide query profile in JSON format

Description:
Today there is a text and Thrift version of the query profile, but
it would be useful to have a JSON version for portability and machine
consumption. The ultimate goal is to have a Download JSON format
profile link along with the other two formats.

Modification:
1.Add Json format download option in impala server
2.Add ToJson() function for RuntimeProfile, Counters, EventSequence
3.Add JSON format into QueryStateRecord
3.Add tests for E2E test
3.Modify query profile page to different download option

Tests:
E2E tests:
* tests/webserver/test_web_pages.py - test_download_json_profile

Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
---
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-http-handler.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/RuntimeProfile.thrift
M tests/webserver/test_web_pages.py
M www/query_profile.tmpl
12 files changed, 405 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/5
--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 5
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-09 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/13801 )

Change subject: IMPALA-5149: Provide query profile in JSON format
..

IMPALA-5149: Provide query profile in JSON format

Description:
Today there is a text and Thrift version of the query profile, but
it would be useful to have a JSON version for portability and machine
consumption. The ultimate goal is to have a Download JSON format
profile link along with the other two formats.

Modification:
1.Add Json format download option in impala server
2.Add ToJson() function for RuntimeProfile, Counters, EventSequence
3.Add JSON format into QueryStateRecord
3.Add tests for E2E test
3.Modify query profile page to different download option

Tests:
E2E tests:
* tests/webserver/test_web_pages.py - test_download_json_profile

Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
---
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-http-handler.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/RuntimeProfile.thrift
M tests/webserver/test_web_pages.py
M www/query_profile.tmpl
12 files changed, 410 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/4
--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 4
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-09 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/13801 )

Change subject: IMPALA-5149: Provide query profile in JSON format
..

IMPALA-5149: Provide query profile in JSON format

Description:
Today there is a text and Thrift version of the query profile, but
it would be useful to have a JSON version for portability and machine
consumption. The ultimate goal is to have a Download JSON format
profile link along with the other two formats.

Modification:
1.Add Json format download option in impala server
2.Add ToJson() function for RuntimeProfile, Counters, EventSequence
3.Add JSON format into QueryStateRecord
3.Add tests for E2E test
3.Modify query profile page to different download option

Tests:
E2E tests:
* tests/webserver/test_web_pages.py - test_download_json_profile

Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
---
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-http-handler.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/RuntimeProfile.thrift
M tests/webserver/test_web_pages.py
M www/query_profile.tmpl
12 files changed, 410 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/3
--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 3
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-5973: Provide query plan in JSON format

2019-07-08 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11974 )

Change subject: IMPALA-5973: Provide query plan in JSON format
..


Patch Set 3:

Hi,

I am trying to pick this up. Is there anyone who is familiar with this ticket 
and its process knows where should I start?

Thanks!


--
To view, visit http://gerrit.cloudera.org:8080/11974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0f07e2223be58b7323e1ea2fa44b62626cdf4944
Gerrit-Change-Number: 11974
Gerrit-PatchSet: 3
Gerrit-Owner: Pranay Singh (320)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Pranay Singh (320)
Gerrit-Comment-Date: Mon, 08 Jul 2019 21:35:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-08 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/13801 )

Change subject: IMPALA-5149: Provide query profile in JSON format
..

IMPALA-5149: Provide query profile in JSON format

Today there is a text and Thrift version of the query profile, but
it would be useful to have a JSON version for portability and machine
consumption. The ultimate goal is to have a Download JSON format
profile link along with the other two formats.

Design Document:
https://docs.google.com/document/d/
15P_Lmjf1rlZUD4PZLXdUwTE8Lv_ME3NQ9Yf4MITzO2M/edit

Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
---
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-http-handler.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/RuntimeProfile.thrift
M www/query_profile.tmpl
11 files changed, 364 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/2
--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 2
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-03 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13801


Change subject: IMPALA-5149: Provide query profile in JSON format
..

IMPALA-5149: Provide query profile in JSON format

Today there is a text and Thrift version of the query profile, but
it would be useful to have a JSON version for portability and machine
consumption. The ultimate goal is to have a Download JSON format
profile link along with the other two formats.

Design Document:
https://docs.google.com/document/d/
15P_Lmjf1rlZUD4PZLXdUwTE8Lv_ME3NQ9Yf4MITzO2M/edit

Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
---
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-http-handler.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/RuntimeProfile.thrift
M www/query_profile.tmpl
11 files changed, 363 insertions(+), 11 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/1
--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 1
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8665:Include extra info in error message when date cast fails

2019-06-26 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/13680 )

Change subject: IMPALA-8665:Include extra info in error message when date cast 
fails
..

IMPALA-8665:Include extra info in error message when date cast fails

This change extends the error message Impala yields when casting STRING
to DATE (explicitly or implicitly) fails. The new error message includes
the violating string value.

Testing:
changes -> date-partitioning.test & date.test
query_test/test_date_queries.py test passed

Example:
select cast('20' as date);
ERROR: UDF ERROR: String to Date parse failed. Invalid string val: "20"

Change-Id: If800b7696515cd61afee27220c55ff2440a86f04
---
M be/src/exprs/cast-functions-ir.cc
M testdata/workloads/functional-query/queries/QueryTest/date-partitioning.test
M testdata/workloads/functional-query/queries/QueryTest/date.test
3 files changed, 12 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/13680/8
--
To view, visit http://gerrit.cloudera.org:8080/13680
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04
Gerrit-Change-Number: 13680
Gerrit-PatchSet: 8
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 


[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails

2019-06-25 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/13680 )

Change subject: IMPALA-8665 Include extra info in error message when date cast 
fails
..

IMPALA-8665 Include extra info in error message when date cast fails

This change extends the error message Impala yields when casting STRING
to DATE (explicitly or implicitly) fails. The new error message includes
the violating string value.

Testing:
changes -> date-partitioning.test & date.test
query_test/test_date_queries.py test passed

Example:
select cast('2000' as date);
ERROR: UDF ERROR: String to Date parse failed. Invalid string val: "2000"

Change-Id: If800b7696515cd61afee27220c55ff2440a86f04
---
M be/src/exprs/cast-functions-ir.cc
M testdata/workloads/functional-query/queries/QueryTest/date-partitioning.test
M testdata/workloads/functional-query/queries/QueryTest/date.test
3 files changed, 12 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/13680/7
--
To view, visit http://gerrit.cloudera.org:8080/13680
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04
Gerrit-Change-Number: 13680
Gerrit-PatchSet: 7
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 


[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails

2019-06-25 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/13680 )

Change subject: IMPALA-8665 Include extra info in error message when date cast 
fails
..

IMPALA-8665 Include extra info in error message when date cast fails

This change extends the error message Impala yields when casting STRING
to DATE (explicitly or implicitly) fails. The new error message includes
the violating string value.

Testing:
changes -> date-partitioning.test & date.test
query_test/test_date_queries.py test passed

Example:
select ('2000' as date);
ERROR: UDF ERROR: String to Date parse failed. Invalid string val: "2000"

Change-Id: If800b7696515cd61afee27220c55ff2440a86f04
---
M be/src/exprs/cast-functions-ir.cc
M testdata/workloads/functional-query/queries/QueryTest/date-partitioning.test
M testdata/workloads/functional-query/queries/QueryTest/date.test
3 files changed, 12 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/13680/6
--
To view, visit http://gerrit.cloudera.org:8080/13680
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04
Gerrit-Change-Number: 13680
Gerrit-PatchSet: 6
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 


[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails

2019-06-24 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/13680 )

Change subject: IMPALA-8665 Include extra info in error message when date cast 
fails
..

IMPALA-8665 Include extra info in error message when date cast fails

This change extends the error message Impala yields when casting STRING
to DATE (explicitly or implicitly) fails. The new error message includes
the violating string value.

Testing:
changes -> date-partitioning.test & date.test
query_test/test_date_queries.py test passed

Example:
select ('2000' as Date);
ERROR: UDF ERROR: String to Date parse failed. Invalid string val: "2000"

Change-Id: If800b7696515cd61afee27220c55ff2440a86f04
---
M be/src/exprs/cast-functions-ir.cc
M testdata/workloads/functional-query/queries/QueryTest/date-partitioning.test
M testdata/workloads/functional-query/queries/QueryTest/date.test
3 files changed, 12 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/13680/5
--
To view, visit http://gerrit.cloudera.org:8080/13680
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04
Gerrit-Change-Number: 13680
Gerrit-PatchSet: 5
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 


[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails

2019-06-21 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/13680 )

Change subject: IMPALA-8665 Include extra info in error message when date cast 
fails
..

IMPALA-8665 Include extra info in error message when date cast fails

When users cast string to date. It's hard for users to debug right now
if users running millions of rows casting while casting failed without
extra information.

Solution:
Add the failed data value into error message.

Testing:
changes -> date-partitioning.test & date.test
query_test/test_date_queries.py test passed

Examples:
select ("2000" as Date);
ERROR: UDF ERROR: String to Date parse failed. Invalid string val: "2000"

Change-Id: If800b7696515cd61afee27220c55ff2440a86f04
---
M be/src/exprs/cast-functions-ir.cc
M testdata/workloads/functional-query/queries/QueryTest/date-partitioning.test
M testdata/workloads/functional-query/queries/QueryTest/date.test
3 files changed, 12 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/13680/4
--
To view, visit http://gerrit.cloudera.org:8080/13680
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04
Gerrit-Change-Number: 13680
Gerrit-PatchSet: 4
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 


[Impala-ASF-CR] IMPALA-8652 Illegal delimiter error in shell has unknown error

2019-06-20 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/13690 )

Change subject: IMPALA-8652 Illegal delimiter error in shell has unknown error
..

IMPALA-8652 Illegal delimiter error in shell has unknown error

Problem:
When assign --output_delimiter to invalid value, the validation of
the argument is done only after the query is running, ValueError is
raised in DelimitedOutputFormatter and caught in _exec_stmt in shell

Solution:
Add --output_delimiter option check before impala-shell initialization
Remove delimiter length check in DelimitedOutputFormatter

Testing:
tests/shell/test_shell_commandline.py passed

Example:
$ impala-shell.sh -B --output_delimiter '||' -q 'select 1,1,1'
Illegal delimiter ||, the delimiter must be a 1-character string.

Change-Id: I7ee2fccd305b104b3aff44c57659b6f14f2f4a05
---
M shell/impala_shell.py
M shell/shell_output.py
2 files changed, 9 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/13690/3
--
To view, visit http://gerrit.cloudera.org:8080/13690
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7ee2fccd305b104b3aff44c57659b6f14f2f4a05
Gerrit-Change-Number: 13690
Gerrit-PatchSet: 3
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8652 Illegal delimiter error in shell has unknown error

2019-06-20 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/13690 )

Change subject: IMPALA-8652 Illegal delimiter error in shell has unknown error
..

IMPALA-8652 Illegal delimiter error in shell has unknown error

Problem:
When assign --output_delimiter to invalid value, the validation of
the argument is done only after the query is running, ValueError is
raised in DelimitedOutputFormatter and caught in _exec_stmt in shell

Solution:
Add --output_delimiter option check before impala-shell initialization
Remove delimiter length check in DelimitedOutputFormatter

Testing:
tests/shell/test_shell_commandline.py passed

Change-Id: I7ee2fccd305b104b3aff44c57659b6f14f2f4a05
---
M shell/impala_shell.py
M shell/shell_output.py
2 files changed, 7 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/13690/2
--
To view, visit http://gerrit.cloudera.org:8080/13690
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7ee2fccd305b104b3aff44c57659b6f14f2f4a05
Gerrit-Change-Number: 13690
Gerrit-PatchSet: 2
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8652 Illegal delimiter error in shell has unknown error

2019-06-20 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13690


Change subject: IMPALA-8652 Illegal delimiter error in shell has unknown error
..

IMPALA-8652 Illegal delimiter error in shell has unknown error

Problem:
When assign --output_delimiter to invalid value, the validation of
the argument is done only after the query is running, ValueError is
raised in DelimitedOutputFormatter and caught in _exec_stmt in shell

Solution:
Add --output_delimiter option check before impala-shell initialization
Remove delimiter length check in DelimitedOutputFormatter

Testing:
tests/shell/test_shell_commandline.py passed

Change-Id: I7ee2fccd305b104b3aff44c57659b6f14f2f4a05
---
M shell/impala_shell.py
M shell/shell_output.py
2 files changed, 8 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/13690/1
--
To view, visit http://gerrit.cloudera.org:8080/13690
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7ee2fccd305b104b3aff44c57659b6f14f2f4a05
Gerrit-Change-Number: 13690
Gerrit-PatchSet: 1
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails

2019-06-20 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/13680 )

Change subject: IMPALA-8665 Include extra info in error message when date cast 
fails
..

IMPALA-8665 Include extra info in error message when date cast fails

When users cast string to date. It's hard for users to debug right now
if users running millions of rows casting while casting failed without
extra information.

Solution:
Add the failed data value into error message.

Testing:
changes -> date-partitioning.test & date.test
query_test/test_date_queries.py test passed

Change-Id: If800b7696515cd61afee27220c55ff2440a86f04
---
M be/src/exprs/cast-functions-ir.cc
M testdata/workloads/functional-query/queries/QueryTest/date-partitioning.test
M testdata/workloads/functional-query/queries/QueryTest/date.test
3 files changed, 12 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/13680/3
--
To view, visit http://gerrit.cloudera.org:8080/13680
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04
Gerrit-Change-Number: 13680
Gerrit-PatchSet: 3
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 


[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails

2019-06-20 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13680 )

Change subject: IMPALA-8665 Include extra info in error message when date cast 
fails
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/13680/2/be/src/exprs/cast-functions-ir.cc
File be/src/exprs/cast-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/13680/2/be/src/exprs/cast-functions-ir.cc@310
PS2, Line 310: DateVal CastFunctions::CastToDateVal(FunctionContext* ctx, const 
StringVal& val) {
> Please cover this change with tests. If you grep for "String to Date parse
I have already tested all the tests you mentioned. They all passed. The way 
tests handle this is to find a string match in the output. So without changing 
the old tests, this still passes.

Do you want me to change the tests cases? (i.e. add the additional information 
for string match)


http://gerrit.cloudera.org:8080/#/c/13680/2/be/src/exprs/cast-functions-ir.cc@314
PS2, Line 314: (char *)
> The preferred way of converting here is using reinterpret_cast as seen abov
Done


http://gerrit.cloudera.org:8080/#/c/13680/2/be/src/exprs/cast-functions-ir.cc@316
PS2, Line 316: invalidVal
> Have you tried to simply pass the char* here? Is that a null terminated str
It's not working because StringVal.ptr is not a null-terminated string.


http://gerrit.cloudera.org:8080/#/c/13680/2/be/src/exprs/cast-functions-ir.cc@316
PS2, Line 316: c_str()
> Do you need to convert the return value of Substitute() to char*?
The reason why I used Substitute and .c_str() is that I am following the other 
SetError() examples. The following is one of the example that use SetError() 
function to pass a variable inside a string. To use SetError, char* is required.

agg_fn_ctx_->SetError(Substitute("UDA Serialize() and Finalize() must return 
same pointer as input for $0 intermediate", 
dst_slot_desc.type().DebugString()).c_str());



--
To view, visit http://gerrit.cloudera.org:8080/13680
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04
Gerrit-Change-Number: 13680
Gerrit-PatchSet: 2
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Comment-Date: Thu, 20 Jun 2019 18:14:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails

2019-06-19 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/13680 )

Change subject: IMPALA-8665 Include extra info in error message when date cast 
fails
..

IMPALA-8665 Include extra info in error message when date cast fails

It's hard for users to debug right now if users running millions of
rows casting to date while cast failed without extra information.

Solution:
Adding a fail data value is at least a minimum requirement.

Testing:
query_test/test_date_queries.py test passed

Change-Id: If800b7696515cd61afee27220c55ff2440a86f04
---
M be/src/exprs/cast-functions-ir.cc
1 file changed, 4 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/13680/2
--
To view, visit http://gerrit.cloudera.org:8080/13680
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04
Gerrit-Change-Number: 13680
Gerrit-PatchSet: 2
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails

2019-06-19 Thread Jiawei Wang (Code Review)
Jiawei Wang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13680


Change subject: IMPALA-8665 Include extra info in error message when date cast 
fails
..

IMPALA-8665 Include extra info in error message when date cast fails

It's hard for users to debug right now if users running millions of
rows casting to date while cast failed without extra information.

Solution:
Adding a fail data value is at least a minimum requirement.

Testing:
query_test/test_date_queries.py test passed

Change-Id: If800b7696515cd61afee27220c55ff2440a86f04
---
M be/src/exprs/cast-functions-ir.cc
1 file changed, 3 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/13680/1
--
To view, visit http://gerrit.cloudera.org:8080/13680
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04
Gerrit-Change-Number: 13680
Gerrit-PatchSet: 1
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab