[Impala-ASF-CR] IMPALA-9381: lazy conversion of runtime profile
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15236 ) Change subject: IMPALA-9381: lazy conversion of runtime profile .. Patch Set 3: Code-Review+1 This is so good! I want to do the same when I wrote the runtimeprofile ToJson() code. But I was so lack of the code base and C++ knowledge so I couldn't find a way to do it. This is much more efficient! -- To view, visit http://gerrit.cloudera.org:8080/15236 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic2f5133cc146adc3b044cf4b64aae0a9688449fa Gerrit-Change-Number: 15236 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Comment-Date: Mon, 24 Feb 2020 18:38:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14776 ) Change subject: IMPALA-7550: Add documentation to profile counters .. Patch Set 11: Hi Tim, Seems like there is a test failure JdbcTest.testConcurrentSessionMixedIdleTimeout:656 Not sure if this is a flaky test or something. Can we try to run the test again? -- To view, visit http://gerrit.cloudera.org:8080/14776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 Gerrit-Change-Number: 14776 Gerrit-PatchSet: 11 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 14 Dec 2019 18:55:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14776 ) Change subject: IMPALA-7550: Add documentation to profile counters .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/14776/8/be/src/util/default-path-handlers.cc File be/src/util/default-path-handlers.cc: http://gerrit.cloudera.org:8080/#/c/14776/8/be/src/util/default-path-handlers.cc@54 PS8, Line 54: "The maximum number of bytes to display on the debug webserver's log page"); > Oh right, that happens if you have a declaration in a header, but no defini Always good to learn knowledge like these. Thanks! -- To view, visit http://gerrit.cloudera.org:8080/14776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 Gerrit-Change-Number: 14776 Gerrit-PatchSet: 10 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 13 Dec 2019 20:13:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Hello Andrew Sherman, Lars Volker, David Rorke, Balazs Jeszenszky, Jiawei Wang, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14776 to look at the new patch set (#10). Change subject: IMPALA-7550: Add documentation to profile counters .. IMPALA-7550: Add documentation to profile counters This change changes the way developers define profile counters by generating counters from a counter registry. All the profile counters will be register there first and then used in the same way as before. By doing so, we will be able to manage profile counters in a way that we can define profile counters documentation. For example: Declaration: PROFILE_DEFINE_COUNTER(NumBackends, STABLE_HIGH, TUnit::UNIT, "Number of backends running this query."); Initialization: COUNTER_SET(PROFILE_NumBackends.Instantiate(query_profile_), num_backends); This shall be how we define a NumBackends counter. It follows with its significance, type, description in the declaration part. Users now will be able to view profile counters documentation under query_profile page, there is a Profile Documentation button which leads to /profile_docs. More details: This commit did the following refactors on profile counters. 1. Add a singleton registry for runtime profile counters prototypes, similiar to what Kudu does for metrics. This allows us to generate profile documentation for all counters from the code. We add /profile_docs and a correspoding UI for the documentation of profile counters. 2. Profile counters are also annotated with their significance to users. * STABLE_HIGH - High level and stable counters, always useful for measuring query performance and status. Counters that everyone is interested. should rarely change and if it does we will make some effort to notify users. * STABLE_LOW - Low level and stable counters. Interesting counters to monitor and analyze by machine. It will probably be interesting under some circumstances for users. * Unstable - Unstable but useful. Useful to understand query performance, but subject to change, particularly if the implementation changes. E.g. MaterializeTupleTimer * Debug - Debugging counters. Generally not useful to users of Impala, the main use case is low-level debugging. Can be hidden to reduce noise for most consumers of profiles. 3. We have around 250 counters. This commit did the replacement in scan-node and hdfs-scan-node-base and coordinator. Concers: The downside is that we will have duplicate comments of query profiles both in the header file and the .cc file. Additionally a (arguably good) limitation is that profile counter names need to be unique. Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 --- M be/src/exec/hbase-scan-node.cc M be/src/exec/hbase-scan-node.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/kudu-scan-node-base.cc M be/src/exec/scan-node.cc M be/src/exec/scan-node.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator.cc D be/src/util/debug-counters.h M be/src/util/default-path-handlers.cc M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc A www/profile_docs.tmpl M www/query_profile.tmpl 15 files changed, 678 insertions(+), 273 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/14776/10 -- To view, visit http://gerrit.cloudera.org:8080/14776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 Gerrit-Change-Number: 14776 Gerrit-PatchSet: 10 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14776 ) Change subject: IMPALA-7550: Add documentation to profile counters .. Patch Set 9: (4 comments) Hi Tim, thanks for the review! http://gerrit.cloudera.org:8080/#/c/14776/8/be/src/util/default-path-handlers.cc File be/src/util/default-path-handlers.cc: http://gerrit.cloudera.org:8080/#/c/14776/8/be/src/util/default-path-handlers.cc@54 PS8, Line 54: constexpr ProfileEntryPrototype::Significance ProfileEntryPrototype::ALLSIGNIFICANCE[]; > Include util/runtime-profile-counters.h instead of redeclaring it. I have tried this way but it won't compile... The error is error: undefined reference to 'impala::ProfileEntryPrototype::ALLSIGNIFICANCE' I am not sure if I did it wrong? http://gerrit.cloudera.org:8080/#/c/14776/8/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/14776/8/be/src/util/runtime-profile.cc@109 PS8, Line 109: fo > nit: for Done http://gerrit.cloudera.org:8080/#/c/14776/8/be/src/util/runtime-profile.cc@115 PS8, Line 115: "; > Maybe remove this sentence? Done http://gerrit.cloudera.org:8080/#/c/14776/8/be/src/util/runtime-profile.cc@115 PS8, Line 115: circumstance > nit: circumstances Done -- To view, visit http://gerrit.cloudera.org:8080/14776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 Gerrit-Change-Number: 14776 Gerrit-PatchSet: 9 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 13 Dec 2019 20:04:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Hello Andrew Sherman, Lars Volker, David Rorke, Balazs Jeszenszky, Jiawei Wang, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14776 to look at the new patch set (#9). Change subject: IMPALA-7550: Add documentation to profile counters .. IMPALA-7550: Add documentation to profile counters This change changes the way developers define profile counters by generating counters from a counter registry. All the profile counters will be register there first and then used in the same way as before. By doing so, we will be able to manage profile counters in a way that we can define profile counters documentation. For example: Declaration: PROFILE_DEFINE_COUNTER(NumBackends, STABLE_HIGH, TUnit::UNIT, "Number of backends running this query."); Initialization: COUNTER_SET(PROFILE_NumBackends.Instantiate(query_profile_), num_backends); This shall be how we define a NumBackends counter. It follows with its significance, type, description in the declaration part. Users now will be able to view profile counters documentation under query_profile page, there is a Profile Documentation button which leads to /profile_docs. More details: This commit did the following refactors on profile counters. 1. Add a singleton registry for runtime profile counters prototypes, similiar to what Kudu does for metrics. This allows us to generate profile documentation for all counters from the code. We add /profile_docs and a correspoding UI for the documentation of profile counters. 2. Profile counters are also annotated with their significance to users. * STABLE_HIGH - High level and stable counters, always useful for measuring query performance and status. Counters that everyone is interested. should rarely change and if it does we will make some effort to notify users. * STABLE_LOW - Low level and stable counters. Interesting counters to monitor and analyze by machine. It will probably be interesting under some circumstances for users. * Unstable - Unstable but useful. Useful to understand query performance, but subject to change, particularly if the implementation changes. E.g. MaterializeTupleTimer * Debug - Debugging counters. Generally not useful to users of Impala, the main use case is low-level debugging. Can be hidden to reduce noise for most consumers of profiles. 3. We have around 250 counters. This commit did the replacement in scan-node and hdfs-scan-node-base and coordinator. Concers: The downside is that we will have duplicate comments of query profiles both in the header file and the .cc file. Additionally a (arguably good) limitation is that profile counter names need to be unique. Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 --- M be/src/exec/hbase-scan-node.cc M be/src/exec/hbase-scan-node.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/kudu-scan-node-base.cc M be/src/exec/scan-node.cc M be/src/exec/scan-node.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator.cc D be/src/util/debug-counters.h M be/src/util/default-path-handlers.cc M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc A www/profile_docs.tmpl M www/query_profile.tmpl 15 files changed, 676 insertions(+), 273 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/14776/9 -- To view, visit http://gerrit.cloudera.org:8080/14776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 Gerrit-Change-Number: 14776 Gerrit-PatchSet: 9 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Hello Andrew Sherman, Lars Volker, David Rorke, Balazs Jeszenszky, Jiawei Wang, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14776 to look at the new patch set (#8). Change subject: IMPALA-7550: Add documentation to profile counters .. IMPALA-7550: Add documentation to profile counters This change changes the way developers define profile counters by generating counters from a counter registry. All the profile counters will be register there first and then used in the same way as before. By doing so, we will be able to manage profile counters in a way that we can define profile counters documentation. For example: Declaration: PROFILE_DEFINE_COUNTER(NumBackends, STABLE_HIGH, TUnit::UNIT, "Number of backends running this query."); Initialization: COUNTER_SET(PROFILE_NumBackends.Instantiate(query_profile_), num_backends); This shall be how we define a NumBackends counter. It follows with its significance, type, description in the declaration part. Users now will be able to view profile counters documentation under query_profile page, there is a Profile Documentation button which leads to /profile_docs. More details: This commit did the following refactors on profile counters. 1. Add a singleton registry for runtime profile counters prototypes, similiar to what Kudu does for metrics. This allows us to generate profile documentation for all counters from the code. We add /profile_docs and a correspoding UI for the documentation of profile counters. 2. Profile counters are also annotated with their significance to users. * STABLE_HIGH - High level and stable counters, always useful on measuring query performance and status. Counters that everyone is interested. should rarely change and if it does we will make some effort to notify users. * STABLE_LOW - Low level and stable counters. Interesting counters to monitor and analyze by machine. It will probably be interesting under some circumstance for users. Lots of developers are interested. * Unstable - Unstable but useful. Useful to understand query performance, but subject to change, particularly if the implementation changes. E.g. MaterializeTupleTimer * Debug - Debugging counters. Generally not useful to users of Impala, the main use case is low-level debugging. Can be hidden to reduce noise for most consumers of profiles. 3. We have around 250 counters. This commit did the replacement in scan-node and hdfs-scan-node-base and coordinator. Concers: The downside is that we will have duplicate comments of query profiles both in the header file and the .cc file. Additionally a (arguably good) limitation is that profile counter names need to be unique. Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 --- M be/src/exec/hbase-scan-node.cc M be/src/exec/hbase-scan-node.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/kudu-scan-node-base.cc M be/src/exec/scan-node.cc M be/src/exec/scan-node.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator.cc D be/src/util/debug-counters.h M be/src/util/default-path-handlers.cc M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc A www/profile_docs.tmpl M www/query_profile.tmpl 15 files changed, 676 insertions(+), 273 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/14776/8 -- To view, visit http://gerrit.cloudera.org:8080/14776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 Gerrit-Change-Number: 14776 Gerrit-PatchSet: 8 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14776 ) Change subject: IMPALA-7550: Add documentation to profile counters .. Patch Set 7: (4 comments) Hi Tim, Balazs Thanks for the review! Here is another round code changes. http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@72 PS5, Line 72: : PROFILE_DEFINE_DERIVED_COUNTER(PerReadThreadRawHdfsThrou > As mentioned for the previous counter, all 'total wall clock time' metrics Done, also changed to the same pattern on TotalRawHBaseReadTime. Run "git grep" and did not find other places that have the same descriptions. http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@85 PS5, Line 85: average nu > Yep, there's a single queue (with multiple I/O threads consuming work from Okay, then the original text makes sense here I think. http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/runtime/coordinator.cc@88 PS5, Line 88: ry in exchange no > I was recommending '(across the network)' in parens specifically because it Thanks for explanation. Done http://gerrit.cloudera.org:8080/#/c/14776/5/www/profile_docs.tmpl File www/profile_docs.tmpl: http://gerrit.cloudera.org:8080/#/c/14776/5/www/profile_docs.tmpl@29 PS5, Line 29: Name : Significance : Unit : Description : : : : {{#profile_docs}} : : : {{name}} : : : {{significance}} : : : {{unit}} : : : {{description}} : : > IIUC, right now, the table will only have the significance level (e.g. STAB Done. Add the section. Screenshot here: https://drive.google.com/open?id=1hu1tRe2fFwrb_CIkce_g7sflI7FVZTLk -- To view, visit http://gerrit.cloudera.org:8080/14776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 Gerrit-Change-Number: 14776 Gerrit-PatchSet: 7 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 11 Dec 2019 23:18:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Hello Andrew Sherman, Lars Volker, David Rorke, Balazs Jeszenszky, Jiawei Wang, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14776 to look at the new patch set (#7). Change subject: IMPALA-7550: Add documentation to profile counters .. IMPALA-7550: Add documentation to profile counters This change changes the way developers define profile counters by generating counters from a counter registry. All the profile counters will be register there first and then used in the same way as before. By doing so, we will be able to manage profile counters in a way that we can define profile counters documentation. For example: Declaration: PROFILE_DEFINE_COUNTER(NumBackends, STABLE_HIGH, TUnit::UNIT, "Number of backends running this query."); Initialization: COUNTER_SET(PROFILE_NumBackends.Instantiate(query_profile_), num_backends); This shall be how we define a NumBackends counter. It follows with its significance, type, description in the declaration part. Users now will be able to view profile counters documentation under query_profile page, there is a Profile Documentation button which leads to /profile_docs. More details: This commit did the following refactors on profile counters. 1. Add a singleton registry for runtime profile counters prototypes, similiar to what Kudu does for metrics. This allows us to generate profile documentation for all counters from the code. We add /profile_docs and a correspoding UI for the documentation of profile counters. 2. Profile counters are also annotated with their significance to users. * STABLE_HIGH - High level and stable counters, always useful on measuring query performance and status. Counters that everyone is interested. should rarely change and if it does we will make some effort to notify users. * STABLE_LOW - Low level and stable counters. Interesting counters to monitor and analyze by machine. It will probably be interesting under some circumstance for users. Lots of developers are interested. * Unstable - Unstable but useful. Useful to understand query performance, but subject to change, particularly if the implementation changes. E.g. MaterializeTupleTimer * Debug - Debugging counters. Generally not useful to users of Impala, the main use case is low-level debugging. Can be hidden to reduce noise for most consumers of profiles. 3. We have around 250 counters. This commit did the replacement in scan-node and hdfs-scan-node-base and coordinator. Concers: The downside is that we will have duplicate comments of query profiles both in the header file and the .cc file. Additionally a (arguably good) limitation is that profile counter names need to be unique. Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 --- M be/src/exec/hbase-scan-node.cc M be/src/exec/hbase-scan-node.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/kudu-scan-node-base.cc M be/src/exec/scan-node.cc M be/src/exec/scan-node.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator.cc D be/src/util/debug-counters.h M be/src/util/default-path-handlers.cc M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc A www/profile_docs.tmpl M www/query_profile.tmpl 15 files changed, 598 insertions(+), 273 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/14776/7 -- To view, visit http://gerrit.cloudera.org:8080/14776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 Gerrit-Change-Number: 14776 Gerrit-PatchSet: 7 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14776 ) Change subject: IMPALA-7550: Add documentation to profile counters .. Patch Set 7: (7 comments) Hi Tim Thanks for the feedback. Have adopted the changes. http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@79 PS5, Line 79: d collection items read by the scan. Only created f > Yeah you should remove this, it's only included in scan nodes where its rel Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@85 PS5, Line 85: average nu > maybe "each type of remote filesystem"? Another question "Is HDFS remote read overall a single disk queue, or individual remote read targets? Same for S3" http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@88 PS5, Line 88: ty of the sys > I think I wrote the original text and it's hard to quantity - it's really t Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@121 PS5, Line 121: time waiting for I/O instead of " : "processing data. Note that this includes the > traditional here means MT_DOP = 0. I guess this is kinda an implementation Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/scan-node.cc File be/src/exec/scan-node.cc: http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/scan-node.cc@58 PS5, Line 58: COUNTER(RowsRead, STABLE_HIGH, TUnit::UNIT, "Number of top-level " : "rows/tuples read fr > Yeah you could drop the "only implemented" sentence, since it won't appear Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/runtime/coordinator.cc@97 PS5, Line 97: OUBLE_VALUE, : "The ratio between TotalScanByteSent and TotalBytesRead, i.e > Yeah I think it's comparing apples and oranges anyway, since one is on-disk Remove the sentence "i.e" for now. Maybe Lars can give some thoughts. http://gerrit.cloudera.org:8080/#/c/14776/5/www/query_profile.tmpl File www/query_profile.tmpl: http://gerrit.cloudera.org:8080/#/c/14776/5/www/query_profile.tmpl@40 PS5, Line 40: iv> : : {{profile}} : :
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14776 ) Change subject: IMPALA-7550: Add documentation to profile counters .. Patch Set 6: (27 comments) Hi Balazs, Thanks so much for help correct the profile counters descriptions! That's really helpful! Agree that we need to keep code out of stable counters descriptions. Unfortunately, I only have so little knowledge on the scan counters so that a lot of counters I am not sure how to fix the description. Ask @Tim Armstrong and @Lars to help take a look on that. Also, if you have strong options how to update the counters. Please feel free to let me know. Have already applied a few suggestions you made. Thanks Jiawei http://gerrit.cloudera.org:8080/#/c/14776/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14776/5//COMMIT_MSG@37 PS5, Line 37: 2. Profile counters are also annotated with their significance to users. : * STABLE_HIGH - High level and stable counters, always useful on measuring : query performance and status. Counters that everyone is interested. should : rarely change and if it does we will make some effort to notify users. : : * STABLE_LOW - Low level and stable counters. Interesting counters to monitor : and analyze by machine. It will probably be interesting under some : circumstance for users. Lots of developers are interested. : : * Unstable - Unstable but useful. Useful to understand query performance, : but subject to change, particularly if the implementation changes. : E.g. MaterializeTupleTimer : : * Debug - Debugging counters. Generally not useful to users of Impala, : the main use case is low-level debugging. Can be hidden to reduce noise : for most consumers of profiles. : : 3. We have around 250 counters. This commit did the replacement in : scan-node and hdfs-scan-node-base and coordinator. : > Please simplify this to what's in the code - I think the descriptions in Si Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@70 PS5, Line 70: across all Disk I/O threads in HDFS read operations."); : PROFILE_DEFINE_TIMER(TotalRawHdfsOpenFileTime, STABLE_LOW, "The tota > Instead of including an explanation in individual description (but not in e Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@79 PS5, Line 79: lectionItemsRead, STABLE_LOW, TUnit::UNIT, > Is this included in profiles where it's irrelevant? If no, remove this to a Not sure about it. @Lars and @Tarmstrong maybe can help to take a look? http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@85 PS5, Line 85: _DEFINE_SA > Maybe '...remote data source...'? Disk queue is not a term users will be fa @Lars and @Tarmstrong http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@88 PS5, Line 88: ty of the sys > Than what? - Is there a point of comparison we can provide? For example, 'v Not sure... if someone have better knowledge than this might want to fix it... http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@89 PS5, Line 89: the s > scan Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@90 PS5, Line 90: it::BYTES, > I don't think we should speculate on root causes - there can be many reason Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@121 PS5, Line 121: the scanner thread was ready to process " : "the data. High values show that scanner threa > Isn't this the same as 'HDFS scans'? IDK to be honest... @Tarmstrong and @Lars http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@125 PS5, Line 125: pressedBytesReadPerColumn, STABLE_LOW, : TUnit::BYTES, "Stats a > replace: Note that this includes the time when Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@130 PS5, Line 130: , "Stats about the number of compressed bytes read per column. " : "Each sample in the counter is the size of a single column that is scanned by the " : "scan node."); > remove? Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@136 PS5, Line 136: of data cache partially hit"); : PROFILE_DEFINE_COUNTER(DataCacheMissCount, STABLE_HIGH, TUnit::UNIT, : "Total count of data > remove? Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/scan-node.cc File be/src/exec/scan-node.cc: ht
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Hello Andrew Sherman, Lars Volker, David Rorke, Balazs Jeszenszky, Jiawei Wang, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14776 to look at the new patch set (#6). Change subject: IMPALA-7550: Add documentation to profile counters .. IMPALA-7550: Add documentation to profile counters This change changes the way developers define profile counters by generating counters from a counter registry. All the profile counters will be register there first and then used in the same way as before. By doing so, we will be able to manage profile counters in a way that we can define profile counters documentation. For example: Declaration: PROFILE_DEFINE_COUNTER(NumBackends, STABLE_HIGH, TUnit::UNIT, "Number of backends running this query."); Initialization: COUNTER_SET(PROFILE_NumBackends.Instantiate(query_profile_), num_backends); This shall be how we define a NumBackends counter. It follows with its significance, type, description in the declaration part. Users now will be able to view profile counters documentation under query_profile page, there is a Profile Documentation button which leads to /profile_docs. More details: This commit did the following refactors on profile counters. 1. Add a singleton registry for runtime profile counters prototypes, similiar to what Kudu does for metrics. This allows us to generate profile documentation for all counters from the code. We add /profile_docs and a correspoding UI for the documentation of profile counters. 2. Profile counters are also annotated with their significance to users. * STABLE_HIGH - High level and stable counters, always useful on measuring query performance and status. Counters that everyone is interested. should rarely change and if it does we will make some effort to notify users. * STABLE_LOW - Low level and stable counters. Interesting counters to monitor and analyze by machine. It will probably be interesting under some circumstance for users. Lots of developers are interested. * Unstable - Unstable but useful. Useful to understand query performance, but subject to change, particularly if the implementation changes. E.g. MaterializeTupleTimer * Debug - Debugging counters. Generally not useful to users of Impala, the main use case is low-level debugging. Can be hidden to reduce noise for most consumers of profiles. 3. We have around 250 counters. This commit did the replacement in scan-node and hdfs-scan-node-base and coordinator. Concers: The downside is that we will have duplicate comments of query profiles both in the header file and the .cc file. Additionally a (arguably good) limitation is that profile counter names need to be unique. Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 --- M be/src/exec/hbase-scan-node.cc M be/src/exec/hbase-scan-node.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/kudu-scan-node-base.cc M be/src/exec/scan-node.cc M be/src/exec/scan-node.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator.cc D be/src/util/debug-counters.h M be/src/util/default-path-handlers.cc M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc A www/profile_docs.tmpl M www/queries.tmpl M www/query_profile.tmpl 16 files changed, 605 insertions(+), 273 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/14776/6 -- To view, visit http://gerrit.cloudera.org:8080/14776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 Gerrit-Change-Number: 14776 Gerrit-PatchSet: 6 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Hello Andrew Sherman, Lars Volker, David Rorke, Balazs Jeszenszky, Jiawei Wang, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14776 to look at the new patch set (#5). Change subject: IMPALA-7550: Add documentation to profile counters .. IMPALA-7550: Add documentation to profile counters This change changes the way developers define profile counters by generating counters from a counter registry. All the profile counters will be register there first and then used in the same way as before. By doing so, we will be able to manage profile counters in a way that we can define profile counters documentation. For example: Declaration: PROFILE_DEFINE_COUNTER(NumBackends, STABLE_HIGH, TUnit::UNIT, "Number of backends running this query."); Initialization: COUNTER_SET(PROFILE_NumBackends.Instantiate(query_profile_), num_backends); This shall be how we define a NumBackends counter. It follows with its significance, type, description in the declaration part. Users now will be able to view profile counters documentation under query_profile page, there is a Profile Documentation button which leads to /profile_docs. More details: This commit did the following refactors on profile counters. 1. Add a singleton registry for runtime profile counters prototypes, similiar to what Kudu does for metrics. This allows us to generate profile documentation for all counters from the code. We add /profile_docs and a correspoding UI for the documentation of profile counters. 2. Profile counters are annotated with their stability: * Stable counters - generally useful to understand query performance, should only change rarely and if it does we'll make some effort to notify users. E.g. BytesRead. * Unstable but useful - useful to understand query performance, but subject to change, particularly if the implementation changes. E.g. RowBatchQueuePutWaitTime, MaterializeTupleTimer * Debugging counters - generally not useful to users of Impala, the main use case is low-level debugging. Can be hidden to reduce noise for most consumers of profiles. 3. Profile counters are also annotated with their significance to users. * Critical level counters - always useful on measuring query performance and status. Counters that everyone are interested. * High level counters - generally interesting counters. Most of the users will be interested and all the developers are very interested. * Medium level counters - somehow interesting counters to monitor. It will probably be interesting under some circumstance. Lot of developers are interested. * Low level counters - not interesting to users. Should be useful for developers to debug only. 4. We have around 250 counters. This commit did the replacement in scan-node and hdfs-scan-node-base and coordinator. Concers: The downside is that we will have duplicate comments of query profiles both in the header file and the .cc file. Additionally a (arguably good) limitation is that profile counter names need to be unique. Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 --- M be/src/exec/hbase-scan-node.cc M be/src/exec/hbase-scan-node.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/kudu-scan-node-base.cc M be/src/exec/scan-node.cc M be/src/exec/scan-node.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator.cc D be/src/util/debug-counters.h M be/src/util/default-path-handlers.cc M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc A www/profile_docs.tmpl M www/query_profile.tmpl 15 files changed, 616 insertions(+), 273 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/14776/5 -- To view, visit http://gerrit.cloudera.org:8080/14776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 Gerrit-Change-Number: 14776 Gerrit-PatchSet: 5 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14776 ) Change subject: IMPALA-7550: Add documentation to profile counters .. Patch Set 4: (1 comment) Thanks for the advice! http://gerrit.cloudera.org:8080/#/c/14776/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14776/3//COMMIT_MSG@10 PS3, Line 10: generating counters from a counter registry. All the profile counters > Sorry I wrote comment on wrong review... Thanks for the suggestions. I edit the commit message. Hopefully it will be more clear to you... I will keep editing it in the following commits. -- To view, visit http://gerrit.cloudera.org:8080/14776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 Gerrit-Change-Number: 14776 Gerrit-PatchSet: 4 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 04 Dec 2019 22:58:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Hello Andrew Sherman, Lars Volker, David Rorke, Balazs Jeszenszky, Jiawei Wang, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14776 to look at the new patch set (#4). Change subject: IMPALA-7550: Add documentation to profile counters .. IMPALA-7550: Add documentation to profile counters This change changes the way developers define profile counters by generating counters from a counter registry. All the profile counters will be register there first and then used in the same way as before. By doing so, we will be able to manage profile counters in a way that we can define profile counters documentation. For example: Declaration: PROFILE_DEFINE_COUNTER(NumBackends, STABLE_HIGH, TUnit::UNIT, "Number of backends running this query."); Initialization: COUNTER_SET(PROFILE_NumBackends.Instantiate(query_profile_), num_backends); This shall be how we define a NumBackends counter. It follows with its significance, type, description in the declaration part. Users now will be able to view profile counters documentation under query_profile page, there is a Profile Documentation button which leads to /profile_docs. More details: This commit did the following refactors on profile counters. 1. Add a singleton registry for runtime profile counters prototypes, similiar to what Kudu does for metrics. This allows us to generate profile documentation for all counters from the code. We add /profile_docs and a correspoding UI for the documentation of profile counters. 2. Profile counters are annotated with their stability: * Stable counters - generally useful to understand query performance, should only change rarely and if it does we'll make some effort to notify users. E.g. BytesRead. * Unstable but useful - useful to understand query performance, but subject to change, particularly if the implementation changes. E.g. RowBatchQueuePutWaitTime, MaterializeTupleTimer * Debugging counters - generally not useful to users of Impala, the main use case is low-level debugging. Can be hidden to reduce noise for most consumers of profiles. 3. Profile counters are also annotated with their significance to users. * Critical level counters - always useful on measuring query performance and status. Counters that everyone are interested. * High level counters - generally interesting counters. Most of the users will be interested and all the developers are very interested. * Medium level counters - somehow interesting counters to monitor. It will probably be interesting under some circumstance. Lot of developers are interested. * Low level counters - not interesting to users. Should be useful for developers to debug only. 4. We have around 250 counters. This commit did the replacement in scan-node and hdfs-scan-node-base and coordinator. Concers: The downside is that we will have duplicate comments of query profiles both in the header file and the .cc file. Additionally a (arguably good) limitation is that profile counter names need to be unique. Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 --- M be/src/exec/hbase-scan-node.cc M be/src/exec/hbase-scan-node.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/kudu-scan-node-base.cc M be/src/exec/scan-node.cc M be/src/exec/scan-node.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator.cc D be/src/util/debug-counters.h M be/src/util/default-path-handlers.cc M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc A www/profile_docs.tmpl M www/query_profile.tmpl 15 files changed, 619 insertions(+), 273 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/14776/4 -- To view, visit http://gerrit.cloudera.org:8080/14776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 Gerrit-Change-Number: 14776 Gerrit-PatchSet: 4 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Jiawei Wang has abandoned this change. ( http://gerrit.cloudera.org:8080/14837 ) Change subject: IMPALA-7550: Add documentation to profile counters .. Abandoned -- To view, visit http://gerrit.cloudera.org:8080/14837 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I72d2380eff95b8d982ce23b9f0252810b495f355 Gerrit-Change-Number: 14837 Gerrit-PatchSet: 1 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14776 ) Change subject: IMPALA-7550: Add documentation to profile counters .. Patch Set 2: (10 comments) Hi Tim, thanks for the advice. I recently updated my Github Username and that lead the Gerrit not able to recognize my email and I dont find a way to correct that... So I have to create a new PR using another email. https://gerrit.cloudera.org/#/c/14837/ The code is here... Sorry for the inconvenience. http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/exec/hbase-scan-node.cc File be/src/exec/hbase-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/exec/hbase-scan-node.cc@39 PS2, Line 39: PROFILE_DEFINE_TIMER(TotalRawHBaseReadTime, DEBUG, > I think this could be a stable counter (I can see either low or high level) Done http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/exec/scan-node.h File be/src/exec/scan-node.h: http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/exec/scan-node.h@35 PS2, Line 35: /// Includes ScanNode common counters: > I think it's probably useful to keep these comments around for the time bei Done. I am adding all the comments back. http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/exec/scan-node.cc File be/src/exec/scan-node.cc: http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/exec/scan-node.cc@56 PS2, Line 56: STABLE_LOW > I think this should be STABLE_HIGH, I think the number of bytes read is gen Done http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/exec/scan-node.cc@72 PS2, Line 72: PROFILE_DEFINE_COUNTER(NumScannerThreadsStarted, STABLE_LOW, TUnit::UNIT, > I would mark this as UNSTABLE or DEBUG - it's a pretty useless counter in p Done http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/exec/scan-node.cc@92 PS2, Line 92: DEBUG > STABLE_LOW. This and PeakScannerThreadConcurrency are pretty useful for und Done http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/exec/scan-node.cc@95 PS2, Line 95: DEBUG > STABLE_LOW Done http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/runtime/coordinator-backend-state.cc@61 PS2, Line 61: // TODO: the counters are inside the impala namespace, so we need to declare them there, > Need to remove TODO before merging. I think it makes sense just to switch f Sure, I will remove it. I was just thinking leave it so that the following up tasks can be done in the same way. http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/runtime/coordinator.cc@80 PS2, Line 80: STABLE_HIGH > Probably STABLE_LOW, this is a little tricky to interpret Done http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/util/default-path-handlers.cc File be/src/util/default-path-handlers.cc: http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/util/default-path-handlers.cc@130 PS2, Line 130: // name, description, unit > Comment needs updating? Done http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/14776/2/be/src/util/runtime-profile.cc@68 PS2, Line 68: boost::lock_guard l(lock_); > nit: don't need to use boost:: and std:: prefixes for lock_guard and vector Done -- To view, visit http://gerrit.cloudera.org:8080/14776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 Gerrit-Change-Number: 14776 Gerrit-PatchSet: 2 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 04 Dec 2019 22:22:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Jiawei Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14837 Change subject: IMPALA-7550: Add documentation to profile counters .. IMPALA-7550: Add documentation to profile counters This is the work based on Lars' experiment on https://gerrit.cloudera.org/#/c/12116/ This commit did the following refactors on profile counters. 1. Add a singleton registry for runtime profile counters prototypes, similiar to what Kudu does for metrics. This allows us to generate profile documentation for all counters from the code. We add /profile_docs and a correspoding UI for the documentation of profile counters. 2. Profile counters are annotated with their stability: * Stable_HIGH counters - High level and stable counters. Always useful on measuring query performance and status. Counters that everyone is interested. Should rarely change and if it does we will make some effort to notify users. * Stable_LOW - Low level and sable counters - interesting counters to monitor and analyze by machine. It will probably be interesting under some circumstance for users. Lots of developers are interested. * Unstable but useful - useful to understand query performance, but subject to change, particularly if the implementation changes. E.g. RowBatchQueuePutWaitTime, MaterializeTupleTimer * Debugging counters - generally not useful to users of Impala, the main use case is low-level debugging. Can be hidden to reduce noise for most consumers of profiles. 3. We have around 250 counters. This commit did the replacement in scan-node and hdfs-scan-node-base and coordinator. The downside is that we'd reduce the comments that currently explain some of the counters in header files by moving them to the .cc files. Additionally a (arguably good) limitation is that profile counter names need to be unique. Change-Id: I72d2380eff95b8d982ce23b9f0252810b495f355 --- M be/src/exec/hbase-scan-node.cc M be/src/exec/hbase-scan-node.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/kudu-scan-node-base.cc M be/src/exec/scan-node.cc M be/src/exec/scan-node.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator.cc D be/src/util/debug-counters.h M be/src/util/default-path-handlers.cc M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc A www/profile_docs.tmpl M www/query_profile.tmpl 15 files changed, 619 insertions(+), 273 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/14837/1 -- To view, visit http://gerrit.cloudera.org:8080/14837 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I72d2380eff95b8d982ce23b9f0252810b495f355 Gerrit-Change-Number: 14837 Gerrit-PatchSet: 1 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Jiawei Wang
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14776 ) Change subject: IMPALA-7550: Add documentation to profile counters .. Patch Set 2: (9 comments) Hi Tim, Thanks for the advice. The second round combined stability with significance. Now counters have 4 types which are STABLE_HIGH, STABLE_LOW, UNSTABLE, DEBUG Also, the second round I replaced all counters in scan-node.cc One thing I am not sure is that whether we should remove the previous comments on all the counters? Please let me know if you have other thoughts. http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.h@525 PS1, Line 525: /// Disk accessed bitmap > These are weird cause they're not actually included in the profile directly Done http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.cc@127 PS1, Line 127: PROFILE_DEFINE_SUMMARY_STATS_COUNTER(ParquetUncompressedBytesReadPerColumn, STABLE_LOW, > I'm OK with the redundant text, but it might be reasonable to use a templat hmm, I will just leave it here then because we only have two counters and the contents are subject to change. http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/scan-node.cc File be/src/exec/scan-node.cc: http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/scan-node.cc@82 PS1, Line 82: PROFILE_DEFINE_TIMER(RowBatchQueueGetWaitTime, UNSTABLE, "Wall clock time that the " > We could probably define an enum or something like that. I'd be ok with jus I will just leave it in the description. http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/scan-node.cc@95 PS1, Line 95: PROFILE_DEFINE_HIGH_WATER_MARK_COUNTER(PeakScannerThreadConcurrency, DEBUG, > I think doing this makes sense - that would force all counters to be define Done http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/runtime/coordinator-backend-state.cc@60 PS1, Line 60: > This approach of converting the namespaces where needed seems fine to me. T Okay, will adopt the second approach whenever needed. http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h@154 PS1, Line 154: // Debugging counters - generally not useful to users of Impala, the main use case is > nit: formatting is weird, I'd expect the enum values to only be indented tw Done http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h@164 PS1, Line 164: const char* name() const { return name_; } > I think we should probably combine them, because they're pretty correlated Okay, I will split stable into high_stable and low_stable and keep unstable and debug the same. http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h@189 PS1, Line 189: /// prototypes will register with the singleton instance of this class. Then, this class > Are you planning to do this? I dont think so, unless you feel like we will need this. Otherwise I will just delete this http://gerrit.cloudera.org:8080/#/c/14776/1/www/profile_docs.tmpl File www/profile_docs.tmpl: http://gerrit.cloudera.org:8080/#/c/14776/1/www/profile_docs.tmpl@1 PS1, Line 1:
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Jiawei Wang has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/14776 ) Change subject: IMPALA-7550: Add documentation to profile counters .. IMPALA-7550: Add documentation to profile counters This is the work based on Lars' experiment on https://gerrit.cloudera.org/#/c/12116/ This commit did the following refactors on profile counters. 1. Add a singleton registry for runtime profile counters prototypes, similiar to what Kudu does for metrics. This allows us to generate profile documentation for all counters from the code. We add /profile_docs and a correspoding UI for the documentation of profile counters. 2. Profile counters are annotated with their stability: * Stable_HIGH counters - High level and stable counters. Always useful on measuring query performance and status. Counters that everyone is interested. Should rarely change and if it does we will make some effort to notify users. * Stable_LOW - Low level and sable counters - interesting counters to monitor and analyze by machine. It will probably be interesting under some circumstance for users. Lots of developers are interested. * Unstable but useful - useful to understand query performance, but subject to change, particularly if the implementation changes. E.g. RowBatchQueuePutWaitTime, MaterializeTupleTimer * Debugging counters - generally not useful to users of Impala, the main use case is low-level debugging. Can be hidden to reduce noise for most consumers of profiles. 3. We have around 250 counters. This commit did the replacement in scan-node and hdfs-scan-node-base and coordinator. This is still a WIP work and all the advices are welcomed! The downside is that we'd reduce the comments that currently explain some of the counters in header files by moving them to the .cc files. Additionally a (arguably good) limitation is that profile counter names need to be unique. Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 --- M be/src/exec/hbase-scan-node.cc M be/src/exec/hbase-scan-node.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/kudu-scan-node-base.cc M be/src/exec/scan-node.cc M be/src/exec/scan-node.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator.cc D be/src/util/debug-counters.h M be/src/util/default-path-handlers.cc M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc A www/profile_docs.tmpl M www/query_profile.tmpl 15 files changed, 620 insertions(+), 292 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/14776/2 -- To view, visit http://gerrit.cloudera.org:8080/14776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 Gerrit-Change-Number: 14776 Gerrit-PatchSet: 2 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14611 ) Change subject: IMPALA-9110: Add table loading time break-down metrics for HdfsTable .. Patch Set 9: rebase the commit to solve merge conflict. -- To view, visit http://gerrit.cloudera.org:8080/14611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 Gerrit-Change-Number: 14611 Gerrit-PatchSet: 9 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 22 Nov 2019 05:37:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable
Hello Quanlong Huang, Yongzhi Chen, Vihang Karajgaonkar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14611 to look at the new patch set (#9). Change subject: IMPALA-9110: Add table loading time break-down metrics for HdfsTable .. IMPALA-9110: Add table loading time break-down metrics for HdfsTable A. Problem: Catalog table loading currently only records the total loading time. We will need some break-down times, i.e. more detailed time recording on each loading function. Also, the table schema loading is not taken into account for load-duration. We will need to add some more metrics for that. B. Solution: - We added "hms-load-tbl-schema", "load-duration.all-column-stats", "load-duration.all-partitions.total-time", "load-duration.all-partitions.file-metadata". Also, we logged the loadValidWriteIdList() time. So now we have a more detailed breakdown time for table loading info. The table loading time metrics for HDFS tables are in the following hierarchy: - Table Schema Loading - Table Metadata Loading - total time - all column stats loading time - ValidWriteIds loading time - all partitions loading time - total time - file metadata loading time - storage-metadata-loading-time(standalone metric) 1. Table Schema Loading: * Meaning: The time for HMS to fetch table object and the real schema loading time. Normally, the code path is "msClient.getHiveClient().getTable(dbName, tblName)" * Metric : hms-load-tbl-schema 2. Table Metadata Loading -- total time * Meaning: The time to load all the table metadata. The code path is load() function in HdfsTable.load() function. * Metric: load-duration.total-time 2.1 Table Metadata Loading -- all column stats * Meaning: load all column stats, this is part of table metadata loading The code path is HdfsTable.loadAllColumnStats() * Metric: load-duration.all-column-stats 2.2 Table Metadata Loading -- loadValidWriteIdList * Meaning: fetch ValidWriteIds from HMS The code path is HdfsTable.loadValidWriteIdList() * Metric: no metric recorded for this one. Instead, a debug log is generated. 2.3 Table Metadata Loading -- storage metadata loading(standalone metric) * Meaning: Storage related to file system operations during metadata loading.(The amount of time spent loading metadata from the underlying storage layer.) * Metric: we rename it to load-duration.storage-metadata. This is a metric introduced by IMPALA-7322 2.4 Table Metadata Loading -- load all partitions * Meaning: Load all partitions time, including fetching all partitions from HMS and loading all partitions. The code path is MetaStoreUtil.fetchAllPartitions() and HdfsTable.loadAllPartitions() * Metric: load-duration.all-partitions 2.4.1 Table Metadata Loading -- load all partitions -- load file metadata * Meaning: The file metadata loading for all all partitions. (This is part of 2.4). Code path: loadFileMetadataForPartitions() inside loadAllPartitions() * Metric: load-duration.all-partitions.file-metadata C. Extra thing in this commit: 1. Add PrintUtils.printTimeNs for PrettyPrint time in FrontEnd 2. Add explanation for table loading manager D. Test: 1. Add Unit tests for PrintUtils.printTime() function 2. Manual describe table and verify the table loading metrics are correct. Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java M fe/src/main/java/org/apache/impala/common/PrintUtils.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/util/PrintUtilsTest.java 10 files changed, 186 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14611/9 -- To view, visit http://gerrit.cloudera.org:8080/14611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 Gerrit-Change-Number: 14611 Gerrit-PatchSet: 9 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14776 ) Change subject: IMPALA-7550: Add documentation to profile counters .. Patch Set 1: (6 comments) Hi Tim, Lars, other Impala team members This is the first edition of impala profile counters that I had based on Lars' previous work on https://gerrit.cloudera.org/#/c/12116/ Thanks so much for Lars building the basis. I had several comments on the commit itself. Any suggestions on how to improve it is highly welcomed! Thanks so much! http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.h@525 PS1, Line 525: /// Disk accessed bitmap I am not too sure about the following two counters. Wondering if we should just leave it like it is. http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.cc@127 PS1, Line 127: PROFILE_DEFINE_SUMMARY_STATS_COUNTER(ParquetUncompressedBytesReadPerColumn, STABLE, This seems redundant, any advice to modify it will be appreciated. http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.cc@139 PS1, Line 139: PROFILE_DEFINE_COUNTER(DataCacheHitCount, STABLE, HIGH, TUnit::UNIT, Did not find the explanation for the following data cache counters. Might need some other better explanation. http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/scan-node.cc File be/src/exec/scan-node.cc: http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/scan-node.cc@82 PS1, Line 82: PROFILE_DEFINE_TIMER(RowBatchQueueGetWaitTime, UNSTABLE, LOW, "Wall clock time that the " Hi, Tim I believe these are the Timer you mentioned in the threads. Any suggestions on how should we handle it other than put it in the descriptions? http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h@164 PS1, Line 164: enum class Significance { I add this Significance to measure if the counter is important to a user. Maybe we can also combine this with stability or redefine the stability? I am not too sure about this. http://gerrit.cloudera.org:8080/#/c/14776/1/www/profile_docs.tmpl File www/profile_docs.tmpl: http://gerrit.cloudera.org:8080/#/c/14776/1/www/profile_docs.tmpl@1 PS1, Line 1:
[Impala-ASF-CR] IMPALA-7550: Add documentation to profile counters
Jiawei Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14776 Change subject: IMPALA-7550: Add documentation to profile counters .. IMPALA-7550: Add documentation to profile counters This is the work based on Lars' experiment on https://gerrit.cloudera.org/#/c/12116/ This commit did the following refactors on profile counters. 1. Add a singleton registry for runtime profile counters prototypes, similiar to what Kudu does for metrics. This allows us to generate profile documentation for all counters from the code. We add /profile_docs and a correspoding UI for the documentation of profile counters. 2. Profile counters are annotated with their stability: * Stable counters - generally useful to understand query performance, should only change rarely and if it does we'll make some effort to notify users. E.g. BytesRead. * Unstable but useful - useful to understand query performance, but subject to change, particularly if the implementation changes. E.g. RowBatchQueuePutWaitTime, MaterializeTupleTimer * Debugging counters - generally not useful to users of Impala, the main use case is low-level debugging. Can be hidden to reduce noise for most consumers of profiles. 3. Profile counters are also annotated with their significance to users. * Critical level counters - always useful on measuring query performance and status. Counters that everyone are interested. * High level counters - generally interesting counters. Most of the users will be interested and all the developers are very interested. * Medium level counters - somehow interesting counters to monitor. It will probably be interesting under some circumstance. Lot of developers are interested. * Low level counters - not interesting to users. Should be useful for developers to debug only. 4. We have around 250 counters. This commit did the replacement in scan-node and hdfs-scan-node-base and coordinator. This is still a WIP work and all the advices are welcomed! The downside is that we'd reduce the comments that currently explain some of the counters in header files by moving them to the .cc files. Additionally a (arguably good) limitation is that profile counter names need to be unique. Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 --- M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/scan-node.cc M be/src/exec/scan-node.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h D be/src/util/debug-counters.h M be/src/util/default-path-handlers.cc M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc A www/profile_docs.tmpl 12 files changed, 671 insertions(+), 296 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/14776/1 -- To view, visit http://gerrit.cloudera.org:8080/14776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 Gerrit-Change-Number: 14776 Gerrit-PatchSet: 1 Gerrit-Owner: Jiawei Wang
[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable
Hello Quanlong Huang, Yongzhi Chen, Vihang Karajgaonkar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14611 to look at the new patch set (#7). Change subject: IMPALA-9110: Add table loading time break-down metrics for HdfsTable .. IMPALA-9110: Add table loading time break-down metrics for HdfsTable A. Problem: Catalog table loading currently only records the total loading time. We will need some break-down times, i.e. more detailed time recording on each loading function. Also, the table schema loading is not taken into account for load-duration. We will need to add some more metrics for that. B. Solution: - We added "hms-load-tbl-schema", "load-duration.all-column-stats", "load-duration.all-partitions.total-time", "load-duration.all-partitions.file-metadata". Also, we logged the loadValidWriteIdList() time. So now we have a more detailed breakdown time for table loading info. The table loading time metrics for HDFS tables are in the following hierarchy: - Table Schema Loading - Table Metadata Loading - total time - all column stats loading time - ValidWriteIds loading time - all partitions loading time - total time - file metadata loading time - storage-metadata-loading-time(standalone metric) 1. Table Schema Loading: * Meaning: The time for HMS to fetch table object and the real schema loading time. Normally, the code path is "msClient.getHiveClient().getTable(dbName, tblName)" * Metric : hms-load-tbl-schema 2. Table Metadata Loading -- total time * Meaning: The time to load all the table metadata. The code path is load() function in HdfsTable.load() function. * Metric: load-duration.total-time 2.1 Table Metadata Loading -- all column stats * Meaning: load all column stats, this is part of table metadata loading The code path is HdfsTable.loadAllColumnStats() * Metric: load-duration.all-column-stats 2.2 Table Metadata Loading -- loadValidWriteIdList * Meaning: fetch ValidWriteIds from HMS The code path is HdfsTable.loadValidWriteIdList() * Metric: no metric recorded for this one. Instead, a debug log is generated. 2.3 Table Metadata Loading -- storage metadata loading(standalone metric) * Meaning: Storage related to file system operations during metadata loading.(The amount of time spent loading metadata from the underlying storage layer.) * Metric: we rename it to load-duration.storage-metadata. This is a metric introduced by IMPALA-7322 2.4 Table Metadata Loading -- load all partitions * Meaning: Load all partitions time, including fetching all partitions from HMS and loading all partitions. The code path is MetaStoreUtil.fetchAllPartitions() and HdfsTable.loadAllPartitions() * Metric: load-duration.all-partitions 2.4.1 Table Metadata Loading -- load all partitions -- load file metadata * Meaning: The file metadata loading for all all partitions. (This is part of 2.4). Code path: loadFileMetadataForPartitions() inside loadAllPartitions() * Metric: load-duration.all-partitions.file-metadata C. Extra thing in this commit: 1. Add PrintUtils.printTimeNs for PrettyPrint time in FrontEnd 2. Add explanation for table loading manager D. Test: 1. Add Unit tests for PrintUtils.printTime() function 2. Manual describe table and verify the table loading metrics are correct. Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java M fe/src/main/java/org/apache/impala/common/PrintUtils.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/util/PrintUtilsTest.java 10 files changed, 186 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14611/7 -- To view, visit http://gerrit.cloudera.org:8080/14611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 Gerrit-Change-Number: 14611 Gerrit-PatchSet: 7 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14611 ) Change subject: IMPALA-9110: Add table loading time break-down metrics for HdfsTable .. Patch Set 7: (3 comments) Thanks for the review. Let me know if we need add anything else in this commit. http://gerrit.cloudera.org:8080/#/c/14611/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/14611/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@163 PS6, Line 163: load-duration.all-partitions"; > I understand that the naming is used to represent nesting of the metrics. S Done http://gerrit.cloudera.org:8080/#/c/14611/6/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/14611/6/fe/src/main/java/org/apache/impala/catalog/Table.java@149 PS6, Line 149: load-duration"; > Since this is the top level load-duration adding a specific --total-time is Done http://gerrit.cloudera.org:8080/#/c/14611/6/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java File fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java: http://gerrit.cloudera.org:8080/#/c/14611/6/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java@262 PS6, Line 262: There is a discussion here: https://issues.apache.org/jira/brows > May be just say (See discussion in ..) :) Done -- To view, visit http://gerrit.cloudera.org:8080/14611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 Gerrit-Change-Number: 14611 Gerrit-PatchSet: 7 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Thu, 21 Nov 2019 20:28:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable
Hello Quanlong Huang, Yongzhi Chen, Vihang Karajgaonkar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14611 to look at the new patch set (#6). Change subject: IMPALA-9110: Add table loading time break-down metrics for HdfsTable .. IMPALA-9110: Add table loading time break-down metrics for HdfsTable A. Problem: Catalog table loading currently only records the total loading time. We will need some break-down times, i.e. more detailed time recording on each loading function. Also, the table schema loading is not taken into account for load-duration. We will need to add some more metrics for that. B. Solution: - We added "hms-load-tbl-schema", "load-duration.all-column-stats", "load-duration.all-partitions.total-time", "load-duration.all-partitions.file-metadata". Also, we logged the loadValidWriteIdList() time. So now we have a more detailed breakdown time for table loading info. The table loading time metrics for HDFS tables are in the following hierarchy: - Table Schema Loading - Table Metadata Loading - total time - all column stats loading time - ValidWriteIds loading time - all partitions loading time - total time - file metadata loading time - storage-metadata-loading-time(standalone metric) 1. Table Schema Loading: * Meaning: The time for HMS to fetch table object and the real schema loading time. Normally, the code path is "msClient.getHiveClient().getTable(dbName, tblName)" * Metric : hms-load-tbl-schema 2. Table Metadata Loading -- total time * Meaning: The time to load all the table metadata. The code path is load() function in HdfsTable.load() function. * Metric: load-duration.total-time 2.1 Table Metadata Loading -- all column stats * Meaning: load all column stats, this is part of table metadata loading The code path is HdfsTable.loadAllColumnStats() * Metric: load-duration.all-column-stats 2.2 Table Metadata Loading -- loadValidWriteIdList * Meaning: fetch ValidWriteIds from HMS The code path is HdfsTable.loadValidWriteIdList() * Metric: no metric recorded for this one. Instead, a debug log is generated. 2.3 Table Metadata Loading -- storage metadata loading(standalone metric) * Meaning: Storage related to file system operations during metadata loading.(The amount of time spent loading metadata from the underlying storage layer.) * Metric: we rename it to load-duration.storage-metadata. This is a metric introduced by IMPALA-7322 2.4 Table Metadata Loading -- load all partitions * Meaning: Load all partitions time, including fetching all partitions from HMS and loading all partitions. The code path is MetaStoreUtil.fetchAllPartitions() and HdfsTable.loadAllPartitions() * Metric: load-duration.all-partitions 2.4.1 Table Metadata Loading -- load all partitions -- load file metadata * Meaning: The file metadata loading for all all partitions. (This is part of 2.4). Code path: loadFileMetadataForPartitions() inside loadAllPartitions() * Metric: load-duration.all-partitions.file-metadata C. Extra thing in this commit: 1. Add PrintUtils.printTimeNs for PrettyPrint time in FrontEnd 2. Add explanation for table loading manager D. Test: 1. Add Unit tests for PrintUtils.printTime() function 2. Manual describe table and verify the table loading metrics are correct. Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java M fe/src/main/java/org/apache/impala/common/PrintUtils.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/util/PrintUtilsTest.java 10 files changed, 188 insertions(+), 36 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14611/6 -- To view, visit http://gerrit.cloudera.org:8080/14611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 Gerrit-Change-Number: 14611 Gerrit-PatchSet: 6 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable
Hello Quanlong Huang, Yongzhi Chen, Vihang Karajgaonkar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14611 to look at the new patch set (#5). Change subject: IMPALA-9110: Add table loading time break-down metrics for HdfsTable .. IMPALA-9110: Add table loading time break-down metrics for HdfsTable Problem: Catalog table loading currently only records the total loading time. We will need some break-down times, i.e. more detailed time recording on each loading function. Also, the table schema loading is not taken into account for load-duration. We will need to add some more metrics for that. Solution: - We added "hms-load-tbl-schema", "load-duration-all-column-stats", "all-partitions-load-duration", "file-metadata-all-partitions-load-duration". Also, we logged the loadValidWriteIdList() time. So now we have a more detailed breakdown time for table loading info. The table loading time metrics for HDFS tables are in the following hierarchy: - Table Schema Loading - Table Metadata Loading - total time - all column stats loading time - ValidWriteIds loading time - all partitions loading time - total time - file metadata loading time - storage-metadata-loading-time(standalone metric) 1. Table Schema Loading: * Meaning: The time for HMS to fetch table object and the real schema loading time. Normally, the code path is "msClient.getHiveClient().getTable(dbName, tblName)" * Metric : hms-load-tbl-schema 2. Table Metadata Loading -- total time * Meaning: The time to load all the table metadata. The code path is load() function in HdfsTable.load() function. * Metric: load-duration 2.1 Table Metadata Loading -- all column stats * Meaning: load all column stats, this is part of table metadata loading The code path is HdfsTable.loadAllColumnStats() * Metric: all-column-stats-load-duration 2.2 Table Metadata Loading -- loadValidWriteIdList * Meaning: fetch ValidWriteIds from HMS The code path is HdfsTable.loadValidWriteIdList() * Metric: no metric recorded for this one. Instead, a debug log is generated. 2.3 Table Metadata Loading -- storage metadata loading(standalone metric) * Meaning: Storage related to file system operations during metadata loading.(The amount of time spent loading metadata from the underlying storage layer.) * Metric: storage-metadata-load-duration. This is a metric introduced by IMPALA-7322 2.4 Table Metadata Loading -- load all partitions * Meaning: Load all partitions time, including fetching all partitions from HMS and loading all partitions. The code path is MetaStoreUtil.fetchAllPartitions() and HdfsTable.loadAllPartitions() * Metric: all-partitions-load-duration 2.4.1 Table Metadata Loading -- load all partitions -- load file metadata * Meaning: The file metadata loading for all all partitions. (This is part of 2.4). Code path: loadFileMetadataForPartitions() inside loadAllPartitions() * Metric: file-metadata-all-partitions-load-duration - Extra function: Add PrintUtils.printTimeNs for PrettyPrint time in FrontEnd Test: 1. Will add metrics test after the few rounds of code review. 2. Add Unit tests for PrintUtils.printTime() function Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java M fe/src/main/java/org/apache/impala/common/PrintUtils.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/util/PrintUtilsTest.java 8 files changed, 153 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14611/5 -- To view, visit http://gerrit.cloudera.org:8080/14611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 Gerrit-Change-Number: 14611 Gerrit-PatchSet: 5 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable
Jiawei Wang has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/14611 ) Change subject: IMPALA-9110: Add table loading time break-down metrics for HdfsTable .. IMPALA-9110: Add table loading time break-down metrics for HdfsTable Problem: Catalog table loading currently only records the total loading time. We will need some break-down times, i.e. more detailed time recording on each loading function. Also, the table schema loading is not taken into account for load-duration. We will need to add some more metrics for that. Solution: - We added "hms-load-tbl-schema", "load-duration-all-column-stats", "all-partitions-load-duration", "file-metadata-all-partitions-load-duration". Also, we logged the loadValidWriteIdList() time. So now we have a more detailed breakdown time for table loading info. The table loading time metrics for HDFS tables are in the following hierarchy: - Table Schema Loading - Table Metadata Loading - total time - all column stats loading time - ValidWriteIds loading time - all partitions loading time - total time - file metadata loading time - storage-metadata-loading-time(standalone metric) 1. Table Schema Loading: * Meaning: The time for HMS to fetch table object and the real schema loading time. Normally, the code path is "msClient.getHiveClient().getTable(dbName, tblName)" * Metric : hms-load-tbl-schema 2. Table Metadata Loading -- total time * Meaning: The time to load all the table metadata. The code path is load() function in HdfsTable.load() function. * Metric: load-duration 2.1 Table Metadata Loading -- all column stats * Meaning: load all column stats, this is part of table metadata loading The code path is HdfsTable.loadAllColumnStats() * Metric: all-column-stats-load-duration 2.2 Table Metadata Loading -- loadValidWriteIdList * Meaning: fetch ValidWriteIds from HMS The code path is HdfsTable.loadValidWriteIdList() * Metric: no metric recorded for this one. Instead, a debug log is generated. 2.3 Table Metadata Loading -- storage metadata loading(standalone metric) * Meaning: Storage related to file system operations during metadata loading.(The amount of time spent loading metadata from the underlying storage layer.) * Metric: storage-metadata-load-duration. This is a metric introduced by IMPALA-7322 2.4 Table Metadata Loading -- load all partitions * Meaning: Load all partitions time, including fetching all partitions from HMS and loading all partitions. The code path is MetaStoreUtil.fetchAllPartitions() and HdfsTable.loadAllPartitions() * Metric: all-partitions-load-duration 2.4.1 Table Metadata Loading -- load all partitions -- load file metadata * Meaning: The file metadata loading for all all partitions. (This is part of 2.4). Code path: loadFileMetadataForPartitions() inside loadAllPartitions() * Metric: file-metadata-all-partitions-load-duration - Extra function: Add PrintUtils.printTimeNs for PrettyPrint time in FrontEnd Test: 1. Will add metrics test after the few rounds of code review. 2. Add Unit tests for PrintUtils.printTime() function Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java M fe/src/main/java/org/apache/impala/common/PrintUtils.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/util/PrintUtilsTest.java 8 files changed, 156 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14611/4 -- To view, visit http://gerrit.cloudera.org:8080/14611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 Gerrit-Change-Number: 14611 Gerrit-PatchSet: 4 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable
Jiawei Wang has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/14611 ) Change subject: IMPALA-9110: Add table loading time break-down metrics for HdfsTable .. IMPALA-9110: Add table loading time break-down metrics for HdfsTable Problem: Catalog table loading currently only records the total loading time. We will need some break-down times, i.e. more detailed time recording on each loading function. Also, the table schema loading is not taken into account for load-duration. We will need to add some more metrics for that. Solution: - We added "hms-load-tbl-schema", "load-duration-all-column-stats", "all-partitions-load-duration", "file-metadata-all-partitions-load-duration". Also, we logged the loadValidWriteIdList() time. So now we have a more detailed breakdown time for table loading info. The table loading time metrics for HDFS tables are in the following hierarchy: - Table Schema Loading - Table Metadata Loading - total time - all column stats loading time - ValidWriteIds loading time - all partitions loading time - total time - file metadata loading time - storage-metadata-loading-time(standalone metric) 1. Table Schema Loading: * Meaning: The time for HMS to fetch table object and the real schema loading time. Normally, the code path is "msClient.getHiveClient().getTable(dbName, tblName)" * Metric : hms-load-tbl-schema 2. Table Metadata Loading -- total time * Meaning: The time to load all the table metadata. The code path is load() function in HdfsTable.load() function. * Metric: load-duration 2.1 Table Metadata Loading -- all column stats * Meaning: load all column stats, this is part of table metadata loading The code path is HdfsTable.loadAllColumnStats() * Metric: all-column-stats-load-duration 2.2 Table Metadata Loading -- loadValidWriteIdList * Meaning: fetch ValidWriteIds from HMS The code path is HdfsTable.loadValidWriteIdList() * Metric: no metric recorded for this one. Instead, a debug log is generated. 2.3 Table Metadata Loading -- storage metadata loading(standalone metric) * Meaning: Storage related to file system operations during metadata loading.(The amount of time spent loading metadata from the underlying storage layer.) * Metric: storage-metadata-load-duration. This is a metric introduced by IMPALA-7322 2.4 Table Metadata Loading -- load all partitions * Meaning: Load all partitions time, including fetching all partitions from HMS and loading all partitions. The code path is MetaStoreUtil.fetchAllPartitions() and HdfsTable.loadAllPartitions() * Metric: all-partitions-load-duration 2.4.1 Table Metadata Loading -- load all partitions -- load file metadata * Meaning: The file metadata loading for all all partitions. (This is part of 2.4). Code path: loadFileMetadataForPartitions() inside loadAllPartitions() * Metric: file-metadata-all-partitions-load-duration - Extra function: Add PrintUtils.printTimeNs for PrettyPrint time in FrontEnd Test: 1. Will add metrics test after the few rounds of code review. 2. Add Unit tests for PrintUtils.printTime() function Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java M fe/src/main/java/org/apache/impala/common/PrintUtils.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/util/PrintUtilsTest.java 8 files changed, 156 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14611/3 -- To view, visit http://gerrit.cloudera.org:8080/14611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 Gerrit-Change-Number: 14611 Gerrit-PatchSet: 3 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14600 ) Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. Patch Set 11: (2 comments) Hi Vihang, thanks for the feedback. Let me know if you need more clarification. http://gerrit.cloudera.org:8080/#/c/14600/11/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/14600/11/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@993 PS11, Line 993: "warning threshold. Time: " + load_time_duration + " ns"); I implemented a PrettyPrint() function on front-end in IMPALA-9110. Will refactor the time printing in that ticket. http://gerrit.cloudera.org:8080/#/c/14600/10/www/scripts/util.js File www/scripts/util.js: http://gerrit.cloudera.org:8080/#/c/14600/10/www/scripts/util.js@52 PS10, Line 52: // if hour is true, the time is large enough and we should : // ignore the remaining time on second level : if (!hour && value >= 1000) { : re += (Math.floor(value / 1000) + "s"); : > More comments will be helpful to understand this function better. Why do we The output for this will be 10h. I am using the same logic as https://github.com/apache/impala/blob/288abf10f6142ae6cb02329604805a9a1dcc804f/be/src/util/pretty-printer.h#L249 I think if "hour" is true, then the time will be large enough and we should ignore the seconds? That should be the original idea. -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 11 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 08 Nov 2019 05:58:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Hello Quanlong Huang, Yongzhi Chen, Vihang Karajgaonkar, Xiaomeng Zhang, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14600 to look at the new patch set (#11). Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. IMPALA-9109: Add top-k metadata loading ranking on catalogd UI Add functions in CatalogUsageMonitor to monitor and report the catalog usage of the tables have the longest metadata loading time(Including maximum, median, 75th-ile, 95th-ile, 99th-ile time). Set default tables loading metrics capacity to 100. However, there might be a problem here because we only keep the capacity size to 100. For example, there might be case like a table has higher median loading time but has lower Maximum loading time which cannot make itself to the Top-100. For now, we will ignore case like that because we are aiming to find the tables with maximum longest loading time. Add the sorted table in Catalog server web-ui. The loading time is sorted by the maximum from load_duration metrics. But users can sort by other metrics in catalogd debug UI. Testing: - Add end-to-end test for webpage to verify the label and text exist in catalog debug page. Verify all fields are in JSON response - Launch Impala and activate some tables to see the table loading time shown successfully on the catalog debug UI page. Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/webserver/test_web_pages.py M www/catalog.tmpl M www/scripts/util.js 10 files changed, 307 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/11 -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 11 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14611 ) Change subject: IMPALA-9110: Add table loading time break-down metrics for HdfsTable .. Patch Set 2: (1 comment) Also, I renamed the startTableLoadingThreads to startTableLoadingSubmitterThreads. The current code is kind of confusing to understand... http://gerrit.cloudera.org:8080/#/c/14611/2/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java File fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java: http://gerrit.cloudera.org:8080/#/c/14611/2/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java@244 PS2, Line 244: //LOG.info(); nit: remove this -- To view, visit http://gerrit.cloudera.org:8080/14611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 Gerrit-Change-Number: 14611 Gerrit-PatchSet: 2 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 06 Nov 2019 06:04:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14611 ) Change subject: IMPALA-9110: Add table loading time break-down metrics for HdfsTable .. Patch Set 2: (4 comments) Thanks for the feedback! I add the time for measuring HMS loading table schema. For the table loading queue time, Quanlong suggest we should do the following: "Table queue time is not related to the table itself. But just like query queue time, it's an important metric to understand why the loading is slow. I think we can track it as a global metric that not bind to any tables. Showing p90th table loading queue time in recent time window (e.g. 1h) can be helpful." It makes a lot of sense to me. But I am wondering how will we approve this. Might need some guidance on getting the queue time done. http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@962 PS1, Line 962: > Does this also account for the time taken for loading file metadata? If yes I am not sure if I understand the question here. The time taken for updatePartitionsFromHms() will also contribute to storageMetadataLoadTime_ (which is time spent in the source systems loading/reloading the fs metadata for the table) And we have this metrics "storage-metadata-load-duration" to keep track of the storageMetadataLoadTime_. http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@975 PS1, Line 975: _ = load > I would recommend something like Time Taken: ns. here. Also, loggin Adding a PrettyPrint function for Time in frontend. Done. http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1229 PS1, Line 1229: /** > I think this function does not invoke any HMS RPCs so the time will be very Done http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/Table.java@334 PS1, Line 334: } > I don't think there is much value in having this since we don't expect this Done -- To view, visit http://gerrit.cloudera.org:8080/14611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 Gerrit-Change-Number: 14611 Gerrit-PatchSet: 2 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 06 Nov 2019 06:00:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable
Hello Quanlong Huang, Vihang Karajgaonkar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14611 to look at the new patch set (#2). Change subject: IMPALA-9110: Add table loading time break-down metrics for HdfsTable .. IMPALA-9110: Add table loading time break-down metrics for HdfsTable Problem: Catalog table loading currently only records the total loading time. We will need some break-down times, i.e. more detailed time recording on each loading function. Also, the table schema loading is not taken into account for load-duration. We will need to add some more metrics for that. Solution: - We added "hms-load-tbl-schema", "load-duration-all-column-stats". Also, we logged the loadValidWriteIdList() time. And we found that "storage-metadata-load-duration" has already represent partition and file metadata loading. So now we have a more detailed breakdown time for table loading info. - Add PrintUtils.printTimeNs for PrettyPrint time in FrontEnd Test: 1. Will add metrics test after the few rounds of code review. 2. Add Unit tests for PrintUtils.printTime() function Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java M fe/src/main/java/org/apache/impala/common/PrintUtils.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/util/PrintUtilsTest.java 8 files changed, 148 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14611/2 -- To view, visit http://gerrit.cloudera.org:8080/14611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 Gerrit-Change-Number: 14611 Gerrit-PatchSet: 2 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Hello Quanlong Huang, Yongzhi Chen, Vihang Karajgaonkar, Xiaomeng Zhang, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14600 to look at the new patch set (#9). Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. IMPALA-9109: Add top-k metadata loading ranking on catalogd UI Add functions in CatalogUsageMonitor to monitor and report the catalog usage of the tables have the longest metadata loading time(Including maximum, median, 75th-ile, 95th-ile, 99th-ile time). Set default tables loading metrics capacity to 100. However, there might be a problem here because we only keep the capacity size to 100. For example, there might be case like a table has higher median loading time but has lower Maximum loading time which cannot make itself to the Top-100. For now, we will ignore case like that because we are aiming to find the tables with maximum longest loading time. Add the sorted table in Catalog server web-ui. The loading time is sorted by the maximum from load_duration metrics. But users can sort by other metrics in catalogd debug UI. Testing: - Add end-to-end test for webpage to verify the label and text exist in catalog debug page. Verify all fields are in JSON response - Launch Impala and activate some tables to see the table loading time shown successfully on the catalog debug UI page. Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/webserver/test_web_pages.py M www/catalog.tmpl M www/scripts/util.js 10 files changed, 297 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/9 -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 9 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14611 ) Change subject: IMPALA-9110: Add table loading time break-down metrics for HdfsTable .. Patch Set 1: (1 comment) The table loading process is kind of confusing to anyone reading it for the first time. Here are some good explanation from Quanlong: " There are two kinds of table loading requests: 1) Async: load requests from invalidate metadata when --load_table_in_background=true. PrioritizedLoad requests from coordinators when missing table meta in analysis. 2) Sync: load requests from DDLs and some RPCs like getPartitionStats and getPartialCatalogObject. All async requests are put in tableLoadingDeque_ and deduplicated by tableLoadingBarrier_. tableLoadingDeque_ is consumed by threads launched in startTableLoadingThreads(). These threads call CatalogServiceCatalog.getOrLoadTable() finally to load the table. Note that getOrLoadTable() is sync. It only returns when table already loaded or loading finish. All sync requests will also go to CatalogServiceCatalog.getOrLoadTable(). All loading threads are running in tblLoadingPool_. So threads in startTableLoadingThreads() are just submitters for background load requests. All table loading threads are in tblLoadingPool" http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/Table.java@30 PS1, Line 30: import com.codahale.metrics.Timer; > nit: looks like we import this in below section just before "import com.goo Done -- To view, visit http://gerrit.cloudera.org:8080/14611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 Gerrit-Change-Number: 14611 Gerrit-PatchSet: 1 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 05 Nov 2019 18:22:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Hello Quanlong Huang, Yongzhi Chen, Xiaomeng Zhang, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14600 to look at the new patch set (#8). Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. IMPALA-9109: Add top-k metadata loading ranking on catalogd UI Add functions in CatalogUsageMonitor to monitor and report the catalog usage of the tables have the longest metadata loading time(Including maximum, median, 75th-ile, 95th-ile, 99th-ile time). Set default tables loading metrics capacity to 100. However, there might be a problem here because we only keep the capacity size to 100. For example, there might be case like a table has higher median loading time but has lower Maximum loading time which cannot make itself to the Top-100. For now, we will ignore case like that because we are aiming to find the tables with maximum longest loading time. Add the sorted table in Catalog server web-ui. The loading time is sorted by the maximum from load_duration metrics. But users can sort by other metrics in catalogd debug UI. Testing: - Add end-to-end test for webpage to verify the label and text exist in catalog debug page. Verify all fields are in JSON response - Launch Impala and activate some tables to see the table loading time shown successfully on the catalog debug UI page. Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/webserver/test_web_pages.py M www/catalog.tmpl M www/scripts/util.js 10 files changed, 297 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/8 -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 8 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Jiawei Wang has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/14600 ) Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. IMPALA-9109: Add top-k metadata loading ranking on catalogd UI Add functions in CatalogUsageMonitor to monitor and report the catalog usage of the tables have the longest metadata loading time(Including maximum, median, 75th-ile, 95th-ile, 99th-ile time). Set default tables loading metrics capacity to 100. However, there might be a problem here because we only keep the capacity size to 100. For example, there might be case like a table has higher median loading time but has lower Maximum loading time which cannot make itself to the Top-100. For now, we will ignore case like that because we are aiming to find the tables with maximum longest loading time. Add the sorted table in Catalog server web-ui. The loading time is sorted by the maximum from load_duration metrics. But users can sort by other metrics in catalogd debug UI. Testing: - Add end-to-end test for webpage to verify the label and text exist in catalog debug page. Verify all fields are in JSON response - Launch Impala and activate some tables to see the table loading time shown successfully on the catalog debug UI page. Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/webserver/test_web_pages.py M www/catalog.tmpl M www/scripts/util.js 10 files changed, 297 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/7 -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 7 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Hello Quanlong Huang, Yongzhi Chen, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14600 to look at the new patch set (#6). Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. IMPALA-9109: Add top-k metadata loading ranking on catalogd UI Add functions in CatalogUsageMonitor to monitor and report the catalog usage of the tables have the longest metadata loading time(Including maximum, median, 75th-ile, 95th-ile, 99th-ile time). Set default tables loading metrics capacity to 100. However, there might be a problem here because we only keep the capacity size to 100. For example, there might be case like a table has higher median loading time but has lower Maximum loading time which cannot make itself to the Top-100. For now, we will ignore case like that because we are aiming to find the tables with maximum longest loading time. Add the sorted table in Catalog server web-ui. The loading time is sorted by the maximum from load_duration metrics. But users can sort by other metrics in catalogd debug UI. Testing: - Add end-to-end test for webpage to verify the label and text exist in catalog debug page. Verify all fields are in JSON response - Launch Impala and activate some tables to see the table loading time shown successfully on the catalog debug UI page. Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/webserver/test_web_pages.py M www/catalog.tmpl M www/scripts/util.js 10 files changed, 269 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/6 -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 6 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14600 ) Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. Patch Set 5: (3 comments) Thanks for the valid feedback! I added max, 75, 95, 99, count in the new loading time ranking table. The ranking is sorted by maximum loading time by default. However, there might be a problem here because we only keep the capacity size to 100. For example, there might be case like a table has higher median loading time but has lower Maximum loading time which cannot make itself to the Top-100. Users can sort them by other metrics in UI. Also, the more detailed metrics can be found in the metrics column. So I don't feel like we need to expose all of them. http://gerrit.cloudera.org:8080/#/c/14600/4/common/thrift/JniCatalog.thrift File common/thrift/JniCatalog.thrift: http://gerrit.cloudera.org:8080/#/c/14600/4/common/thrift/JniCatalog.thrift@724 PS4, Line 724: 5: optional i64 median_table_loading_ns > It'd be better to not just expose the median loading time. Since this is sh Done http://gerrit.cloudera.org:8080/#/c/14600/4/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/14600/4/fe/src/main/java/org/apache/impala/catalog/Table.java@203 PS4, Line 203: > Can we add more metrics so in the future we don't need to touch this part a Done http://gerrit.cloudera.org:8080/#/c/14600/4/www/catalog.tmpl File www/catalog.tmpl: http://gerrit.cloudera.org:8080/#/c/14600/4/www/catalog.tmpl@169 PS4, Line 169: Median Loading Time > We can show human readable time in this column, i.e. in the forms of 11m25s Done -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 5 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 04 Nov 2019 22:42:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Jiawei Wang has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/14600 ) Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. IMPALA-9109: Add top-k metadata loading ranking on catalogd UI Add functions in CatalogUsageMonitor to monitor and report the catalog usage of the tables have the longest metadata loading time(Including maximum, median, 75th-ile, 95th-ile, 99th-ile time). Set default tables loading metrics capacity to 100. Add the sorted table in Catalog server web-ui. The loading time is sorted by the maximum from load_duration metrics. And users can sort by other metrics in catalogd debug UI. Testing: - Add end-to-end test for webpage to verify the label and text exist in catalog debug page. Verify all fields are in JSON response - Launch Impala and activate some tables to see the table loading time shown successfully on the catalog debug UI page. Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/webserver/test_web_pages.py M www/catalog.tmpl M www/scripts/util.js 10 files changed, 266 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/5 -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 5 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14600 ) Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. Patch Set 4: (1 comment) Thanks for the feedbacks! http://gerrit.cloudera.org:8080/#/c/14600/3/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/14600/3/tests/webserver/test_web_pages.py@378 PS3, Line 378: loading_tables = response_json["longest_loading_tables"] > nit: Could you change the variable name? Done -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 4 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 01 Nov 2019 18:15:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Jiawei Wang has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/14600 ) Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. IMPALA-9109: Add top-k metadata loading ranking on catalogd UI Add functions in CatalogUsageMonitor to monitor and report the catalog usage of the tables have the longest metadata loading time. Add the sorted table in Catalog server web-ui. The loading time is cacualted by the median from load_duration metrics. Testing: - Add end-to-end test for webpage to verify the label and text exisit in catalog debug page. - Launch Impala and activate some tables to see the table loading time showed successfully on the catalog debug UI page. Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/webserver/test_web_pages.py M www/catalog.tmpl 9 files changed, 145 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/4 -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 4 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable
Jiawei Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14611 Change subject: IMPALA-9110: Add table loading time break-down metrics for HdfsTable .. IMPALA-9110: Add table loading time break-down metrics for HdfsTable Catalog table loading currently only records the total loading time. We will need some break-down times, i.e. more detailed time recording on each loading function. We added "table-schema-load-duration", "all-column-stats-load -duration", "all-column-stats-load-duration". And we found that "storage-metadata-load-duration" has already represent partition and file metadata loading. So now we have a more detailed breakdown time for table loading info. Test: Will add test after the first round review. Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 --- M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java 2 files changed, 60 insertions(+), 40 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14611/1 -- To view, visit http://gerrit.cloudera.org:8080/14611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 Gerrit-Change-Number: 14611 Gerrit-PatchSet: 1 Gerrit-Owner: Jiawei Wang
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Jiawei Wang has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/14600 ) Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. IMPALA-9109: Add top-k metadata loading ranking on catalogd UI Add functions in CatalogUsageMonitor to monitor and report the catalog usage of the tables have the longest metadata loading time. Add the sorted table in Catalog server web-ui. The loading time is cacualted by the median from load_duration metrics. Testing: - Add end-to-end test for webpage to verify the label and text exisit in catalog debug page. - Launch Impala and activate some tables to see the table loading time showed successfully on the catalog debug UI page. Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/webserver/test_web_pages.py M www/catalog.tmpl 9 files changed, 145 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/3 -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 3 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-9105: Catalog debug page top-n table has a URL generation issue
Hello Quanlong Huang, Vihang Karajgaonkar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14589 to look at the new patch set (#3). Change subject: IMPALA-9105: Catalog debug page top-n table has a URL generation issue .. IMPALA-9105: Catalog debug page top-n table has a URL generation issue What's the problem: Previously, catalogd debug page does not have 'fqtn' property. Because CatalogServer::GetCatalogUsage() does not pass fqtn to the template. Thus, the generated page table detail page cannot be opened. Also, Someone might accedentially deleted which makes the catalogd UI top-K page unorganized. Fix: Correct the URL generation part. Test: Open the page and all the tables object can be correctly opened. Change-Id: If7416c79baf2e78d6790995e97d9802ec7a8cc37 --- M www/catalog.tmpl 1 file changed, 4 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/14589/3 -- To view, visit http://gerrit.cloudera.org:8080/14589 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If7416c79baf2e78d6790995e97d9802ec7a8cc37 Gerrit-Change-Number: 14589 Gerrit-PatchSet: 3 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Jiawei Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14600 Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. IMPALA-9109: Add top-k metadata loading ranking on catalogd UI Add functions in CatalogUsageMonitor to monitor and report the catalog usage of the tables have the longest metadata loading time. Add the sorted table in Catalog server web-ui. The loading time is cacualted by the median from load_duration metrics. Testing: Launch Impala and activate some tables to see the table loading time showed successfully on the catalog debug UI page. Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M www/catalog.tmpl 8 files changed, 132 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/2 -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 2 Gerrit-Owner: Jiawei Wang
[Impala-ASF-CR] IMPALA-9105: Catalog debug page top-n table has a URL generation issue
Hello Quanlong Huang, Vihang Karajgaonkar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14589 to look at the new patch set (#2). Change subject: IMPALA-9105: Catalog debug page top-n table has a URL generation issue .. IMPALA-9105: Catalog debug page top-n table has a URL generation issue What's the problem: Previously, catalogd debug page does not have 'fqtn' property. Because CatalogServer::GetCatalogUsage() does not pass fqtn to the template. Thus, the generated page table detail page cannot be opened. Also, Someone might accedentially deleted which makes the catalogd UI top-K page unorganized. Fix: Correct the URL generation part. Test: Open the page and all the tables object can be correctly opened. Change-Id: If7416c79baf2e78d6790995e97d9802ec7a8cc37 --- M www/catalog.tmpl 1 file changed, 4 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/14589/2 -- To view, visit http://gerrit.cloudera.org:8080/14589 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If7416c79baf2e78d6790995e97d9802ec7a8cc37 Gerrit-Change-Number: 14589 Gerrit-PatchSet: 2 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-9105: Catalog debug page top-n table has a URL generation issue
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14589 ) Change subject: IMPALA-9105: Catalog debug page top-n table has a URL generation issue .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/14589/1/www/catalog.tmpl File www/catalog.tmpl: http://gerrit.cloudera.org:8080/#/c/14589/1/www/catalog.tmpl@42 PS1, Line 42: > Oops! Yeah, I think someone accidentally deleted this. http://gerrit.cloudera.org:8080/#/c/14589/1/www/catalog.tmpl@47 PS1, Line 47:{{name}} > Is {{fqtn}} empty here causing the problem? Looks like we still use {{fqtn} Yes, fqtn empty caused the issue. The problem is that these "Top-n" tables are passed in a different function in CatalogServer::GetCatalogUsage. And we did not pass fqtn there. -- To view, visit http://gerrit.cloudera.org:8080/14589 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If7416c79baf2e78d6790995e97d9802ec7a8cc37 Gerrit-Change-Number: 14589 Gerrit-PatchSet: 1 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 31 Oct 2019 03:34:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9105: Catalog debug page top-n table has a URL generation issue
Jiawei Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14589 Change subject: IMPALA-9105: Catalog debug page top-n table has a URL generation issue .. IMPALA-9105: Catalog debug page top-n table has a URL generation issue What's the problem: Previously, catalogd debug page does not have 'fqtn' property. Thus, the generated page table detail page cannot be opened. Fix: Correct the URL generation part. Test: Open the page and all the tables object can be correctly opened. Change-Id: If7416c79baf2e78d6790995e97d9802ec7a8cc37 --- M www/catalog.tmpl 1 file changed, 4 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/14589/1 -- To view, visit http://gerrit.cloudera.org:8080/14589 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: If7416c79baf2e78d6790995e97d9802ec7a8cc37 Gerrit-Change-Number: 14589 Gerrit-PatchSet: 1 Gerrit-Owner: Jiawei Wang
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/13801 ) Change subject: IMPALA-5149: Provide query profile in JSON format .. Patch Set 14: (14 comments) > Uploaded patch set 14. This has been a long process, thanks for your patience on this! http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/service/impala-server.cc@848 PS13, Line 848: Of > offset: $1 Done http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/service/impala-server.cc@2026 PS13, Line 2026: json_profile(rapidj > json_profile. No need for "_rapid". Done http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/util/runtime-profile-test.cc File be/src/util/runtime-profile-test.cc: http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/util/runtime-profile-test.cc@1166 PS13, Line 1166: RuntimeProfile* profile_ab = RuntimeProfile::Create(&pool, "ProfileAb"); > The name of the profile should probably match the variable name. Done http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/util/runtime-profile-test.cc@1204 PS13, Line 1204: for (auto& itr : content["counters"].GetArray()) { : // check normal Counter > Use iterator style like the following. Same at other places. Done http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/util/runtime-profile-test.cc@1218 PS13, Line 1218: } > Do we want to check there are no other counter types in content ? Done http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/util/runtime-profile.h File be/src/util/runtime-profile.h: http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/util/runtime-profile.h@622 PS13, Line 622: A map of counters name to counter > A map of counter's name to counter Done http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/util/runtime-profile.h@625 PS13, Line 625: ame, const Counte > This needs to be documented too. Done http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/13801/13/be/src/util/runtime-profile.cc@704 PS13, Line 704: e information > What does all the other information mean ? Done http://gerrit.cloudera.org:8080/#/c/13801/13/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/13801/13/tests/webserver/test_web_pages.py@571 PS13, Line 571:download_link = "query_profile_plain_text?query_id={0}".format(query_id) : assert down > nit: one line Done http://gerrit.cloudera.org:8080/#/c/13801/13/tests/webserver/test_web_pages.py@587 PS13, Line 587: > nit: indent 4 for line continuation Done http://gerrit.cloudera.org:8080/#/c/13801/13/tests/webserver/test_web_pages.py@594 PS13, Line 594: le:{0}". > Downloaded ? Done http://gerrit.cloudera.org:8080/#/c/13801/13/tests/webserver/test_web_pages.py@595 PS13, Line 595: the query id > Json pofile Done http://gerrit.cloudera.org:8080/#/c/13801/13/tests/webserver/test_web_pages.py@597 PS13, Line 597: > assert query_id in json_res["contents"]["profile_name"], json_res to help d Done http://gerrit.cloudera.org:8080/#/c/13801/13/www/query_profile.tmpl File www/query_profile.tmpl: http://gerrit.cloudera.org:8080/#/c/13801/13/www/query_profile.tmpl@28 PS13, Line 28: (Available Form > (available formats): Done -- To view, visit http://gerrit.cloudera.org:8080/13801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 Gerrit-Change-Number: 13801 Gerrit-PatchSet: 14 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: David Knupp Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 06 Aug 2019 20:12:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Jiawei Wang has uploaded a new patch set (#14). ( http://gerrit.cloudera.org:8080/13801 ) Change subject: IMPALA-5149: Provide query profile in JSON format .. IMPALA-5149: Provide query profile in JSON format Description: Today there is a text and Thrift version of the query profile, but it would be useful to have a JSON version for portability and machine consumption. The ultimate goal is to have a Download JSON format profile link along with the other two formats. Modification: 1.Add Json format download option in impalad profile page 2.Add ToJson() function for RuntimeProfile, Counters, EventSequence 3.Add JSON format into QueryStateRecord 4.Add tests for E2E test, unit test, hs2 test 5.Modify query profile page to different download option 6.Modify HS2 server to support get JSON format profile Future compatibility: The schema of the JSON format can be changed in the future with the standardization of Profile and Counter structure. Tests: E2E tests: tests/webserver/test_web_pages.py - test_download_profile * merge text and json format download together HS2 tests: tests/hs2/test_hs2.py - test_get_profile * add json format test BE Unit tests: be/src/util/runtime-profile-test.cc * ToJson.RuntimeProfileToJsonTest * ToJson.EmptyTest * ToJson.EventSequenceToJsonTest * ToJson.TimeSeriesCounterToJsonTest Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 --- M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/CMakeLists.txt M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile-test.cc M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M common/thrift/ImpalaService.thrift M common/thrift/RuntimeProfile.thrift M tests/hs2/test_hs2.py M tests/webserver/test_web_pages.py M www/query_profile.tmpl 16 files changed, 646 insertions(+), 60 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/14 -- To view, visit http://gerrit.cloudera.org:8080/13801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 Gerrit-Change-Number: 13801 Gerrit-PatchSet: 14 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: David Knupp Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Jiawei Wang has uploaded a new patch set (#13). ( http://gerrit.cloudera.org:8080/13801 ) Change subject: IMPALA-5149: Provide query profile in JSON format .. IMPALA-5149: Provide query profile in JSON format Description: Today there is a text and Thrift version of the query profile, but it would be useful to have a JSON version for portability and machine consumption. The ultimate goal is to have a Download JSON format profile link along with the other two formats. Modification: 1.Add Json format download option in impalad profile page 2.Add ToJson() function for RuntimeProfile, Counters, EventSequence 3.Add JSON format into QueryStateRecord 4.Add tests for E2E test, unit test, hs2 test 5.Modify query profile page to different download option 6.Modify HS2 server to support get JSON format profile Future compatibility: The schema of the JSON format can be changed in the future with the standardization of Profile and Counter structure. Tests: E2E tests: tests/webserver/test_web_pages.py - test_download_profile * merge text and json format download together HS2 tests: tests/hs2/test_hs2.py - test_get_profile * add json format test BE Unit tests: be/src/util/runtime-profile-test.cc * ToJson.RuntimeProfileToJsonTest * ToJson.EmptyTest * ToJson.EventSequenceToJsonTest * ToJson.TimeSeriesCounterToJsonTest Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 --- M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/CMakeLists.txt M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile-test.cc M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M common/thrift/ImpalaService.thrift M common/thrift/RuntimeProfile.thrift M tests/hs2/test_hs2.py M tests/webserver/test_web_pages.py M www/query_profile.tmpl 16 files changed, 649 insertions(+), 60 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/13 -- To view, visit http://gerrit.cloudera.org:8080/13801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 Gerrit-Change-Number: 13801 Gerrit-PatchSet: 13 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: David Knupp Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Jiawei Wang has uploaded a new patch set (#12). ( http://gerrit.cloudera.org:8080/13801 ) Change subject: IMPALA-5149: Provide query profile in JSON format .. IMPALA-5149: Provide query profile in JSON format Description: Today there is a text and Thrift version of the query profile, but it would be useful to have a JSON version for portability and machine consumption. The ultimate goal is to have a Download JSON format profile link along with the other two formats. Modification: 1.Add Json format download option in impalad profile page 2.Add ToJson() function for RuntimeProfile, Counters, EventSequence 3.Add JSON format into QueryStateRecord 4.Add tests for E2E test, unit test, hs2 test 5.Modify query profile page to different download option 6.Modify HS2 server to support get JSON format profile Future compatibility: The schema of the JSON format can be changed in the future with the standardization of Profile and Counter structure. Tests: E2E tests: tests/webserver/test_web_pages.py - test_download_profile * merge text and json format download together HS2 tests: tests/hs2/test_hs2.py - test_get_profile * add json format test BE Unit tests: be/src/util/runtime-profile-test.cc * ToJson.RuntimeProfileToJsonTest * ToJson.EmptyTest * ToJson.EventSequenceToJsonTest * ToJson.TimeSeriesCounterToJsonTest Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 --- M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/CMakeLists.txt M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile-test.cc M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M common/thrift/ImpalaService.thrift M common/thrift/RuntimeProfile.thrift M tests/hs2/test_hs2.py M tests/webserver/test_web_pages.py M www/query_profile.tmpl 16 files changed, 642 insertions(+), 60 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/12 -- To view, visit http://gerrit.cloudera.org:8080/13801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 Gerrit-Change-Number: 13801 Gerrit-PatchSet: 12 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: David Knupp Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Jiawei Wang has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/13801 ) Change subject: IMPALA-5149: Provide query profile in JSON format .. IMPALA-5149: Provide query profile in JSON format Description: Today there is a text and Thrift version of the query profile, but it would be useful to have a JSON version for portability and machine consumption. The ultimate goal is to have a Download JSON format profile link along with the other two formats. Modification: 1.Add Json format download option in impalad profile page 2.Add ToJson() function for RuntimeProfile, Counters, EventSequence 3.Add JSON format into QueryStateRecord 4.Add tests for E2E test, unit test, hs2 test 5.Modify query profile page to different download option 6.Modify HS2 server to support get JSON format profile Future compatibility: The schema of the JSON format can be changed in the future with the standardization of Profile and Counter structure. Tests: E2E tests: tests/webserver/test_web_pages.py - test_download_profile * merge text and json format download together HS2 tests: tests/hs2/test_hs2.py - test_get_profile * add json format test BE Unit tests: be/src/util/runtime-profile-test.cc * ToJson.RuntimeProfileToJsonTest * ToJson.EmptyTest * ToJson.EventSequenceToJsonTest * ToJson.TimeSeriesCounterToJsonTest Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 --- M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/CMakeLists.txt M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile-test.cc M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M common/thrift/ImpalaService.thrift M common/thrift/RuntimeProfile.thrift M tests/hs2/test_hs2.py M tests/webserver/test_web_pages.py M www/query_profile.tmpl 16 files changed, 642 insertions(+), 60 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/11 -- To view, visit http://gerrit.cloudera.org:8080/13801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 Gerrit-Change-Number: 13801 Gerrit-PatchSet: 11 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: David Knupp Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/13801 ) Change subject: IMPALA-5149: Provide query profile in JSON format .. Patch Set 10: (23 comments) Thanks for the valuable feedback! Looking forward to the next step and the version control discussion. http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/service/impala-hs2-server.cc File be/src/service/impala-hs2-server.cc: http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/service/impala-hs2-server.cc@988 PS9, Line 988: } else { > nit: missing blank space after } Done http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/service/impala-server.cc@101 PS9, Line 101: #include "common/names.h" : : using boost::adopt_lock_t; : using boost::algorithm::is_an > Why not group these together with other rapidjson #include above ? There se Done http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/service/impala-server.cc@783 PS9, Line 783: turn Status::OK(); > DCHECK_EQ(format, TRuntimeProfileFormat::STRING); Done http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/service/impala-server.cc@818 PS9, Line 818: > DCHECK_EQ(format, TRuntimeProfileFormat::STRING); Done http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@126 PS9, Line 126: string CounterType() const override { > void ToJson(rapidjson::Document& document, rapidjson::Value* val) const ove Done http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@131 PS9, Line 131: > Actually, do you need to output this in json ? It seems sufficient to just Done http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@165 PS9, Line 165: private: : SampleFunction counter_fn_; : }; : : /// An AveragedCounter maintains a set of counters and its value is the : /// average of the values in that set. The average is updated through calls : /// > If you take the suggestion in Counter::ToJson(), you probably don't need th Done http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@219 PS9, Line 219: private: : /// Map from counters to their existing values. Modified via UpdateCounter(). : boost::unordered_map counter > See comments in Counter::ToJson() on why we may not need this. Done http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@222 PS9, Line 222: : /// Current sums of values from counter_value_map_. Only one of these is used, : /// depending on the unit of the counter. current_double_sum_ is used for : /// DOUBLE_VALUE, current_int_sum_ otherwise. : dou > Same question as above. Do you need to store the internal states (i.e. sum) Done http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@290 PS9, Line 290: /// Summary statistics of values seen so far. > virtual override Done http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@292 PS9, Line 292: t64_t max_; > Why is "kind" removed ? May be worth adding a comment here. Done http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@406 PS9, Line 406: void SortEvents() { > Please add a comment about the output format. Done http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@452 PS9, Line 452: /// “unit” : xxx, > It'd be nice to add a comment about the output format. Done http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile-counters.h@583 PS9, Line 583: : void Add(int64_t delta) override { : DCHECK(false); : } : : string CounterType() const override { : > Similar to comments above, this may not be needed since Counter::ToJson() m Done http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile.h File be/src/util/runtime-profile.h: http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile.h@123 PS9, Line 123: counter_name, value, kind, unit > is this list actually accurate ? What about "kind" and "unit" ? Done http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile.cc@709 PS9, Line 709: } > nit: unnecessary blank line Done http://gerrit.cloudera.org:8080/#/c/13801/9/be/src/util/runtime-profile.cc@797 PS9, Line 797: Value event_sequences_rapid(kArrayType); > nit:
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Jiawei Wang has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/13801 ) Change subject: IMPALA-5149: Provide query profile in JSON format .. IMPALA-5149: Provide query profile in JSON format Description: Today there is a text and Thrift version of the query profile, but it would be useful to have a JSON version for portability and machine consumption. The ultimate goal is to have a Download JSON format profile link along with the other two formats. Modification: 1.Add Json format download option in impalad profile page 2.Add ToJson() function for RuntimeProfile, Counters, EventSequence 3.Add JSON format into QueryStateRecord 4.Add tests for E2E test, unit test, hs2 test 5.Modify query profile page to different download option 6.Modify HS2 server to support get JSON format profile Future compatibility: The schema of the JSON format can be changed in the future with the standardization of Profile and Counter structure. Tests: E2E tests: tests/webserver/test_web_pages.py - test_download_profile * merge text and json format download together HS2 tests: tests/hs2/test_hs2.py - test_get_profile * add json format test BE Unit tests: be/src/util/runtime-profile-test.cc * ToJson.RuntimeProfileToJsonTest * ToJson.EmptyTest * ToJson.EventSequenceToJsonTest * ToJson.TimeSeriesCounterToJsonTest Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 --- M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/CMakeLists.txt M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile-test.cc M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M common/thrift/ImpalaService.thrift M common/thrift/RuntimeProfile.thrift M tests/hs2/test_hs2.py M tests/webserver/test_web_pages.py M www/query_profile.tmpl 16 files changed, 642 insertions(+), 60 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/10 -- To view, visit http://gerrit.cloudera.org:8080/13801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 Gerrit-Change-Number: 13801 Gerrit-PatchSet: 10 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: David Knupp Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/13801 ) Change subject: IMPALA-5149: Provide query profile in JSON format .. Patch Set 9: (4 comments) Done, thanks for the feedback. I think we need more discussion on the version control thing as Andrew and Bharath mentioned. http://gerrit.cloudera.org:8080/#/c/13801/8/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: http://gerrit.cloudera.org:8080/#/c/13801/8/be/src/util/runtime-profile-counters.h@292 PS8, Line 292: val->RemoveMember("kind"); > Should we not update the kind ? Why remove it? SummaryStatsCounter and TimeSeriesCounter are treated separately with normal counters in our JSON output. So a field like "kind" would be redundant. “counters”: [ { “counter_name”: xxx, “value”: xxx, “kind”: xxx, “unit”: xxx, “child_counters”: [{...}]// same structure as counter … //type special values },{...} ], “summary_stats_counters” : [ { “counter_name”: xxx, “value”: xxx, “unit”: xxx, “min”: xxx, “max”: xxx “avg”: xxx “num_of_samples”: xxx },{...} ] The difference as you can see from the schema, it's that we already know all the counters from summary_stats_counters belongs to SummaryStatsCounter. So it would not be necessary to keep the "kind" field again. http://gerrit.cloudera.org:8080/#/c/13801/8/tests/hs2/test_hs2.py File tests/hs2/test_hs2.py: http://gerrit.cloudera.org:8080/#/c/13801/8/tests/hs2/test_hs2.py@668 PS8, Line 668: > This looks flaky. Instead assert that they exist before you can loop throug Done http://gerrit.cloudera.org:8080/#/c/13801/8/tests/hs2/test_hs2.py@670 PS8, Line 670: if ("child_profiles" not in json_res["contents"]) or \ > dump the json_res if the assert fails, helps with debugging test failures. Done http://gerrit.cloudera.org:8080/#/c/13801/8/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/13801/8/tests/webserver/test_web_pages.py@594 PS8, Line 594: assert False, "Download JSON format query profile cannot be parsed. " \ > dump json Done -- To view, visit http://gerrit.cloudera.org:8080/13801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 Gerrit-Change-Number: 13801 Gerrit-PatchSet: 9 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: David Knupp Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Thu, 25 Jul 2019 02:20:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Jiawei Wang has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/13801 ) Change subject: IMPALA-5149: Provide query profile in JSON format .. IMPALA-5149: Provide query profile in JSON format Description: Today there is a text and Thrift version of the query profile, but it would be useful to have a JSON version for portability and machine consumption. The ultimate goal is to have a Download JSON format profile link along with the other two formats. Modification: 1.Add Json format download option in impalad profile page 2.Add ToJson() function for RuntimeProfile, Counters, EventSequence 3.Add JSON format into QueryStateRecord 4.Add tests for E2E test, unit test, hs2 test 5.Modify query profile page to different download option 6.Modify HS2 server to support get JSON format profile Tests: E2E tests: tests/webserver/test_web_pages.py - test_download_profile * merge text and json format download together HS2 tests: tests/hs2/test_hs2.py - test_get_profile * add json format test BE Unit tests: be/src/util/runtime-profile-test.cc * ToJson.RuntimeProfileToJsonTest * ToJson.EmptyTest * ToJson.EventSequenceToJsonTest * ToJson.TimeSeriesCounterToJsonTest Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 --- M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/CMakeLists.txt M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile-test.cc M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M common/thrift/ImpalaService.thrift M common/thrift/RuntimeProfile.thrift M tests/hs2/test_hs2.py M tests/webserver/test_web_pages.py M www/query_profile.tmpl 16 files changed, 620 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/9 -- To view, visit http://gerrit.cloudera.org:8080/13801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 Gerrit-Change-Number: 13801 Gerrit-PatchSet: 9 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: David Knupp Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-8772: Import Testcase failed for SQL without table refs
Hello Bharath Vissapragada, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13893 to look at the new patch set (#4). Change subject: IMPALA-8772: Import Testcase failed for SQL without table refs .. IMPALA-8772: Import Testcase failed for SQL without table refs Description: Query like this: select 5 * 4; can generate valid testcase, but cannot be loaded because it does not involve any table and view references. Fix: Add null check for CatalogOpExecutor when doing testcase data load Tests: tests/metadata/test_testcase_builder Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e TODO: NEED TO DELETE GENERATED TESTCASE AFTER THIS IS DONE --- M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java A tests/metadata/test_testcase_builder.py 2 files changed, 92 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/13893/4 -- To view, visit http://gerrit.cloudera.org:8080/13893 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e Gerrit-Change-Number: 13893 Gerrit-PatchSet: 4 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang
[Impala-ASF-CR] IMPALA-8772: Import Testcase failed for SQL without table refs
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/13893 ) Change subject: IMPALA-8772: Import Testcase failed for SQL without table refs .. Patch Set 3: (1 comment) Thanks for the feedback. http://gerrit.cloudera.org:8080/#/c/13893/2/tests/metadata/test_testcase_builder.py File tests/metadata/test_testcase_builder.py: http://gerrit.cloudera.org:8080/#/c/13893/2/tests/metadata/test_testcase_builder.py@51 PS2, Line 51: try: : # Test load testcase works > you can use delete_file_dir from hdfs_util.py . Also please put it in a fin Done -- To view, visit http://gerrit.cloudera.org:8080/13893 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e Gerrit-Change-Number: 13893 Gerrit-PatchSet: 3 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Comment-Date: Thu, 25 Jul 2019 01:55:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8772: Import Testcase failed for SQL without table refs
Hello Bharath Vissapragada, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13893 to look at the new patch set (#3). Change subject: IMPALA-8772: Import Testcase failed for SQL without table refs .. IMPALA-8772: Import Testcase failed for SQL without table refs Description: Query like this: select 5 * 4; can generate valid testcase, but cannot be loaded because it does not involve any table and view references. Fix: Add null check for CatalogOpExecutor when doing testcase data load Tests: tests/metadata/test_testcase_builder Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e TODO: NEED TO DELETE GENERATED TESTCASE AFTER THIS IS DONE --- M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java A tests/metadata/test_testcase_builder.py 2 files changed, 93 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/13893/3 -- To view, visit http://gerrit.cloudera.org:8080/13893 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e Gerrit-Change-Number: 13893 Gerrit-PatchSet: 3 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang
[Impala-ASF-CR] IMPALA-8772: Import Testcase failed for SQL without table refs
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/13893 ) Change subject: IMPALA-8772: Import Testcase failed for SQL without table refs .. Patch Set 2: (5 comments) Thanks for your feedback! http://gerrit.cloudera.org:8080/#/c/13893/1/tests/metadata/test_testcase_builder.py File tests/metadata/test_testcase_builder.py: http://gerrit.cloudera.org:8080/#/c/13893/1/tests/metadata/test_testcase_builder.py@41 PS1, Line 41: ate_query = " > execute_query_expect_success Done http://gerrit.cloudera.org:8080/#/c/13893/1/tests/metadata/test_testcase_builder.py@43 PS1, Line 43: > assert len(result.data) == 1 ? Done http://gerrit.cloudera.org:8080/#/c/13893/1/tests/metadata/test_testcase_builder.py@48 PS1, Line 48: self.execute_query_expect_success(self.client, testcase_load_query) > Implement the TODO? Done http://gerrit.cloudera.org:8080/#/c/13893/1/tests/metadata/test_testcase_builder.py@49 PS1, Line 49: > flake8: W391 blank line at end of file Done http://gerrit.cloudera.org:8080/#/c/13893/1/tests/metadata/test_testcase_builder.py@49 PS1, Line 49: > Yep, remove these blank lines? Done -- To view, visit http://gerrit.cloudera.org:8080/13893 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e Gerrit-Change-Number: 13893 Gerrit-PatchSet: 2 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Comment-Date: Tue, 23 Jul 2019 18:18:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8772: Import Testcase failed for SQL without table refs
Hello Bharath Vissapragada, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13893 to look at the new patch set (#2). Change subject: IMPALA-8772: Import Testcase failed for SQL without table refs .. IMPALA-8772: Import Testcase failed for SQL without table refs Description: Query like this: select 5 * 4; can generate valid testcase, but cannot be loaded because it does not involve any table and view references. Fix: Add null check for CatalogOpExecutor when doing testcase data load Tests: tests/metadata/test_testcase_builder Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e TODO: NEED TO DELETE GENERATED TESTCASE AFTER THIS IS DONE --- M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java A tests/metadata/test_testcase_builder.py 2 files changed, 87 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/13893/2 -- To view, visit http://gerrit.cloudera.org:8080/13893 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e Gerrit-Change-Number: 13893 Gerrit-PatchSet: 2 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8772: Import Testcase failed for SQL without table refs
Jiawei Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13893 Change subject: IMPALA-8772: Import Testcase failed for SQL without table refs .. IMPALA-8772: Import Testcase failed for SQL without table refs Description: Query like this: select 5 * 4; can generate valid testcase, but cannot be loaded because it does not involve any table and view references. Fix: Add null check for CatalogOpExecutor when doing testcase data load Tests: tests/metadata/test_testcase_builder Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e TODO: NEED TO DELETE GENERATED TESTCASE AFTER THIS IS DONE --- M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java A tests/metadata/test_testcase_builder.py 2 files changed, 84 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/13893/1 -- To view, visit http://gerrit.cloudera.org:8080/13893 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I50d77d7de44bb140598a1e8db4da85a5ec87f31e Gerrit-Change-Number: 13893 Gerrit-PatchSet: 1 Gerrit-Owner: Jiawei Wang
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/13801 ) Change subject: IMPALA-5149: Provide query profile in JSON format .. Patch Set 8: (17 comments) Thanks so much for the valid feedback. Have adopted the changes you suggested. Also add some more unit tests. Let me know if the code need more changes. Thanks! http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/service/impala-http-handler.cc@270 PS6, Line 270: if (format != TRuntimeProfileFormat::JSON){ > nit: Add a quick comment why? Done http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-counters.h@126 PS6, Line 126: > nit: space const { . Done http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-counters.h@128 PS6, Line 128: rapidjson::Value::MemberIterator type_itr = val->FindMember("kind"); > Add DHCECK (type_itr != val->MemberEnd())? (multiple places below too) Done http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-counters.h@164 PS6, Line 164: > same (multiple places below) Done http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-counters.h@289 PS6, Line 289: > why this? I am moving this away because SummaryStatsCounter and TimeSeriesCounter are all separate in our output JSON schema. “counters”: [ { “counter_name”: xxx, “value”: xxx, “kind”: xxx, “unit”: xxx, “child_counters”: [{...}]// same structure as counter … //type special values },{...} ], “summary_stats_counters” : [ { “counter_name”: xxx, “value”: xxx, “unit”: xxx, “min”: xxx, “max”: xxx “avg”: xxx “num_of_samples”: xxx },{...} ] I feel like it is a redundant content if we add its "kind". What do you think? I can always add it back if you feel that's better. http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-counters.h@403 PS6, Line 403: > move to cc file? Done http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-counters.h@421 PS6, Line 421: EventList events_; > instantiate value = ...directly in L407? I am not sure what to do with this... I am using the same schema as metric collection https://github.com/apache/impala/blob/95a1da2d32ea7b28585ad574b22c2bb9dd921029/be/src/util/collection-metrics.cc#L28 http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-test.cc File be/src/util/runtime-profile-test.cc: http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-test.cc@1173 PS6, Line 1173: > check other units too and assert the unit data? Done http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-test.cc@1187 PS6, Line 1187: // Serialize to json > some tests for info_strings and event_sequences too? Done http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile-test.cc@1202 PS6, Line 1202: // Check counter value matches > check other TimeSeriesCounter implementations too? Done http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile.cc@705 PS6, Line 705: ); > remove rapidjson classifiers with using... Done http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile.cc@713 PS6, Line 713: CounterM > use d->GetAllocator() (avoid temp variable) Done http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile.cc@735 PS6, Line 735: } > formatting Done http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile.cc@1590 PS6, Line 1590: lock_guard lock(lock_); > formatting Done http://gerrit.cloudera.org:8080/#/c/13801/6/be/src/util/runtime-profile.cc@1600 PS6, Line 1600: > formatting Done http://gerrit.cloudera.org:8080/#/c/13801/6/tests/hs2/test_hs2.py File tests/hs2/test_hs2.py: http://gerrit.cloudera.org:8080/#/c/13801/6/tests/hs2/test_hs2.py@659 PS6, Line 659: get_profile_req.format = 3 # json format > why not do this in the above test after L654? Yeah, I can put this inside the above test. http://gerrit.cloudera.org:8080/#/c/13801/6/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/13801/6/tests/webserver/test_web_pages.py@585 PS6, Line 585: # the query is in the file. > merge with test_download_text_profile? You can do something like Done -- To view, visit http://gerrit.cloudera.org:8080/13801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Hello Bharath Vissapragada, Greg Rahn, Quanlong Huang, David Rorke, David Knupp, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13801 to look at the new patch set (#8). Change subject: IMPALA-5149: Provide query profile in JSON format .. IMPALA-5149: Provide query profile in JSON format Description: Today there is a text and Thrift version of the query profile, but it would be useful to have a JSON version for portability and machine consumption. The ultimate goal is to have a Download JSON format profile link along with the other two formats. Modification: 1.Add Json format download option in impalad profile page 2.Add ToJson() function for RuntimeProfile, Counters, EventSequence 3.Add JSON format into QueryStateRecord 4.Add tests for E2E test, unit test, hs2 test 5.Modify query profile page to different download option 6.Modify HS2 server to support get JSON format profile Tests: E2E tests: tests/webserver/test_web_pages.py - test_download_profile * merge text and json format download together HS2 tests: tests/hs2/test_hs2.py - test_get_profile * add json format test BE Unit tests: be/src/util/runtime-profile-test.cc * ToJson.RuntimeProfileToJsonTest * ToJson.EmptyTest * ToJson.EventSequenceToJsonTest * ToJson.TimeSeriesCounterToJsonTest Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 --- M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/CMakeLists.txt M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile-test.cc M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M common/thrift/ImpalaService.thrift M common/thrift/RuntimeProfile.thrift M tests/hs2/test_hs2.py M tests/webserver/test_web_pages.py M www/query_profile.tmpl 16 files changed, 611 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/8 -- To view, visit http://gerrit.cloudera.org:8080/13801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 Gerrit-Change-Number: 13801 Gerrit-PatchSet: 8 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: David Knupp Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Hello Bharath Vissapragada, Greg Rahn, Quanlong Huang, David Rorke, David Knupp, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13801 to look at the new patch set (#7). Change subject: IMPALA-5149: Provide query profile in JSON format .. IMPALA-5149: Provide query profile in JSON format Description: Today there is a text and Thrift version of the query profile, but it would be useful to have a JSON version for portability and machine consumption. The ultimate goal is to have a Download JSON format profile link along with the other two formats. Modification: 1.Add Json format download option in impalad profile page 2.Add ToJson() function for RuntimeProfile, Counters, EventSequence 3.Add JSON format into QueryStateRecord 4.Add tests for E2E test, unit test, hs2 test 5.Modify query profile page to different download option 6.Modify HS2 server to support get JSON format profile Tests: E2E tests: tests/webserver/test_web_pages.py - test_download_profile * merge text and json format download together HS2 tests: tests/hs2/test_hs2.py - test_get_profile * add json format test BE Unit tests: be/src/util/runtime-profile-test.cc * ToJson.RuntimeProfileToJsonTest * ToJson.EmptyTest * ToJson.EventSequenceToJsonTest * ToJson.TimeSeriesCounterToJsonTest Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 --- M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/CMakeLists.txt M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile-test.cc M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M common/thrift/ImpalaService.thrift M common/thrift/RuntimeProfile.thrift M tests/hs2/test_hs2.py M tests/webserver/test_web_pages.py M www/query_profile.tmpl 16 files changed, 611 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/7 -- To view, visit http://gerrit.cloudera.org:8080/13801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 Gerrit-Change-Number: 13801 Gerrit-PatchSet: 7 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: David Knupp Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/13801 ) Change subject: IMPALA-5149: Provide query profile in JSON format .. Patch Set 6: (1 comment) Thanks for your feedback! I also added some tests for this. http://gerrit.cloudera.org:8080/#/c/13801/2/be/src/service/impala-hs2-server.cc File be/src/service/impala-hs2-server.cc: http://gerrit.cloudera.org:8080/#/c/13801/2/be/src/service/impala-hs2-server.cc@979 PS2, Line 979: SQLSTATE_GENERAL_ERROR); : if (request.format == TRuntimeProfileFormat::THRIFT) { : return_val.__set_thrift_profile(thrift_profile); : } else if (request.format == TRuntimeProfileFormat::JSON) { : rapidjson::StringBuffer sb; : rapidjson::PrettyWriter writer(sb); : > should there be an equivalent here for JSON? Done. I am putting the output inside the profile, which is a string. Also, do we need to adopt this change to impala-beeswax-server? I looked at the code there but find that the function there only support text format download. -- To view, visit http://gerrit.cloudera.org:8080/13801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 Gerrit-Change-Number: 13801 Gerrit-PatchSet: 6 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: David Knupp Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Thu, 11 Jul 2019 00:05:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Hello Bharath Vissapragada, Greg Rahn, David Rorke, David Knupp, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13801 to look at the new patch set (#6). Change subject: IMPALA-5149: Provide query profile in JSON format .. IMPALA-5149: Provide query profile in JSON format Description: Today there is a text and Thrift version of the query profile, but it would be useful to have a JSON version for portability and machine consumption. The ultimate goal is to have a Download JSON format profile link along with the other two formats. Modification: 1.Add Json format download option in impalad profile page 2.Add ToJson() function for RuntimeProfile, Counters, EventSequence 3.Add JSON format into QueryStateRecord 4.Add tests for E2E test, unit test, hs2 test 5.Modify query profile page to different download option 6.Modify HS2 server to support get JSON format profile Tests: E2E tests: * tests/webserver/test_web_pages.py - test_download_json_profile HS2 tests: * tests/hs2/test_hs2.py - test_get_profile_json BE Unit tests: * be/src/util/runtime-profile-test.cc ToJson.RuntimeProfileToJsonTest ToJson.TimeSeriesCounterToJsonTest Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 --- M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/CMakeLists.txt M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile-test.cc M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M common/thrift/ImpalaService.thrift M common/thrift/RuntimeProfile.thrift M tests/hs2/test_hs2.py M tests/webserver/test_web_pages.py M www/query_profile.tmpl 16 files changed, 525 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/6 -- To view, visit http://gerrit.cloudera.org:8080/13801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 Gerrit-Change-Number: 13801 Gerrit-PatchSet: 6 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: David Knupp Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Jiawei Wang has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/13801 ) Change subject: IMPALA-5149: Provide query profile in JSON format .. IMPALA-5149: Provide query profile in JSON format Description: Today there is a text and Thrift version of the query profile, but it would be useful to have a JSON version for portability and machine consumption. The ultimate goal is to have a Download JSON format profile link along with the other two formats. Modification: 1.Add Json format download option in impala server 2.Add ToJson() function for RuntimeProfile, Counters, EventSequence 3.Add JSON format into QueryStateRecord 3.Add tests for E2E test 3.Modify query profile page to different download option Tests: E2E tests: * tests/webserver/test_web_pages.py - test_download_json_profile Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 --- M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M common/thrift/RuntimeProfile.thrift M tests/webserver/test_web_pages.py M www/query_profile.tmpl 12 files changed, 405 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/5 -- To view, visit http://gerrit.cloudera.org:8080/13801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 Gerrit-Change-Number: 13801 Gerrit-PatchSet: 5 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: David Knupp Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Jiawei Wang has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/13801 ) Change subject: IMPALA-5149: Provide query profile in JSON format .. IMPALA-5149: Provide query profile in JSON format Description: Today there is a text and Thrift version of the query profile, but it would be useful to have a JSON version for portability and machine consumption. The ultimate goal is to have a Download JSON format profile link along with the other two formats. Modification: 1.Add Json format download option in impala server 2.Add ToJson() function for RuntimeProfile, Counters, EventSequence 3.Add JSON format into QueryStateRecord 3.Add tests for E2E test 3.Modify query profile page to different download option Tests: E2E tests: * tests/webserver/test_web_pages.py - test_download_json_profile Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 --- M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M common/thrift/RuntimeProfile.thrift M tests/webserver/test_web_pages.py M www/query_profile.tmpl 12 files changed, 410 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/4 -- To view, visit http://gerrit.cloudera.org:8080/13801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 Gerrit-Change-Number: 13801 Gerrit-PatchSet: 4 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: David Knupp Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Jiawei Wang has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/13801 ) Change subject: IMPALA-5149: Provide query profile in JSON format .. IMPALA-5149: Provide query profile in JSON format Description: Today there is a text and Thrift version of the query profile, but it would be useful to have a JSON version for portability and machine consumption. The ultimate goal is to have a Download JSON format profile link along with the other two formats. Modification: 1.Add Json format download option in impala server 2.Add ToJson() function for RuntimeProfile, Counters, EventSequence 3.Add JSON format into QueryStateRecord 3.Add tests for E2E test 3.Modify query profile page to different download option Tests: E2E tests: * tests/webserver/test_web_pages.py - test_download_json_profile Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 --- M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M common/thrift/RuntimeProfile.thrift M tests/webserver/test_web_pages.py M www/query_profile.tmpl 12 files changed, 410 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/3 -- To view, visit http://gerrit.cloudera.org:8080/13801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 Gerrit-Change-Number: 13801 Gerrit-PatchSet: 3 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: David Knupp Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-5973: Provide query plan in JSON format
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/11974 ) Change subject: IMPALA-5973: Provide query plan in JSON format .. Patch Set 3: Hi, I am trying to pick this up. Is there anyone who is familiar with this ticket and its process knows where should I start? Thanks! -- To view, visit http://gerrit.cloudera.org:8080/11974 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0f07e2223be58b7323e1ea2fa44b62626cdf4944 Gerrit-Change-Number: 11974 Gerrit-PatchSet: 3 Gerrit-Owner: Pranay Singh (320) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Pranay Singh (320) Gerrit-Comment-Date: Mon, 08 Jul 2019 21:35:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Jiawei Wang has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/13801 ) Change subject: IMPALA-5149: Provide query profile in JSON format .. IMPALA-5149: Provide query profile in JSON format Today there is a text and Thrift version of the query profile, but it would be useful to have a JSON version for portability and machine consumption. The ultimate goal is to have a Download JSON format profile link along with the other two formats. Design Document: https://docs.google.com/document/d/ 15P_Lmjf1rlZUD4PZLXdUwTE8Lv_ME3NQ9Yf4MITzO2M/edit Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 --- M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M common/thrift/RuntimeProfile.thrift M www/query_profile.tmpl 11 files changed, 364 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/2 -- To view, visit http://gerrit.cloudera.org:8080/13801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 Gerrit-Change-Number: 13801 Gerrit-PatchSet: 2 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format
Jiawei Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13801 Change subject: IMPALA-5149: Provide query profile in JSON format .. IMPALA-5149: Provide query profile in JSON format Today there is a text and Thrift version of the query profile, but it would be useful to have a JSON version for portability and machine consumption. The ultimate goal is to have a Download JSON format profile link along with the other two formats. Design Document: https://docs.google.com/document/d/ 15P_Lmjf1rlZUD4PZLXdUwTE8Lv_ME3NQ9Yf4MITzO2M/edit Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 --- M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M common/thrift/RuntimeProfile.thrift M www/query_profile.tmpl 11 files changed, 363 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/1 -- To view, visit http://gerrit.cloudera.org:8080/13801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1 Gerrit-Change-Number: 13801 Gerrit-PatchSet: 1 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8665:Include extra info in error message when date cast fails
Jiawei Wang has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/13680 ) Change subject: IMPALA-8665:Include extra info in error message when date cast fails .. IMPALA-8665:Include extra info in error message when date cast fails This change extends the error message Impala yields when casting STRING to DATE (explicitly or implicitly) fails. The new error message includes the violating string value. Testing: changes -> date-partitioning.test & date.test query_test/test_date_queries.py test passed Example: select cast('20' as date); ERROR: UDF ERROR: String to Date parse failed. Invalid string val: "20" Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 --- M be/src/exprs/cast-functions-ir.cc M testdata/workloads/functional-query/queries/QueryTest/date-partitioning.test M testdata/workloads/functional-query/queries/QueryTest/date.test 3 files changed, 12 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/13680/8 -- To view, visit http://gerrit.cloudera.org:8080/13680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 Gerrit-Change-Number: 13680 Gerrit-PatchSet: 8 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang
[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails
Jiawei Wang has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/13680 ) Change subject: IMPALA-8665 Include extra info in error message when date cast fails .. IMPALA-8665 Include extra info in error message when date cast fails This change extends the error message Impala yields when casting STRING to DATE (explicitly or implicitly) fails. The new error message includes the violating string value. Testing: changes -> date-partitioning.test & date.test query_test/test_date_queries.py test passed Example: select cast('2000' as date); ERROR: UDF ERROR: String to Date parse failed. Invalid string val: "2000" Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 --- M be/src/exprs/cast-functions-ir.cc M testdata/workloads/functional-query/queries/QueryTest/date-partitioning.test M testdata/workloads/functional-query/queries/QueryTest/date.test 3 files changed, 12 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/13680/7 -- To view, visit http://gerrit.cloudera.org:8080/13680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 Gerrit-Change-Number: 13680 Gerrit-PatchSet: 7 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang
[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails
Jiawei Wang has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/13680 ) Change subject: IMPALA-8665 Include extra info in error message when date cast fails .. IMPALA-8665 Include extra info in error message when date cast fails This change extends the error message Impala yields when casting STRING to DATE (explicitly or implicitly) fails. The new error message includes the violating string value. Testing: changes -> date-partitioning.test & date.test query_test/test_date_queries.py test passed Example: select ('2000' as date); ERROR: UDF ERROR: String to Date parse failed. Invalid string val: "2000" Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 --- M be/src/exprs/cast-functions-ir.cc M testdata/workloads/functional-query/queries/QueryTest/date-partitioning.test M testdata/workloads/functional-query/queries/QueryTest/date.test 3 files changed, 12 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/13680/6 -- To view, visit http://gerrit.cloudera.org:8080/13680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 Gerrit-Change-Number: 13680 Gerrit-PatchSet: 6 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang
[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails
Jiawei Wang has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/13680 ) Change subject: IMPALA-8665 Include extra info in error message when date cast fails .. IMPALA-8665 Include extra info in error message when date cast fails This change extends the error message Impala yields when casting STRING to DATE (explicitly or implicitly) fails. The new error message includes the violating string value. Testing: changes -> date-partitioning.test & date.test query_test/test_date_queries.py test passed Example: select ('2000' as Date); ERROR: UDF ERROR: String to Date parse failed. Invalid string val: "2000" Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 --- M be/src/exprs/cast-functions-ir.cc M testdata/workloads/functional-query/queries/QueryTest/date-partitioning.test M testdata/workloads/functional-query/queries/QueryTest/date.test 3 files changed, 12 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/13680/5 -- To view, visit http://gerrit.cloudera.org:8080/13680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 Gerrit-Change-Number: 13680 Gerrit-PatchSet: 5 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang
[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails
Jiawei Wang has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/13680 ) Change subject: IMPALA-8665 Include extra info in error message when date cast fails .. IMPALA-8665 Include extra info in error message when date cast fails When users cast string to date. It's hard for users to debug right now if users running millions of rows casting while casting failed without extra information. Solution: Add the failed data value into error message. Testing: changes -> date-partitioning.test & date.test query_test/test_date_queries.py test passed Examples: select ("2000" as Date); ERROR: UDF ERROR: String to Date parse failed. Invalid string val: "2000" Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 --- M be/src/exprs/cast-functions-ir.cc M testdata/workloads/functional-query/queries/QueryTest/date-partitioning.test M testdata/workloads/functional-query/queries/QueryTest/date.test 3 files changed, 12 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/13680/4 -- To view, visit http://gerrit.cloudera.org:8080/13680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 Gerrit-Change-Number: 13680 Gerrit-PatchSet: 4 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang
[Impala-ASF-CR] IMPALA-8652 Illegal delimiter error in shell has unknown error
Jiawei Wang has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/13690 ) Change subject: IMPALA-8652 Illegal delimiter error in shell has unknown error .. IMPALA-8652 Illegal delimiter error in shell has unknown error Problem: When assign --output_delimiter to invalid value, the validation of the argument is done only after the query is running, ValueError is raised in DelimitedOutputFormatter and caught in _exec_stmt in shell Solution: Add --output_delimiter option check before impala-shell initialization Remove delimiter length check in DelimitedOutputFormatter Testing: tests/shell/test_shell_commandline.py passed Example: $ impala-shell.sh -B --output_delimiter '||' -q 'select 1,1,1' Illegal delimiter ||, the delimiter must be a 1-character string. Change-Id: I7ee2fccd305b104b3aff44c57659b6f14f2f4a05 --- M shell/impala_shell.py M shell/shell_output.py 2 files changed, 9 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/13690/3 -- To view, visit http://gerrit.cloudera.org:8080/13690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ee2fccd305b104b3aff44c57659b6f14f2f4a05 Gerrit-Change-Number: 13690 Gerrit-PatchSet: 3 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8652 Illegal delimiter error in shell has unknown error
Jiawei Wang has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/13690 ) Change subject: IMPALA-8652 Illegal delimiter error in shell has unknown error .. IMPALA-8652 Illegal delimiter error in shell has unknown error Problem: When assign --output_delimiter to invalid value, the validation of the argument is done only after the query is running, ValueError is raised in DelimitedOutputFormatter and caught in _exec_stmt in shell Solution: Add --output_delimiter option check before impala-shell initialization Remove delimiter length check in DelimitedOutputFormatter Testing: tests/shell/test_shell_commandline.py passed Change-Id: I7ee2fccd305b104b3aff44c57659b6f14f2f4a05 --- M shell/impala_shell.py M shell/shell_output.py 2 files changed, 7 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/13690/2 -- To view, visit http://gerrit.cloudera.org:8080/13690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ee2fccd305b104b3aff44c57659b6f14f2f4a05 Gerrit-Change-Number: 13690 Gerrit-PatchSet: 2 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8652 Illegal delimiter error in shell has unknown error
Jiawei Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13690 Change subject: IMPALA-8652 Illegal delimiter error in shell has unknown error .. IMPALA-8652 Illegal delimiter error in shell has unknown error Problem: When assign --output_delimiter to invalid value, the validation of the argument is done only after the query is running, ValueError is raised in DelimitedOutputFormatter and caught in _exec_stmt in shell Solution: Add --output_delimiter option check before impala-shell initialization Remove delimiter length check in DelimitedOutputFormatter Testing: tests/shell/test_shell_commandline.py passed Change-Id: I7ee2fccd305b104b3aff44c57659b6f14f2f4a05 --- M shell/impala_shell.py M shell/shell_output.py 2 files changed, 8 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/13690/1 -- To view, visit http://gerrit.cloudera.org:8080/13690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7ee2fccd305b104b3aff44c57659b6f14f2f4a05 Gerrit-Change-Number: 13690 Gerrit-PatchSet: 1 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails
Jiawei Wang has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/13680 ) Change subject: IMPALA-8665 Include extra info in error message when date cast fails .. IMPALA-8665 Include extra info in error message when date cast fails When users cast string to date. It's hard for users to debug right now if users running millions of rows casting while casting failed without extra information. Solution: Add the failed data value into error message. Testing: changes -> date-partitioning.test & date.test query_test/test_date_queries.py test passed Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 --- M be/src/exprs/cast-functions-ir.cc M testdata/workloads/functional-query/queries/QueryTest/date-partitioning.test M testdata/workloads/functional-query/queries/QueryTest/date.test 3 files changed, 12 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/13680/3 -- To view, visit http://gerrit.cloudera.org:8080/13680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 Gerrit-Change-Number: 13680 Gerrit-PatchSet: 3 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang
[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/13680 ) Change subject: IMPALA-8665 Include extra info in error message when date cast fails .. Patch Set 2: (4 comments) http://gerrit.cloudera.org:8080/#/c/13680/2/be/src/exprs/cast-functions-ir.cc File be/src/exprs/cast-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/13680/2/be/src/exprs/cast-functions-ir.cc@310 PS2, Line 310: DateVal CastFunctions::CastToDateVal(FunctionContext* ctx, const StringVal& val) { > Please cover this change with tests. If you grep for "String to Date parse I have already tested all the tests you mentioned. They all passed. The way tests handle this is to find a string match in the output. So without changing the old tests, this still passes. Do you want me to change the tests cases? (i.e. add the additional information for string match) http://gerrit.cloudera.org:8080/#/c/13680/2/be/src/exprs/cast-functions-ir.cc@314 PS2, Line 314: (char *) > The preferred way of converting here is using reinterpret_cast as seen abov Done http://gerrit.cloudera.org:8080/#/c/13680/2/be/src/exprs/cast-functions-ir.cc@316 PS2, Line 316: invalidVal > Have you tried to simply pass the char* here? Is that a null terminated str It's not working because StringVal.ptr is not a null-terminated string. http://gerrit.cloudera.org:8080/#/c/13680/2/be/src/exprs/cast-functions-ir.cc@316 PS2, Line 316: c_str() > Do you need to convert the return value of Substitute() to char*? The reason why I used Substitute and .c_str() is that I am following the other SetError() examples. The following is one of the example that use SetError() function to pass a variable inside a string. To use SetError, char* is required. agg_fn_ctx_->SetError(Substitute("UDA Serialize() and Finalize() must return same pointer as input for $0 intermediate", dst_slot_desc.type().DebugString()).c_str()); -- To view, visit http://gerrit.cloudera.org:8080/13680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 Gerrit-Change-Number: 13680 Gerrit-PatchSet: 2 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Comment-Date: Thu, 20 Jun 2019 18:14:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails
Jiawei Wang has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/13680 ) Change subject: IMPALA-8665 Include extra info in error message when date cast fails .. IMPALA-8665 Include extra info in error message when date cast fails It's hard for users to debug right now if users running millions of rows casting to date while cast failed without extra information. Solution: Adding a fail data value is at least a minimum requirement. Testing: query_test/test_date_queries.py test passed Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 --- M be/src/exprs/cast-functions-ir.cc 1 file changed, 4 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/13680/2 -- To view, visit http://gerrit.cloudera.org:8080/13680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 Gerrit-Change-Number: 13680 Gerrit-PatchSet: 2 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails
Jiawei Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13680 Change subject: IMPALA-8665 Include extra info in error message when date cast fails .. IMPALA-8665 Include extra info in error message when date cast fails It's hard for users to debug right now if users running millions of rows casting to date while cast failed without extra information. Solution: Adding a fail data value is at least a minimum requirement. Testing: query_test/test_date_queries.py test passed Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 --- M be/src/exprs/cast-functions-ir.cc 1 file changed, 3 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/13680/1 -- To view, visit http://gerrit.cloudera.org:8080/13680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 Gerrit-Change-Number: 13680 Gerrit-PatchSet: 1 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab