[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..

IMPALA-12024: Add catalog profile for createTable

DDL/DMLs might spend a long time in the execution in catalogd.
Currently, the profiles just have a counter of CatalogOpExecTimer. We
need more items to show the details of how they are executed in
catalogd.

As the first step, this patch adds the profile timeline for how the
createTable DDL is executed in Catalog Server. Also added more events in
the existing "Query Timeline" for when the CatalogDdlRequest finished
and when the catalog updates are applied.

To implement this, a 'profile' field is added in TDdlExecResponse to
carry the execution counters and timeline in catalogd. Currently, we
just uses the timeline of it. We can add more counters in the future.
Several methods add a timeline parameter to mark the progress in them.
Timeline events are added after each RPC finished.

Here is an example when HMS is hanging for 26s in a CTAS. I used gdb to
attach to HMS as the JIRA description mentioned. In the timeline, we can
see the time is spent in the first HMS RPC that fetching the current HMS
event id:
Catalog Server Operation: 26s560ms
   - Got metastoreDdlLock: 163.977us (163.977us)
   - Got Metastore client: 166.339us (2.362us)
   - Got current Metastore event id 8355270: 26s494ms (26s494ms)
   - Created table in Metastore: 26s558ms (63.507ms)
   - Fetched event batch from Metastore: 26s559ms (1.155ms)
   - Created table in catalog cache: 26s560ms (1.164ms)
   - DDL finished: 26s560ms (65.538us)
Query Compilation: 164.257ms
   - Metadata of all 1 tables cached: 10.535ms (10.535ms)
   - Analysis finished: 118.324ms (107.788ms)
   - Authorization finished (noop): 118.489ms (164.626us)
   - Value transfer graph computed: 118.830ms (341.792us)
   - Single node plan created: 150.150ms (31.319ms)
   - Runtime filters computed: 150.254ms (103.529us)
   - Distributed plan created: 151.832ms (1.578ms)
   - Planning finished: 164.257ms (12.425ms)
Query Timeline: 27s304ms
   - Query submitted: 129.658us (129.658us)
   - Planning finished: 170.224ms (170.095ms)
   - CatalogDdlRequest finished: 26s731ms (26s561ms)
   - Applied catalog updates from DDL: 26s740ms (8.752ms)
   - Submit for admission: 26s740ms (22.233us)
   - Completed admission: 26s740ms (286.295us)
   - Ready to start on 3 backends: 26s740ms (155.916us)
   - All 3 execution backends (3 fragment instances) started: 26s751ms 
(10.864ms)
   - Last row fetched: 26s920ms (168.226ms)
   - Released admission control resources: 26s920ms (27.635us)
   - DML data written: 26s920ms (126.369us)
   - Applied catalog updates from DDL: 26s985ms (65.354ms)
   - DML Metastore update finished: 26s985ms (30.343us)
   - Rows available: 26s985ms (27.050us)
   - Unregister query: 27s304ms (318.661ms)

An example of creating a Kudu table:
Catalog Server Operation: 1s730ms
   - Got Metastore client: 113.403us (113.403us)
   - Got current Metastore event id 8355276: 974.500us (861.097us)
   - Got Kudu client: 212.123ms (211.148ms)
   - Got kuduDdlLock: 212.128ms (4.680us)
   - Checked table existence in Kudu: 850.786ms (638.658ms)
   - Created table in Kudu: 1s623ms (772.379ms)
   - Got metastoreDdlLock: 1s623ms (397.305us)
   - Got Metastore client: 1s623ms (7.813us)
   - Checked table existence in Metastore: 1s648ms (25.154ms)
   - Created table in Metastore: 1s725ms (76.348ms)
   - Fetched event batch from Metastore: 1s728ms (3.004ms)
   - Created table in catalog cache: 1s730ms (2.024ms)
   - DDL finished: 1s730ms (84.448us)

An example of creating an Iceberg table:
Catalog Server Operation: 1s573ms
   - Got Metastore client: 141.799us (141.799us)
   - Checked table existence in Metastore: 2.957ms (2.815ms)
   - Got current Metastore event id 8355277: 3.669ms (712.475us)
   - Created table using Iceberg Catalog HIVE_CATALOG: 1s379ms (1s375ms)
   - Fetched event batch from Metastore: 1s381ms (2.188ms)
   - Created table in catalog cache: 1s382ms (1.556ms)
   - Set Iceberg table owner in Metastore: 1s573ms (190.262ms)
   - DDL finished: 1s573ms (59.176us)

Tests:
 - Add e2e tests to verify the DDL timeline events exist in profile

Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Reviewed-on: http://gerrit.cloudera.org:8080/20368
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/service/client-request-state.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/service/Catalog

[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 30 Aug 2023 04:28:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9649/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 30 Aug 2023 00:02:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 30 Aug 2023 00:02:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-29 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 5: Code-Review+2

(1 comment)

This looks good to me

http://gerrit.cloudera.org:8080/#/c/20368/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20368/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3738
PS5, Line 3738: Preconditions.checkState(tableName.isFullyQualified());
> TableName.fromThrift() always return a non-null result:
That makes sense



--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 29 Aug 2023 23:44:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-29 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 5:

(1 comment)

> Patch Set 5:
>
> (1 comment)
>
> This is very useful! Do we have plans to extend this to drop table, create 
> database, drop database, etc?

Yeah, this is the first step. I will extend this to all DDLs after this patch 
is merged.

http://gerrit.cloudera.org:8080/#/c/20368/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20368/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3738
PS5, Line 3738: Preconditions.checkState(tableName.isFullyQualified());
> Nit: Why did we remove the tableName != null part?
TableName.fromThrift() always return a non-null result:
https://github.com/apache/impala/blob/0c8fc997ef7df09b675180a7baa1482852d60b11/fe/src/main/java/org/apache/impala/analysis/TableName.java#L150

  public static TableName fromThrift(TTableName tableName) {
return new TableName(tableName.getDb_name(), tableName.getTable_name());
  }



--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 29 Aug 2023 23:42:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-29 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 5:

(1 comment)

This is very useful! Do we have plans to extend this to drop table, create 
database, drop database, etc?

http://gerrit.cloudera.org:8080/#/c/20368/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20368/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3738
PS5, Line 3738: Preconditions.checkState(tableName.isFullyQualified());
Nit: Why did we remove the tableName != null part?



--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 29 Aug 2023 17:56:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/13820/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 23 Aug 2023 08:29:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-23 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 5:

Added tests for creating kudu/iceberg tables


--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 23 Aug 2023 08:08:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-23 Thread Quanlong Huang (Code Review)
Hello Riza Suminto, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20368

to look at the new patch set (#5).

Change subject: IMPALA-12024: Add catalog profile for createTable
..

IMPALA-12024: Add catalog profile for createTable

DDL/DMLs might spend a long time in the execution in catalogd.
Currently, the profiles just have a counter of CatalogOpExecTimer. We
need more items to show the details of how they are executed in
catalogd.

As the first step, this patch adds the profile timeline for how the
createTable DDL is executed in Catalog Server. Also added more events in
the existing "Query Timeline" for when the CatalogDdlRequest finished
and when the catalog updates are applied.

To implement this, a 'profile' field is added in TDdlExecResponse to
carry the execution counters and timeline in catalogd. Currently, we
just uses the timeline of it. We can add more counters in the future.
Several methods add a timeline parameter to mark the progress in them.
Timeline events are added after each RPC finished.

Here is an example when HMS is hanging for 26s in a CTAS. I used gdb to
attach to HMS as the JIRA description mentioned. In the timeline, we can
see the time is spent in the first HMS RPC that fetching the current HMS
event id:
Catalog Server Operation: 26s560ms
   - Got metastoreDdlLock: 163.977us (163.977us)
   - Got Metastore client: 166.339us (2.362us)
   - Got current Metastore event id 8355270: 26s494ms (26s494ms)
   - Created table in Metastore: 26s558ms (63.507ms)
   - Fetched event batch from Metastore: 26s559ms (1.155ms)
   - Created table in catalog cache: 26s560ms (1.164ms)
   - DDL finished: 26s560ms (65.538us)
Query Compilation: 164.257ms
   - Metadata of all 1 tables cached: 10.535ms (10.535ms)
   - Analysis finished: 118.324ms (107.788ms)
   - Authorization finished (noop): 118.489ms (164.626us)
   - Value transfer graph computed: 118.830ms (341.792us)
   - Single node plan created: 150.150ms (31.319ms)
   - Runtime filters computed: 150.254ms (103.529us)
   - Distributed plan created: 151.832ms (1.578ms)
   - Planning finished: 164.257ms (12.425ms)
Query Timeline: 27s304ms
   - Query submitted: 129.658us (129.658us)
   - Planning finished: 170.224ms (170.095ms)
   - CatalogDdlRequest finished: 26s731ms (26s561ms)
   - Applied catalog updates from DDL: 26s740ms (8.752ms)
   - Submit for admission: 26s740ms (22.233us)
   - Completed admission: 26s740ms (286.295us)
   - Ready to start on 3 backends: 26s740ms (155.916us)
   - All 3 execution backends (3 fragment instances) started: 26s751ms 
(10.864ms)
   - Last row fetched: 26s920ms (168.226ms)
   - Released admission control resources: 26s920ms (27.635us)
   - DML data written: 26s920ms (126.369us)
   - Applied catalog updates from DDL: 26s985ms (65.354ms)
   - DML Metastore update finished: 26s985ms (30.343us)
   - Rows available: 26s985ms (27.050us)
   - Unregister query: 27s304ms (318.661ms)

An example of creating a Kudu table:
Catalog Server Operation: 1s730ms
   - Got Metastore client: 113.403us (113.403us)
   - Got current Metastore event id 8355276: 974.500us (861.097us)
   - Got Kudu client: 212.123ms (211.148ms)
   - Got kuduDdlLock: 212.128ms (4.680us)
   - Checked table existence in Kudu: 850.786ms (638.658ms)
   - Created table in Kudu: 1s623ms (772.379ms)
   - Got metastoreDdlLock: 1s623ms (397.305us)
   - Got Metastore client: 1s623ms (7.813us)
   - Checked table existence in Metastore: 1s648ms (25.154ms)
   - Created table in Metastore: 1s725ms (76.348ms)
   - Fetched event batch from Metastore: 1s728ms (3.004ms)
   - Created table in catalog cache: 1s730ms (2.024ms)
   - DDL finished: 1s730ms (84.448us)

An example of creating an Iceberg table:
Catalog Server Operation: 1s573ms
   - Got Metastore client: 141.799us (141.799us)
   - Checked table existence in Metastore: 2.957ms (2.815ms)
   - Got current Metastore event id 8355277: 3.669ms (712.475us)
   - Created table using Iceberg Catalog HIVE_CATALOG: 1s379ms (1s375ms)
   - Fetched event batch from Metastore: 1s381ms (2.188ms)
   - Created table in catalog cache: 1s382ms (1.556ms)
   - Set Iceberg table owner in Metastore: 1s573ms (190.262ms)
   - DDL finished: 1s573ms (59.176us)

Tests:
 - Add e2e tests to verify the DDL timeline events exist in profile

Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
---
M be/src/service/client-request-state.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apach

[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-22 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 4:

(1 comment)

Thanks for the review!

http://gerrit.cloudera.org:8080/#/c/20368/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20368/4//COMMIT_MSG@33
PS4, Line 33: Created table in Metastore
> nit: curious, why no existence check in Metastore before this?
We can add that step in a separated patch. Currently we rely on the createTable 
HMS RPC to fail. The error message is not quite user-friendly. I disable 
event-processor and create a table in Hive, then create it again in Impala:

Query: create table hive_tbl(id int)
ERROR: ImpalaRuntimeException: Error making 'createTable' RPC to Hive Metastore:
CAUSED BY: AlreadyExistsException: Table hive.default.hive_tbl already exists

When creating the same table name in kudu format:

 Query: create table hive_tbl (id int, name string, primary key(id)) partition 
by hash(id) partitions 3 stored as kudu
+---+
| summary   |
+---+
| Table already exists. |
+---+
Fetched 1 row(s) in 1.63s

We can also improve the error message to "Table already exists in Metastore". 
Filed a newbie JIRA: IMPALA-12396



--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 23 Aug 2023 02:42:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-22 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 4: Code-Review+1

(1 comment)

Thank you for addressing my comments!

http://gerrit.cloudera.org:8080/#/c/20368/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20368/4//COMMIT_MSG@33
PS4, Line 33: Created table in Metastore
nit: curious, why no existence check in Metastore before this?



--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 22 Aug 2023 20:19:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/13797/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 22 Aug 2023 03:19:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-21 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 3:

(12 comments)

> Patch Set 3:
>
> (12 comments)
>
> I think it is important to make the event label self explanatory and as clear 
> as possible. Otherwise, an explanation should be documented somewhere for 
> each label.
>
> Some general comments:
> - Replace HMS with Metastore for consistency.
> - Consider turning some duplicated event label into string constant.
> - Clarify the source of event. Is it Metastore vs CatalogD vs Filesystem vs 
> other service (like Kudu master).

Thanks for the suggestions!

Refactored the patch to use string constants. I was intended to write the 
labels differently by strings to get rid of multiple hits when searching the 
code. But it makes sense to normalize them as constants so we don't need to 
document them.

Also clarified the sources. BTW, use "Kudu" instead of "Kudu master" since we 
treate Kudu as a system and can't distinguish what are created in Kudu master 
and what in Kudu tablet servers.

I plan to add more tests but need to figure out a clean way of verifying the 
EventSequence. Found an existing bug of the JSON profile by the way: 
IMPALA-12391

http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3523
PS3, Line 3523: catalogTimeline.markEvent("Created Metastore 
table");
  :   } else {
  : catalogTimeline.markEvent("Found table exists");
> Maybe reorganize this to 2 things: existence check and new table creation:
Done


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3552
PS3, Line 3552: Created catalog table
> nit: "Cached table in catalogd"?
Done


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3610
PS3, Line 3610: Created Metastore table
> nit: "Created table in Metastore"?
Done


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3627
PS3, Line 3627: HMS
> nit: "Fetched table from Metastore"?
Done


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3656
PS3, Line 3656: Created catalog table
> nit: "Cached table in catalog"?
Done


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3702
PS3, Line 3702: HMS
> nit: Metastore
Yeah, this one is different than other fetches since it just use the createTime 
and not caching other things.


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3786
PS3, Line 3786: Created iceberg table
> nit: created in where? HMS, filesystem, or CatalogD cache?
It depends on what iceberg catalog is used. Hard to say it's just HMS or 
filesystem..
This label generally means the invocation on iceberg createTable API finished.


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3827
PS3, Line 3827: Created Metastore table
> nit: dulicated String. Change to constant?
Done


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3834
PS3, Line 3834: Found table exists
> nit: dulicated String. Change to constant?
Done


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java@102
PS3, Line 102: Checked table exists
> nit: Checked table existence
Done


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java@342
PS3, Line 342: Checked table exists
> nit: Checked table existence
Done


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java@369
PS3, Line 369: cols
> nit: columns
Done



--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 22 Aug 2023 02:54:37 +
Gerrit-Ha

[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-21 Thread Quanlong Huang (Code Review)
Hello Riza Suminto, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20368

to look at the new patch set (#4).

Change subject: IMPALA-12024: Add catalog profile for createTable
..

IMPALA-12024: Add catalog profile for createTable

DDL/DMLs might spend a long time in the execution in catalogd.
Currently, the profiles just have a counter of CatalogOpExecTimer. We
need more items to show the details of how they are executed in
catalogd.

As the first step, this patch adds the profile timeline for how the
createTable DDL is executed in Catalog Server. Also added more events in
the existing "Query Timeline" for when the CatalogDdlRequest finished
and when the catalog updates are applied.

To implement this, a 'profile' field is added in TDdlExecResponse to
carry the execution counters and timeline in catalogd. Currently, we
just uses the timeline of it. We can add more counters in the future.
Several methods add a timeline parameter to mark the progress in them.
Timeline events are added after each RPC finished.

Here is an example when HMS is hanging for 26s in a CTAS. I used gdb to
attach to HMS as the JIRA description mentioned. In the timeline, we can
see the time is spent in the first HMS RPC that fetching the current HMS
event id:
Catalog Server Operation: 26s560ms
   - Got metastoreDdlLock: 163.977us (163.977us)
   - Got Metastore client: 166.339us (2.362us)
   - Got current Metastore event id 8355270: 26s494ms (26s494ms)
   - Created table in Metastore: 26s558ms (63.507ms)
   - Fetched event batch from Metastore: 26s559ms (1.155ms)
   - Created table in catalog cache: 26s560ms (1.164ms)
   - DDL finished: 26s560ms (65.538us)
Query Compilation: 164.257ms
   - Metadata of all 1 tables cached: 10.535ms (10.535ms)
   - Analysis finished: 118.324ms (107.788ms)
   - Authorization finished (noop): 118.489ms (164.626us)
   - Value transfer graph computed: 118.830ms (341.792us)
   - Single node plan created: 150.150ms (31.319ms)
   - Runtime filters computed: 150.254ms (103.529us)
   - Distributed plan created: 151.832ms (1.578ms)
   - Planning finished: 164.257ms (12.425ms)
Query Timeline: 27s304ms
   - Query submitted: 129.658us (129.658us)
   - Planning finished: 170.224ms (170.095ms)
   - CatalogDdlRequest finished: 26s731ms (26s561ms)
   - Applied catalog updates from DDL: 26s740ms (8.752ms)
   - Submit for admission: 26s740ms (22.233us)
   - Completed admission: 26s740ms (286.295us)
   - Ready to start on 3 backends: 26s740ms (155.916us)
   - All 3 execution backends (3 fragment instances) started: 26s751ms 
(10.864ms)
   - Last row fetched: 26s920ms (168.226ms)
   - Released admission control resources: 26s920ms (27.635us)
   - DML data written: 26s920ms (126.369us)
   - Applied catalog updates from DDL: 26s985ms (65.354ms)
   - DML Metastore update finished: 26s985ms (30.343us)
   - Rows available: 26s985ms (27.050us)
   - Unregister query: 27s304ms (318.661ms)

An example of creating a Kudu table:
Catalog Server Operation: 1s730ms
   - Got Metastore client: 113.403us (113.403us)
   - Got current Metastore event id 8355276: 974.500us (861.097us)
   - Got Kudu client: 212.123ms (211.148ms)
   - Got kuduDdlLock: 212.128ms (4.680us)
   - Checked table existence in Kudu: 850.786ms (638.658ms)
   - Created table in Kudu: 1s623ms (772.379ms)
   - Got metastoreDdlLock: 1s623ms (397.305us)
   - Got Metastore client: 1s623ms (7.813us)
   - Checked table existence in Metastore: 1s648ms (25.154ms)
   - Created table in Metastore: 1s725ms (76.348ms)
   - Fetched event batch from Metastore: 1s728ms (3.004ms)
   - Created table in catalog cache: 1s730ms (2.024ms)
   - DDL finished: 1s730ms (84.448us)

An example of creating an Iceberg table:
Catalog Server Operation: 1s573ms
   - Got Metastore client: 141.799us (141.799us)
   - Checked table existence in Metastore: 2.957ms (2.815ms)
   - Got current Metastore event id 8355277: 3.669ms (712.475us)
   - Created table using Iceberg Catalog HIVE_CATALOG: 1s379ms (1s375ms)
   - Fetched event batch from Metastore: 1s381ms (2.188ms)
   - Created table in catalog cache: 1s382ms (1.556ms)
   - Set Iceberg table owner in Metastore: 1s573ms (190.262ms)
   - DDL finished: 1s573ms (59.176us)

Tests:
 - Add e2e tests to verify the DDL timeline events exist in profile

Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
---
M be/src/service/client-request-state.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/servic

[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-21 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 3:

(12 comments)

I think it is important to make the event label self explanatory and as clear 
as possible. Otherwise, an explanation should be documented somewhere for each 
label.

Some general comments:
- Replace HMS with Metastore for consistency.
- Consider turning some duplicated event label into string constant.
- Clarify the source of event. Is it Metastore vs CatalogD vs Filesystem vs 
other service (like Kudu master).

http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3523
PS3, Line 3523: catalogTimeline.markEvent("Created Metastore 
table");
  :   } else {
  : catalogTimeline.markEvent("Found table exists");
Maybe reorganize this to 2 things: existence check and new table creation:

- Checked table existence in Metastore: xx ms
- Created table in Metastore: xx ms

The first one is always printed. The second one only show up in new table 
creation.


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3552
PS3, Line 3552: Created catalog table
nit: "Cached table in catalogd"?


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3610
PS3, Line 3610: Created Metastore table
nit: "Created table in Metastore"?
This is referred twice in chis class. Should this be a constant instead?


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3627
PS3, Line 3627: HMS
nit: "Fetched table from Metastore"?


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3656
PS3, Line 3656: Created catalog table
nit: "Cached table in catalog"?
This is referred twice in chis class. Should this be a constant instead?


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3702
PS3, Line 3702: HMS
nit: Metastore
Is this label intentionally made different from the other "HMS table fetched" 
at line 3627?


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3786
PS3, Line 3786: Created iceberg table
nit: created in where? HMS, filesystem, or CatalogD cache?
Why iceberg has its own specific label?


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3827
PS3, Line 3827: Created Metastore table
nit: dulicated String. Change to constant?


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3834
PS3, Line 3834: Found table exists
nit: dulicated String. Change to constant?
Found in CatalogD or Metastore?


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java@102
PS3, Line 102: Checked table exists
nit: Checked table existence
Exist in where? Kudu master, HMS, or CatalogD?


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java@342
PS3, Line 342: Checked table exists
nit: Checked table existence


http://gerrit.cloudera.org:8080/#/c/20368/3/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java@369
PS3, Line 369: cols
nit: columns



--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Mon, 21 Aug 2023 17:18:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/13789/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Mon, 21 Aug 2023 14:11:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-21 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/20368/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20368/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@455
PS2, Line 455:   createTable(ddlRequest.getCreate_table_params(), 
response, catalogTimeline,
> line too long (94 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/20368/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2072
PS2, Line 2072:   private List 
getNextMetastoreEventsIfEnabled(
> line too long (96 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/20368/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4047
PS2, Line 4047:   createKuduTable(tbl, createTableParams, 
wantMinimalResult, response,
> line too long (92 > 90)
Done



--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Mon, 21 Aug 2023 13:43:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-21 Thread Quanlong Huang (Code Review)
Hello Riza Suminto, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20368

to look at the new patch set (#3).

Change subject: IMPALA-12024: Add catalog profile for createTable
..

IMPALA-12024: Add catalog profile for createTable

DDL/DMLs might spend a long time in the execution in catalogd.
Currently, the profiles just have a counter of CatalogOpExecTimer. We
need more items to show the details of how they are executed in
catalogd.

As the first step, this patch adds the profile timeline for how the
createTable DDL is executed in Catalog Server. Also added more events in
the existing "Query Timeline" for when the CatalogDdlRequest finished
and when the catalog updates are applied.

To implement this, a 'profile' field is added in TDdlExecResponse to
carry the execution counters and timeline in catalogd. Currently, we
just uses the timeline of it. We can add more counters in the future.
Several methods add a timeline parameter to mark the progress in them.
Timeline events are added after each RPC finished.

Here is an example when HMS is hanging for 26s in a CTAS. I used gdb to
attach to HMS as the JIRA description mentioned. In the timeline, we can
see the time is spent in the first HMS RPC that fetching the current HMS
event id:
Catalog Server Operation: 26s560ms
   - Got metastoreDdlLock: 163.977us (163.977us)
   - Got Metastore client: 166.339us (2.362us)
   - Got current Metastore event id: 26s494ms (26s494ms)
   - Created Metastore table: 26s558ms (63.507ms)
   - Fetched Metastore event batch: 26s559ms (1.155ms)
   - Created catalog table: 26s560ms (1.164ms)
   - DDL finished: 26s560ms (65.538us)
Query Compilation: 164.257ms
   - Metadata of all 1 tables cached: 10.535ms (10.535ms)
   - Analysis finished: 118.324ms (107.788ms)
   - Authorization finished (noop): 118.489ms (164.626us)
   - Value transfer graph computed: 118.830ms (341.792us)
   - Single node plan created: 150.150ms (31.319ms)
   - Runtime filters computed: 150.254ms (103.529us)
   - Distributed plan created: 151.832ms (1.578ms)
   - Planning finished: 164.257ms (12.425ms)
Query Timeline: 27s304ms
   - Query submitted: 129.658us (129.658us)
   - Planning finished: 170.224ms (170.095ms)
   - CatalogDdlRequest finished: 26s731ms (26s561ms)
   - Applied catalog updates from DDL: 26s740ms (8.752ms)
   - Submit for admission: 26s740ms (22.233us)
   - Completed admission: 26s740ms (286.295us)
   - Ready to start on 3 backends: 26s740ms (155.916us)
   - All 3 execution backends (3 fragment instances) started: 26s751ms 
(10.864ms)
   - Last row fetched: 26s920ms (168.226ms)
   - Released admission control resources: 26s920ms (27.635us)
   - DML data written: 26s920ms (126.369us)
   - Applied catalog updates from DDL: 26s985ms (65.354ms)
   - DML Metastore update finished: 26s985ms (30.343us)
   - Rows available: 26s985ms (27.050us)
   - Unregister query: 27s304ms (318.661ms)

Tests:
 - Add e2e tests to verify the DDL timeline events exist in profile

Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
---
M be/src/service/client-request-state.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M tests/query_test/test_observability.py
9 files changed, 193 insertions(+), 73 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/20368/3
--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/13784/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sun, 20 Aug 2023 14:12:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-20 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 1:

(3 comments)

> Patch Set 1:
>
> (3 comments)
>
> It looks OK to me.
>
> However, there are so many new event labels introduced here and it feels very 
> verbose. Some label can appear in one condition and missing in other 
> condition. I wonder if it can be reduced into smaller set that can be 
> displayed more consistently.
>
> Any thoughts on changing some of the minor one into LOG or counter? Or 
> perhaps write it to coordinator log instead of query profile? I understand 
> this patch is meant to gather CatalogD timing info in one place.

Yeah, a lot of event labels are introduced in this patch. I try to add event 
label for each external RPC and lock acquisition since they are usually the 
bottleneck of catalog operations.
For switching some events to be counters or logs, I think there are some cons. 
For counters:
* We already have lots of counters in the profile. I need to search them.
* Counters always exist in the profile even if the code branch doesn't run.
* Users don't know which step happens first if we just provide counters. 
Timeline events have the order which also helps users/beginers to learn Impala.

For logs:
* We need to manually compute the elapsed time between two steps unless we 
explicitly log them.
* We need to collect catalogd and coordinator logs for these details events 
when the profile can't explain the issue. Logs are easier to be rolled out. So 
it'd be nice if we can provide the execution details in one place. Users just 
need to provide us the profile.

BTW, current DDL profiles are usually short (except CTAS) so I think it's ok to 
add detailed timeline in them :)

http://gerrit.cloudera.org:8080/#/c/20368/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20368/1//COMMIT_MSG@31
PS1, Line 31: HMS
> nit: maybe spell it out to "Metastore"?
Done. Changed "HMS" to "Metastore" in this timeline.


http://gerrit.cloudera.org:8080/#/c/20368/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20368/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@400
PS1, Line 400: CatalogOpExecution
> nit: Is this the same as "Catalog Server Operation"?
This string is not used since I hard-coded the name in 
ClientRequestState::ExecDdlRequestImpl(). Changed the code to use this and 
change this to "Catalog Server Operation".


http://gerrit.cloudera.org:8080/#/c/20368/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2255
PS1, Line 2255: timeline
> nit: how about using more specific name like catalogEvents or catalogTimeli
Done



--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sun, 20 Aug 2023 13:47:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-20 Thread Quanlong Huang (Code Review)
Hello Riza Suminto, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20368

to look at the new patch set (#2).

Change subject: IMPALA-12024: Add catalog profile for createTable
..

IMPALA-12024: Add catalog profile for createTable

DDL/DMLs might spend a long time in the execution in catalogd.
Currently, the profiles just have a counter of CatalogOpExecTimer. We
need more items to show the details of how they are executed in
catalogd.

As the first step, this patch adds the profile timeline for how the
createTable DDL is executed in Catalog Server. Also added more events in
the existing "Query Timeline" for when the CatalogDdlRequest finished
and when the catalog updates are applied.

To implement this, a 'profile' field is added in TDdlExecResponse to
carry the execution counters and timeline in catalogd. Currently, we
just uses the timeline of it. We can add more counters in the future.
Several methods add a timeline parameter to mark the progress in them.
Timeline events are added after each RPC finished.

Here is an example when HMS is hanging for 26s in a CTAS. I used gdb to
attach to HMS as the JIRA description mentioned. In the timeline, we can
see the time is spent in the first HMS RPC that fetching the current HMS
event id:
Catalog Server Operation: 26s560ms
   - Got metastoreDdlLock: 163.977us (163.977us)
   - Got Metastore client: 166.339us (2.362us)
   - Got current Metastore event id: 26s494ms (26s494ms)
   - Created Metastore table: 26s558ms (63.507ms)
   - Fetched Metastore event batch: 26s559ms (1.155ms)
   - Created catalog table: 26s560ms (1.164ms)
   - DDL finished: 26s560ms (65.538us)
Query Compilation: 164.257ms
   - Metadata of all 1 tables cached: 10.535ms (10.535ms)
   - Analysis finished: 118.324ms (107.788ms)
   - Authorization finished (noop): 118.489ms (164.626us)
   - Value transfer graph computed: 118.830ms (341.792us)
   - Single node plan created: 150.150ms (31.319ms)
   - Runtime filters computed: 150.254ms (103.529us)
   - Distributed plan created: 151.832ms (1.578ms)
   - Planning finished: 164.257ms (12.425ms)
Query Timeline: 27s304ms
   - Query submitted: 129.658us (129.658us)
   - Planning finished: 170.224ms (170.095ms)
   - CatalogDdlRequest finished: 26s731ms (26s561ms)
   - Applied catalog updates from DDL: 26s740ms (8.752ms)
   - Submit for admission: 26s740ms (22.233us)
   - Completed admission: 26s740ms (286.295us)
   - Ready to start on 3 backends: 26s740ms (155.916us)
   - All 3 execution backends (3 fragment instances) started: 26s751ms 
(10.864ms)
   - Last row fetched: 26s920ms (168.226ms)
   - Released admission control resources: 26s920ms (27.635us)
   - DML data written: 26s920ms (126.369us)
   - Applied catalog updates from DDL: 26s985ms (65.354ms)
   - DML Metastore update finished: 26s985ms (30.343us)
   - Rows available: 26s985ms (27.050us)
   - Unregister query: 27s304ms (318.661ms)

Tests:
 - Add e2e tests to verify the DDL timeline events exist in profile

Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
---
M be/src/service/client-request-state.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M tests/query_test/test_observability.py
9 files changed, 191 insertions(+), 72 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/20368/2
--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/20368/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20368/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@455
PS2, Line 455:   createTable(ddlRequest.getCreate_table_params(), 
response, catalogTimeline, syncDdl,
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/20368/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2072
PS2, Line 2072:   private List 
getNextMetastoreEventsIfEnabled(EventSequence catalogTimeline,
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/20368/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4047
PS2, Line 4047:   createKuduTable(tbl, createTableParams, 
wantMinimalResult, response, catalogTimeline);
line too long (92 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sun, 20 Aug 2023 13:47:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-17 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 1:

(3 comments)

It looks OK to me.

However, there are so many new event labels introduced here and it feels very 
verbose. Some label can appear in one condition and missing in other condition. 
I wonder if it can be reduced into smaller set that can be displayed more 
consistently.

Any thoughts on changing some of the minor one into LOG or counter? Or perhaps 
write it to coordinator log instead of query profile? I understand this patch 
is meant to gather CatalogD timing info in one place.

http://gerrit.cloudera.org:8080/#/c/20368/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20368/1//COMMIT_MSG@31
PS1, Line 31: HMS
nit: maybe spell it out to "Metastore"?


http://gerrit.cloudera.org:8080/#/c/20368/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20368/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@400
PS1, Line 400: CatalogOpExecution
nit: Is this the same as "Catalog Server Operation"?


http://gerrit.cloudera.org:8080/#/c/20368/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2255
PS1, Line 2255: timeline
nit: how about using more specific name like catalogEvents or catalogTimeline? 
Same comment for the other places.



--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 17 Aug 2023 23:45:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 16 Aug 2023 12:00:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9602/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 16 Aug 2023 07:52:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20368 )

Change subject: IMPALA-12024: Add catalog profile for createTable
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/13762/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 16 Aug 2023 07:37:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12024: Add catalog profile for createTable

2023-08-16 Thread Quanlong Huang (Code Review)
Quanlong Huang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/20368


Change subject: IMPALA-12024: Add catalog profile for createTable
..

IMPALA-12024: Add catalog profile for createTable

DDL/DMLs might spend a long time in the execution in catalogd.
Currently, the profiles just have a counter of CatalogOpExecTimer. We
need more items to show the details of how they are executed in
catalogd.

As the first step, this patch adds the profile timeline for how the
createTable DDL is executed in Catalog Server. Also added more events in
the existing "Query Timeline" for when the CatalogDdlRequest finished
and when the catalog updates are applied.

To implement this, a 'profile' field is added in TDdlExecResponse to
carry the execution counters and timeline in catalogd. Currently, we
just uses the timeline of it. We can add more counters in the future.
Several methods add a timeline parameter to mark the progress in them.
Timeline events are added after each RPC finished.

Here is an example when HMS is hanging for 26s in a CTAS. I used gdb to
attach to HMS as the JIRA description mentioned. In the timeline, we can
see the time is spent in the first HMS RPC that fetching the current HMS
event id:
Catalog Server Operation: 26s560ms
   - Got metastoreDdlLock: 163.977us (163.977us)
   - Got HMS client: 166.339us (2.362us)
   - Got current HMS event id: 26s494ms (26s494ms)
   - Created HMS table: 26s558ms (63.507ms)
   - Fetched HMS event batch: 26s559ms (1.155ms)
   - Created catalog table: 26s560ms (1.164ms)
   - DDL finished: 26s560ms (65.538us)
Query Compilation: 164.257ms
   - Metadata of all 1 tables cached: 10.535ms (10.535ms)
   - Analysis finished: 118.324ms (107.788ms)
   - Authorization finished (noop): 118.489ms (164.626us)
   - Value transfer graph computed: 118.830ms (341.792us)
   - Single node plan created: 150.150ms (31.319ms)
   - Runtime filters computed: 150.254ms (103.529us)
   - Distributed plan created: 151.832ms (1.578ms)
   - Planning finished: 164.257ms (12.425ms)
Query Timeline: 27s304ms
   - Query submitted: 129.658us (129.658us)
   - Planning finished: 170.224ms (170.095ms)
   - CatalogDdlRequest finished: 26s731ms (26s561ms)
   - Applied catalog updates from DDL: 26s740ms (8.752ms)
   - Submit for admission: 26s740ms (22.233us)
   - Completed admission: 26s740ms (286.295us)
   - Ready to start on 3 backends: 26s740ms (155.916us)
   - All 3 execution backends (3 fragment instances) started: 26s751ms 
(10.864ms)
   - Last row fetched: 26s920ms (168.226ms)
   - Released admission control resources: 26s920ms (27.635us)
   - DML data written: 26s920ms (126.369us)
   - Applied catalog updates from DDL: 26s985ms (65.354ms)
   - DML Metastore update finished: 26s985ms (30.343us)
   - Rows available: 26s985ms (27.050us)
   - Unregister query: 27s304ms (318.661ms)

Tests:
 - Add e2e tests to verify the DDL timeline events exist in profile

Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
---
M be/src/service/client-request-state.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M tests/query_test/test_observability.py
9 files changed, 182 insertions(+), 68 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/20368/1
--
To view, visit http://gerrit.cloudera.org:8080/20368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Gerrit-Change-Number: 20368
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang