[Impala-ASF-CR] IMPALA-10817: Share metastoreHmsDDL lock b/w CatalogOpExecutor and Catalog metastore server
kis...@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/17713 ) Change subject: IMPALA-10817: Share metastoreHmsDDL lock b/w CatalogOpExecutor and Catalog metastore server .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/17713/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/17713/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@684 PS4, Line 684: getMetastoreDdlLock().lock(); Can you add few tests, where these locks come into play from both MetastoreServiceHandler and CatalogOpExecutor at the same time ? Is it possible to write a deterministic test around this ? -- To view, visit http://gerrit.cloudera.org:8080/17713 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60d4f3a49eb843fa8640cd21d623fd8dda770001 Gerrit-Change-Number: 17713 Gerrit-PatchSet: 4 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Thu, 29 Jul 2021 01:58:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10746: Drop table/db from catalog cache when drop table/db HMS apis are accessed from catalog's metastore server.
kis...@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/17576 ) Change subject: IMPALA-10746: Drop table/db from catalog cache when drop table/db HMS apis are accessed from catalog's metastore server. .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/17576/13/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java File fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java: http://gerrit.cloudera.org:8080/#/c/17576/13/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@3003 PS13, Line 3003: if (!invalidateCacheOnDDLs_) { When will we set this to false ? invalidateCacheOnDDLs_ -- To view, visit http://gerrit.cloudera.org:8080/17576 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic2e2ad2630e2028b8ad26a6272ee766b27e0935c Gerrit-Change-Number: 17576 Gerrit-PatchSet: 13 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Thu, 29 Jul 2021 01:15:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3430: Runtime filter : Extend runtime filter to support Min/Max values for HDFS scans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17706 ) Change subject: IMPALA-3430: Runtime filter : Extend runtime filter to support Min/Max values for HDFS scans .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9203/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17706 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c2bb5baad622051d1002c9c162c672d428e5446 Gerrit-Change-Number: 17706 Gerrit-PatchSet: 11 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Thu, 29 Jul 2021 01:12:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3430: Runtime filter : Extend runtime filter to support Min/Max values for HDFS scans
Qifan Chen has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/17706 ) Change subject: IMPALA-3430: Runtime filter : Extend runtime filter to support Min/Max values for HDFS scans .. IMPALA-3430: Runtime filter : Extend runtime filter to support Min/Max values for HDFS scans This patches enables min/max filtering for non-correlated subqueries that return one row. In this case, the filters are built from the results of the subqueries and the filtering target is the scan node to be qualified by one of the subqueries. Shown below is one such query that normally gets compiled into a nested loop join. The filtering limits the values from column store_sales.ss_sales_price to be within [-infinite, min(ss_wholesale_cost)]. select count(*) from store_sales where ss_sales_price <= (select min(ss_wholesale_cost) from store_sales); In the patch, the min/max filtering infrastructure is integrated with the nested loop join as follows. 1. Two new insertion methods InertForLE() and InsertForGE() are added to MinMaxFilter class hierarcy which maintain only the max or the min value respectivively in a min/max filter; 2. RuntimeContext::InsertPerCompareOp() calls one of the two new insertion methods above based on the comparison op saved in the filter descriptor; 3. NljBuilder::InsertRuntimeFilters() calls the new method in 2). Change-Id: I7c2bb5baad622051d1002c9c162c672d428e5446 --- M be/src/exec/filter-context.cc M be/src/exec/filter-context.h M be/src/exec/nested-loop-join-builder.cc M be/src/exec/nested-loop-join-builder.h M be/src/exec/nested-loop-join-node.cc M be/src/runtime/coordinator.cc M be/src/runtime/decimal-value.h M be/src/runtime/query-state.h M be/src/runtime/runtime-filter.h M be/src/runtime/timestamp-value.h M be/src/util/min-max-filter-ir.cc M be/src/util/min-max-filter.cc M be/src/util/min-max-filter.h M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/Predicate.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java 24 files changed, 643 insertions(+), 32 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/17706/11 -- To view, visit http://gerrit.cloudera.org:8080/17706 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c2bb5baad622051d1002c9c162c672d428e5446 Gerrit-Change-Number: 17706 Gerrit-PatchSet: 11 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen
[Impala-ASF-CR] IMPALA-10801: Check the latest compaction Id before serving ACID table
Yu-Wen Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/17697 ) Change subject: IMPALA-10801: Check the latest compaction Id before serving ACID table .. Patch Set 10: (6 comments) http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2127 PS10, Line 2127: Get non-ACID table with writeIdList: > This text here comes as a message for the IllegalStateException thrown and Ack http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3479 PS10, Line 3479: If there is an ongoing loading task, we don't reload file metadata but wait for the :* loading task completed and return the table just loaded. > this comment is stale now that removed the loadReq logic. Ack http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@41 PS10, Line 41: Log > is this import needed? Ack http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java: http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@345 PS10, Line 345: hdfsTable_.writeLock().lock(); > why is this needed? The table locks are generally taken at CatalogOpExecuto This load fails after adding a precondition check for locking in HdfsTable.loadFileMetadataForPartitions. I suppose the lock is not taken at CatalogOpExecutor because hdfsTable_ is the internal object of icebergTable. http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/TableLoader.java File fe/src/main/java/org/apache/impala/catalog/TableLoader.java: http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@115 PS10, Line 115: able.writeLock().lock(); > Why is this needed? Same as icebergTable. After adding a precondition check for locking in HdfsTable.loadFileMetadataForPartitions, this function would fail without taking lock. http://gerrit.cloudera.org:8080/#/c/17697/10/testdata/cluster/ranger/setup/policy_5_revised.json File testdata/cluster/ranger/setup/policy_5_revised.json: http://gerrit.cloudera.org:8080/#/c/17697/10/testdata/cluster/ranger/setup/policy_5_revised.json@8 PS10, Line 8: 5 > how is this change related? If it is not can we remove it from this patch a Since this patch bumps up cdp version and the new version of ranger would cause failure of create-load-data.sh. If I don't put this here, I cannot get a green testing. Or should I create a seperate commit to bump up cdp version? -- To view, visit http://gerrit.cloudera.org:8080/17697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I86a112a77980fef7f6238978bc9668a65262101e Gerrit-Change-Number: 17697 Gerrit-PatchSet: 10 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Wed, 28 Jul 2021 21:40:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10813: Invalidate external table from catalog cache for truncate table HMS api
Vihang Karajgaonkar has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17705 ) Change subject: IMPALA-10813: Invalidate external table from catalog cache for truncate table HMS api .. IMPALA-10813: Invalidate external table from catalog cache for truncate table HMS api This patch is in continuation of IMPALA-10648 in which we missed invalidating external table for truncate_table api Testing: Enhanced exiting test to include truncate_table scenario Change-Id: I734c2b5f371291fef32badab9efc886b4b067e10 Reviewed-on: http://gerrit.cloudera.org:8080/17705 Tested-by: Impala Public Jenkins Reviewed-by: Vihang Karajgaonkar --- M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java M tests/custom_cluster/test_metastore_service.py 2 files changed, 51 insertions(+), 9 deletions(-) Approvals: Impala Public Jenkins: Verified Vihang Karajgaonkar: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/17705 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I734c2b5f371291fef32badab9efc886b4b067e10 Gerrit-Change-Number: 17705 Gerrit-PatchSet: 5 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai
[Impala-ASF-CR] IMPALA-10813: Invalidate external table from catalog cache for truncate table HMS api
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17705 ) Change subject: IMPALA-10813: Invalidate external table from catalog cache for truncate table HMS api .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17705 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I734c2b5f371291fef32badab9efc886b4b067e10 Gerrit-Change-Number: 17705 Gerrit-PatchSet: 4 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Wed, 28 Jul 2021 19:50:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10817: Share metastoreHmsDDL lock b/w CatalogOpExecutor and Catalog metastore server
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17713 ) Change subject: IMPALA-10817: Share metastoreHmsDDL lock b/w CatalogOpExecutor and Catalog metastore server .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/17713/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17713/4//COMMIT_MSG@23 PS4, Line 23: Relying on existing tests since it is a small refactoring I don't think there are any existing tests which exercise these DDLs concurrently from catalogOp as well as CatalogMetastore interfaces. Can you please add them to confirm that there are no missing edge cases? -- To view, visit http://gerrit.cloudera.org:8080/17713 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60d4f3a49eb843fa8640cd21d623fd8dda770001 Gerrit-Change-Number: 17713 Gerrit-PatchSet: 4 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Wed, 28 Jul 2021 19:50:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10746: Drop table/db from catalog cache when drop table/db HMS apis are accessed from catalog's metastore server.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17576 ) Change subject: IMPALA-10746: Drop table/db from catalog cache when drop table/db HMS apis are accessed from catalog's metastore server. .. Patch Set 13: (6 comments) http://gerrit.cloudera.org:8080/#/c/17576/10/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17576/10/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2515 PS10, Line 2515: Table existingTbl = db.getTable(tblName); : if (existingTbl == null) return null; Isn't this same as previous? http://gerrit.cloudera.org:8080/#/c/17576/10/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2519 PS10, Line 2519: incompleteTable.setCreateEventId(existingTbl.getCreateEventId()); > @Vihang: This was probably missed in HANDLE create/drop patch. Please take Yeah, that looks correct to me. Thanks for fixing it. http://gerrit.cloudera.org:8080/#/c/17576/10/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java File fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java: http://gerrit.cloudera.org:8080/#/c/17576/10/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@462 PS10, Line 462: : // Parse db name. Throw error if parsing fails. : String dbName; : String catName; : String[] catAndDbName = : This code seems to have a race condition, The database could have been added after line 466 is executed. http://gerrit.cloudera.org:8080/#/c/17576/13/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java File fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java: http://gerrit.cloudera.org:8080/#/c/17576/13/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@456 PS13, Line 456: if (!invalidateCacheOnDDLs_) { : LOG.debug("invalidateCacheOnDDLs_ flag is false, skipping " + : "cache update for hms operation drop_database on db: " + : databaseName); : return; : } : : // Parse db name. Throw error if parsing fails. : String dbName; : String catName; : String[] catAndDbName = : MetaStoreUtils.parseDbName(databaseName, serverConf_); : catName = catAndDbName[0]; : dbName = catAndDbName[1]; : // get DB to drop from catalog cache : Db dbToDrop = catalog_.getDb(dbName); : if(dbToDrop == null) { : // dbToDrop doesn't exist in cache : return; : } : List events; : try { : events = MetastoreEventsProcessor : .getNextMetastoreEvents(catalog_, currentEventId, : event -> event.getEventType() : .equalsIgnoreCase(MetastoreEvents.DropDatabaseEvent : .DROP_DATABASE_EVENT_TYPE) : && catName.equalsIgnoreCase(event.getCatName()) : && dbName.equalsIgnoreCase(event.getDbName())); : } catch (ImpalaException e) { : String errorMsg = "Unable to process Drop database event for db: " + : dbToDrop.getName(); : LOG.error(errorMsg, e); : throw new MetaException(errorMsg); : } : if (events.isEmpty()) { : throw new MetaException( : "Drop database event not received. Check if notification " + : "events are configured in hive metastore"); : } : long dropEventId = events.get(events.size() - 1).getEventId(); : boolean isRemoved = : catalogOpExecutor_.removeDbIfNotAddedLater(dropEventId, : dbToDrop.getMetaStoreDb()); : if (isRemoved) { : LOG.info("Removed database: " + databaseName + : " from cache due to HMS API: drop_database"); it seems cleaner to refactor this into a separate method just like we have removeTableIfExists() below. http://gerrit.cloudera.org:8080/#/c/17576/13/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@2931 PS13, Line 2931: invalidateNonTransactionalTabl
[Impala-ASF-CR] IMPALA-10801: Check the latest compaction Id before serving ACID table
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17697 ) Change subject: IMPALA-10801: Check the latest compaction Id before serving ACID table .. Patch Set 10: (7 comments) http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2127 PS10, Line 2127: Get non-ACID table with writeIdList: This text here comes as a message for the IllegalStateException thrown and is user-visible. The message should be easy to understand for the user. I would recommend writing something like "Compaction id check cannot be done for non-transaction table + tbl.getFullName()" http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2131 PS10, Line 2131: getPartitionsForRefreshingFileMetadata It looks like we are fetching the same information twice from the metastore. Once when we want to detect if there is a compaction on any of the partitions of the table, second when we are actually updating the compactionId of the partition in the refreshFileMetadata method below. Can we avoid the repeated work? May be you can reuse the compactionIds which HMS returned in by setting them in partsToBeRefreshed so that refreshFileMetadata method below reuses them. That way you can get rid of getAndSetLastCompactionId() http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3479 PS10, Line 3479: If there is an ongoing loading task, we don't reload file metadata but wait for the :* loading task completed and return the table just loaded. this comment is stale now that removed the loadReq logic. http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@41 PS10, Line 41: Log is this import needed? http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java: http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@345 PS10, Line 345: hdfsTable_.writeLock().lock(); why is this needed? The table locks are generally taken at CatalogOpExecutor level. e.g see callers of loadTableMetadata() in CatalogOpExecutor http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/TableLoader.java File fe/src/main/java/org/apache/impala/catalog/TableLoader.java: http://gerrit.cloudera.org:8080/#/c/17697/10/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@115 PS10, Line 115: able.writeLock().lock(); Why is this needed? http://gerrit.cloudera.org:8080/#/c/17697/10/testdata/cluster/ranger/setup/policy_5_revised.json File testdata/cluster/ranger/setup/policy_5_revised.json: http://gerrit.cloudera.org:8080/#/c/17697/10/testdata/cluster/ranger/setup/policy_5_revised.json@8 PS10, Line 8: 5 how is this change related? If it is not can we remove it from this patch and create a separate commit for it? -- To view, visit http://gerrit.cloudera.org:8080/17697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I86a112a77980fef7f6238978bc9668a65262101e Gerrit-Change-Number: 17697 Gerrit-PatchSet: 10 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Wed, 28 Jul 2021 19:39:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10801: Check the latest compaction Id before serving ACID table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17697 ) Change subject: IMPALA-10801: Check the latest compaction Id before serving ACID table .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9202/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I86a112a77980fef7f6238978bc9668a65262101e Gerrit-Change-Number: 17697 Gerrit-PatchSet: 10 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Wed, 28 Jul 2021 17:56:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10801: Check the latest compaction Id before serving ACID table
Yu-Wen Lai has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/17697 ) Change subject: IMPALA-10801: Check the latest compaction Id before serving ACID table .. IMPALA-10801: Check the latest compaction Id before serving ACID table Since compactions don't advance write id, we don't know if a table/partition is compacted by comparing writeIdList. A possible issue is that CatalogD provides obsolete file metadata and causes a runtime error. In order to fix this issue, we introduced a HMS API that can get the latest compaction record for a table/partition (HIVE-24828). In CatalogD, we cache compaction id while loading partitions and compare the cached id with the latest compaction id before serving. If there is a newer compaction happened, it would refresh the file metadata. Besides, this patch also change how to replace the existing table after a table full reloading. The current way is to replace the table if the catalog version is not changed. For transactional tables, things get additional complexity given that file metadata refreshing and full table reloading can happen together. We can actually use writeIdList to determine whether we should replace the table for transactional tables. As long as the updated table has more recent writeIdList than the existing one, we are safe to replace the table. For Non-transactional tables, we still keep original behavior. Testing: - Add a test in PartialCatalogInfoWriteIdTest Change-Id: I86a112a77980fef7f6238978bc9668a65262101e --- M bin/impala-config.sh M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoWriteIdTest.java M testdata/bin/create-load-data.sh R testdata/cluster/ranger/setup/policy_5_revised.json 12 files changed, 387 insertions(+), 46 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/17697/10 -- To view, visit http://gerrit.cloudera.org:8080/17697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I86a112a77980fef7f6238978bc9668a65262101e Gerrit-Change-Number: 17697 Gerrit-PatchSet: 10 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai
[Impala-ASF-CR] WiP: IMPALA-9495: Support struct in select list for ORC tables
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/17638 ) Change subject: WiP: IMPALA-9495: Support struct in select list for ORC tables .. Patch Set 3: (8 comments) I can't say I completely understand everything, I've only found some nit. http://gerrit.cloudera.org:8080/#/c/17638/3/be/src/runtime/buffered-tuple-stream.cc File be/src/runtime/buffered-tuple-stream.cc: http://gerrit.cloudera.org:8080/#/c/17638/3/be/src/runtime/buffered-tuple-stream.cc@85 PS3, Line 85: continue Nit: Maybe it would be nicer to use ```if (!condition) {then}``` instead of ```if (condition) continue; then```. Also applies on L89. http://gerrit.cloudera.org:8080/#/c/17638/3/be/src/runtime/types.h File be/src/runtime/types.h: http://gerrit.cloudera.org:8080/#/c/17638/3/be/src/runtime/types.h@264 PS3, Line 264: inline int GetSlotSize(const ColumnType& col_type) const { Nit: this could be static. http://gerrit.cloudera.org:8080/#/c/17638/3/be/src/runtime/types.h@288 PS3, Line 288: inline int GetByteSize(const ColumnType& col_type) const { Nit: this could be static. http://gerrit.cloudera.org:8080/#/c/17638/3/be/src/udf/udf.h File be/src/udf/udf.h: http://gerrit.cloudera.org:8080/#/c/17638/3/be/src/udf/udf.h@747 PS3, Line 747: uint8_t** ptr; Would using std::vector not be compatible? Or too much overhead? http://gerrit.cloudera.org:8080/#/c/17638/3/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java File fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java: http://gerrit.cloudera.org:8080/#/c/17638/3/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java@248 PS3, Line 248: // For scalar types, the materialized path is the same as path_ Shouldn't we include struct types in the comment? http://gerrit.cloudera.org:8080/#/c/17638/3/fe/src/main/java/org/apache/impala/analysis/SlotRef.java File fe/src/main/java/org/apache/impala/analysis/SlotRef.java: http://gerrit.cloudera.org:8080/#/c/17638/3/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@164 PS3, Line 164: expect Typo: expects http://gerrit.cloudera.org:8080/#/c/17638/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: http://gerrit.cloudera.org:8080/#/c/17638/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@418 PS3, Line 418: scalar Is it still true or is it a scalar OR a struct? http://gerrit.cloudera.org:8080/#/c/17638/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@978 PS3, Line 978: support Nit: are supported -- To view, visit http://gerrit.cloudera.org:8080/17638 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0fbe56bdcd372b72e99c0195d87a818e7fa4bc3a Gerrit-Change-Number: 17638 Gerrit-PatchSet: 3 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Wed, 28 Jul 2021 15:28:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10779: Print the username closing a session or cancelling a query from the WebUI
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17726 ) Change subject: IMPALA-10779: Print the username closing a session or cancelling a query from the WebUI .. IMPALA-10779: Print the username closing a session or cancelling a query from the WebUI This patch appends the username of the client who made the request to close a session or cancel a query from the coordinator's debug WebUI. Tests: - Added a new fe test for LDAP auth to verify that the new status gets printed in runtime profile and coordinator log when a query is cancelled in this way. Change-Id: I02c92b5caee61d1f9f381cd2906a850e02c54d55 Reviewed-on: http://gerrit.cloudera.org:8080/17726 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/kudu/util/web_callback_registry.h M be/src/service/impala-http-handler.cc M be/src/util/webserver.cc M fe/src/test/java/org/apache/impala/customcluster/LdapHS2Test.java M fe/src/test/java/org/apache/impala/customcluster/LdapWebserverTest.java M tests/webserver/test_web_pages.py 6 files changed, 96 insertions(+), 7 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/17726 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I02c92b5caee61d1f9f381cd2906a850e02c54d55 Gerrit-Change-Number: 17726 Gerrit-PatchSet: 7 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-10779: Print the username closing a session or cancelling a query from the WebUI
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17726 ) Change subject: IMPALA-10779: Print the username closing a session or cancelling a query from the WebUI .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17726 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02c92b5caee61d1f9f381cd2906a850e02c54d55 Gerrit-Change-Number: 17726 Gerrit-PatchSet: 6 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 28 Jul 2021 13:18:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10825: fix impalad crashes when closing the retrying query
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17729 ) Change subject: IMPALA-10825: fix impalad crashes when closing the retrying query .. Patch Set 4: Code-Review+1 (2 comments) Looks great! http://gerrit.cloudera.org:8080/#/c/17729/2/tests/custom_cluster/test_query_retries.py File tests/custom_cluster/test_query_retries.py: http://gerrit.cloudera.org:8080/#/c/17729/2/tests/custom_cluster/test_query_retries.py@760 PS2, Line 760: select count(*) from tpch_parquet.lineitem > Not need. We use the debug action to delay and have enough time to do close Done http://gerrit.cloudera.org:8080/#/c/17729/2/tests/custom_cluster/test_query_retries.py@773 PS2, Line 773: > This bug only crashes the coordinator, so check the first one is enough Done -- To view, visit http://gerrit.cloudera.org:8080/17729 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a2cccbfbdca00b099e0f8d5f2d4bcb4d0a8c3 Gerrit-Change-Number: 17729 Gerrit-PatchSet: 4 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Wed, 28 Jul 2021 13:17:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 ) Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined .. Patch Set 6: Code-Review+1 Looks great! -- To view, visit http://gerrit.cloudera.org:8080/17712 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a Gerrit-Change-Number: 17712 Gerrit-PatchSet: 6 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Wed, 28 Jul 2021 13:16:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10746: Drop table/db from catalog cache when drop table/db HMS apis are accessed from catalog's metastore server.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17576 ) Change subject: IMPALA-10746: Drop table/db from catalog cache when drop table/db HMS apis are accessed from catalog's metastore server. .. Patch Set 13: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17576 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic2e2ad2630e2028b8ad26a6272ee766b27e0935c Gerrit-Change-Number: 17576 Gerrit-PatchSet: 13 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Wed, 28 Jul 2021 12:29:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10817: Share metastoreHmsDDL lock b/w CatalogOpExecutor and Catalog metastore server
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17713 ) Change subject: IMPALA-10817: Share metastoreHmsDDL lock b/w CatalogOpExecutor and Catalog metastore server .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17713 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60d4f3a49eb843fa8640cd21d623fd8dda770001 Gerrit-Change-Number: 17713 Gerrit-PatchSet: 4 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Wed, 28 Jul 2021 11:59:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 ) Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9201/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17712 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a Gerrit-Change-Number: 17712 Gerrit-PatchSet: 6 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Wed, 28 Jul 2021 10:47:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 ) Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9200/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17712 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a Gerrit-Change-Number: 17712 Gerrit-PatchSet: 5 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Wed, 28 Jul 2021 10:50:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 ) Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9199/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17712 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a Gerrit-Change-Number: 17712 Gerrit-PatchSet: 4 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Wed, 28 Jul 2021 10:32:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
Xianqing He has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/17712 ) Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined .. IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined Creating a single node plan for the following SQL sometime can slowdown, with about hundreds of inlineviews to join, and view1, view2... outputs hundreds of expressions. select c1 from (select c1, id from view1 where c1 > 10) t1 join (select c2, id from view2 where c1 > 10) t2 on t1.id = t2.id join ... The reasons for the slow generation of plans are as follows 1. Many auxiliary predicates are added to GlobalState.conjuncts causing performance degradation of Analyzer#getUnassignedConjuncts 2. In SingleNodePlanner#createInlineViewPlan the output smap is the composition of the inline view's smap and the output smap of the inline view's plan root. Multiple inline view joins cause ExprSubstitutionMap#compose performance to degrade. For 1, add GlobalState.conjunctsWithoutAuxExpr to save the registered conjuncts without auxiliary predicate. For 2, remove expressions from outputSmap that are not used according to baseSmap. Testing: Add test tests/query_test/test_query_compilation.py Repro query created single node plan went from 2.3 sec to 0.3 sec. Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java A tests/query_test/test_query_compilation.py 4 files changed, 72 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/17712/6 -- To view, visit http://gerrit.cloudera.org:8080/17712 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a Gerrit-Change-Number: 17712 Gerrit-PatchSet: 6 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He
[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 ) Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/17712/5/tests/query_test/test_query_compilation.py File tests/query_test/test_query_compilation.py: http://gerrit.cloudera.org:8080/#/c/17712/5/tests/query_test/test_query_compilation.py@21 PS5, Line 21: class TestSingleNodePlanCreated(ImpalaTestSuite): flake8: E302 expected 2 blank lines, found 1 -- To view, visit http://gerrit.cloudera.org:8080/17712 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a Gerrit-Change-Number: 17712 Gerrit-PatchSet: 5 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Wed, 28 Jul 2021 10:29:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
Xianqing He has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/17712 ) Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined .. IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined Creating a single node plan for the following SQL sometime can slowdown, with about hundreds of inlineviews to join, and view1, view2... outputs hundreds of expressions. select c1 from (select c1, id from view1 where c1 > 10) t1 join (select c2, id from view2 where c1 > 10) t2 on t1.id = t2.id join ... The reasons for the slow generation of plans are as follows 1. Many auxiliary predicates are added to GlobalState.conjuncts causing performance degradation of Analyzer#getUnassignedConjuncts 2. In SingleNodePlanner#createInlineViewPlan the output smap is the composition of the inline view's smap and the output smap of the inline view's plan root. Multiple inline view joins cause ExprSubstitutionMap#compose performance to degrade. For 1, add GlobalState.conjunctsWithoutAuxExpr to save the registered conjuncts without auxiliary predicate. For 2, remove expressions from outputSmap that are not used according to baseSmap. Testing: Add test tests/query_test/test_query_compilation.py Repro query created single node plan went from 2.3 sec to 0.3 sec. Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java A tests/query_test/test_query_compilation.py 4 files changed, 72 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/17712/5 -- To view, visit http://gerrit.cloudera.org:8080/17712 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a Gerrit-Change-Number: 17712 Gerrit-PatchSet: 5 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He
[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 ) Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/17712/4/tests/query_test/test_query_compilation.py File tests/query_test/test_query_compilation.py: http://gerrit.cloudera.org:8080/#/c/17712/4/tests/query_test/test_query_compilation.py@18 PS4, Line 18: import pytest flake8: F401 'pytest' imported but unused http://gerrit.cloudera.org:8080/#/c/17712/4/tests/query_test/test_query_compilation.py@21 PS4, Line 21: class TestSingleNodePlanCreated(ImpalaTestSuite): flake8: E302 expected 2 blank lines, found 1 http://gerrit.cloudera.org:8080/#/c/17712/4/tests/query_test/test_query_compilation.py@38 PS4, Line 38: flake8: W292 no newline at end of file -- To view, visit http://gerrit.cloudera.org:8080/17712 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a Gerrit-Change-Number: 17712 Gerrit-PatchSet: 4 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Wed, 28 Jul 2021 10:10:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
Xianqing He has posted comments on this change. ( http://gerrit.cloudera.org:8080/17712 ) Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/17712/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17712/2//COMMIT_MSG@29 PS2, Line 29: Testing: > nit. I wonder if the query mentioned in the commit message can be used as t Done -- To view, visit http://gerrit.cloudera.org:8080/17712 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a Gerrit-Change-Number: 17712 Gerrit-PatchSet: 4 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Wed, 28 Jul 2021 10:10:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined
Xianqing He has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/17712 ) Change subject: IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined .. IMPALA-10806: Create single node plan slowdown when hundreds of inline views are joined Creating a single node plan for the following SQL sometime can slowdown, with about hundreds of inlineviews to join, and view1, view2... outputs hundreds of expressions. select c1 from (select c1, id from view1 where c1 > 10) t1 join (select c2, id from view2 where c1 > 10) t2 on t1.id = t2.id join ... The reasons for the slow generation of plans are as follows 1. Many auxiliary predicates are added to GlobalState.conjuncts causing performance degradation of Analyzer#getUnassignedConjuncts 2. In SingleNodePlanner#createInlineViewPlan the output smap is the composition of the inline view's smap and the output smap of the inline view's plan root. Multiple inline view joins cause ExprSubstitutionMap#compose performance to degrade. For 1, add GlobalState.conjunctsWithoutAuxExpr to save the registered conjuncts without auxiliary predicate. For 2, remove expressions from outputSmap that are not used according to baseSmap. Testing: Add test tests/query_test/test_query_compilation.py Repro query created single node plan went from 2.3 sec to 0.3 sec. Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java A tests/query_test/test_query_compilation.py 4 files changed, 70 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/17712/4 -- To view, visit http://gerrit.cloudera.org:8080/17712 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifb4011b6167a0e61438a73c4dba6f1cd0a4e8c6a Gerrit-Change-Number: 17712 Gerrit-PatchSet: 4 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He
[Impala-ASF-CR] IMPALA-10779: Print the username closing a session or cancelling a query from the WebUI
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17726 ) Change subject: IMPALA-10779: Print the username closing a session or cancelling a query from the WebUI .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17726 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02c92b5caee61d1f9f381cd2906a850e02c54d55 Gerrit-Change-Number: 17726 Gerrit-PatchSet: 6 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 28 Jul 2021 07:00:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10779: Print the username closing a session or cancelling a query from the WebUI
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17726 ) Change subject: IMPALA-10779: Print the username closing a session or cancelling a query from the WebUI .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7357/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17726 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02c92b5caee61d1f9f381cd2906a850e02c54d55 Gerrit-Change-Number: 17726 Gerrit-PatchSet: 6 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 28 Jul 2021 07:00:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10779: Print the username closing a session or cancelling a query from the WebUI
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17726 ) Change subject: IMPALA-10779: Print the username closing a session or cancelling a query from the WebUI .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17726 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02c92b5caee61d1f9f381cd2906a850e02c54d55 Gerrit-Change-Number: 17726 Gerrit-PatchSet: 5 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 28 Jul 2021 07:00:11 + Gerrit-HasComments: No