[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11314 ) Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER .. Patch Set 24: (3 comments) nits/clarifications, otherwise lgtm. http://gerrit.cloudera.org:8080/#/c/11314/24/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/11314/24/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1017 PS24, Line 1017:* the privilege, it will be removed on the next refresh. ALTER DATABASE SET OWNER might be worthwhile to state that while hms state will correctly reflect the owner and the catalogd catalog will reflect the hms state, since sentry is updated async by hms, sentry's state may not reflect the state in hms. when an out-of-date syntry updates the catalogd, it will use sentries incorrect state. consequently, the state in catalogd may not always match the state in hms. there will be some period that we could wait for sentry and hms to be in sync. perhaps we wait for their state to settle. if it doesn't, we roll-back the hms state? not sure if hms would give you the xact to effectively roll-back. I'm fine with not doing this now, but if its something that may be of interest, perhaps a todo is useful. http://gerrit.cloudera.org:8080/#/c/11314/24/fe/src/test/resources/sentry-site.xml.template File fe/src/test/resources/sentry-site.xml.template: http://gerrit.cloudera.org:8080/#/c/11314/24/fe/src/test/resources/sentry-site.xml.template@81 PS24, Line 81: nit: remove the spacing http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py File tests/authorization/test_owner_privileges.py: http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@156 PS23, Line 156: > Can't use that since I want to create the database after the cluster starts create table if not exists 'foo' ... ? -- To view, visit http://gerrit.cloudera.org:8080/11314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 Gerrit-Change-Number: 11314 Gerrit-PatchSet: 24 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 06:18:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11224 ) Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd .. Patch Set 7: (7 comments) http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@225 PS4, Line 225: > We don't know how much memory is released until a GC. ok, I was looking at our own memory estimate of tables (it gets computed when serializing to thrift-- that will not happen with v2) as a proxy for how much we expect to reclaim. no need to change this for this round. http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@148 PS7, Line 148: fe_.getImpaladTableUsageTracker().recordTableUsage(loadedTbls_.keySet()); nit: formatting http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java File fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java: http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@175 PS4, Line 175: icted because o > The call site of this function is guarded with a try clause so I think it's if we know it can be hit, is the main reason not to guard it for readability? http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java File fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java: http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@76 PS7, Line 76:* invalidation is disabled. nit: , but in that case, it will be a no-op. http://gerrit.cloudera.org:8080/#/c/11224/7/tests/custom_cluster/test_automatic_invalidation.py File tests/custom_cluster/test_automatic_invalidation.py: http://gerrit.cloudera.org:8080/#/c/11224/7/tests/custom_cluster/test_automatic_invalidation.py@35 PS7, Line 35: query = "select count(*) from functional.alltypes" add a brief description of what each test is trying to do. I expected that the same test is applied to v1 and v2 but was surprised to see two fairly different approaches when validating the state of catalogd (which is the same for both)-- why the difference? http://gerrit.cloudera.org:8080/#/c/11224/7/tests/custom_cluster/test_automatic_invalidation.py@57 PS7, Line 57: mns (list) = list
[Impala-ASF-CR] IMPALA-7047. Refreshing partitions should not make an RPC per file
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/11227 ) Change subject: IMPALA-7047. Refreshing partitions should not make an RPC per file .. Patch Set 2: Looks like the failed test is related to this patch. Will wait for it to be fixed before +2'ing -- To view, visit http://gerrit.cloudera.org:8080/11227 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2051b96599206164aaa06ecbdf64374c46eda956 Gerrit-Change-Number: 11227 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 05:45:00 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 3: (7 comments) Some high-level comments. http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@26 PS3, Line 26: Can you add a sample output here. http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@32 PS3, Line 32: breaking down by category? > still hoping to get some feedback on the above Yep, just looking at the code, I have a feeling it could be chatty and most users don't probably need that information. But if you can paste some sample profiles somewhere (especially with a mix of cache hits + misses, with/without partitions, multiple partial RPCs etc.) that'd be great. http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@33 PS3, Line 33: - worth trying to include the byte sizes of metadata fetched? Could be useful from incremental stats / large partitioned tables. http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@309 PS3, Line 309: ret = FeSupport.GetPartialCatalogObject(new TSerializer().serialize(req)); Should we track num rpcs? http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@349 PS3, Line 349: Stopwatch sw = new Stopwatch().start(); Does it help to implement a ScopedTimerWithStats (try-with-resources way)? http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@363 PS3, Line 363: /*numHits=*/hit.getRef() ? 1 : 0, : /*numMisses=*/hit.getRef() ? 0 : 1 this is kinda weird. Maybe overload this method that takes a single bool which calls the helper? http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java File fe/src/main/java/org/apache/impala/service/FrontendProfile.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151 PS3, Line 151: oldThreadLocalValue_ = THREAD_LOCAL.get(); > Just felt like it's good practice to treat scopes as nestable and "restore" Ya I think we should have some guard that prevents nested scopes. I doubt if anyone will use it, but still :-) -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 05:27:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7047. Refreshing partitions should not make an RPC per file
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11227 ) Change subject: IMPALA-7047. Refreshing partitions should not make an RPC per file .. Patch Set 2: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3128/ -- To view, visit http://gerrit.cloudera.org:8080/11227 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2051b96599206164aaa06ecbdf64374c46eda956 Gerrit-Change-Number: 11227 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 05:16:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11314 ) Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER .. Patch Set 24: (1 comment) http://gerrit.cloudera.org:8080/#/c/11314/24/tests/authorization/test_owner_privileges.py File tests/authorization/test_owner_privileges.py: http://gerrit.cloudera.org:8080/#/c/11314/24/tests/authorization/test_owner_privileges.py@34 PS24, Line 34: from tests.common.impala_test_suite import ImpalaTestSuite flake8: F401 'tests.common.impala_test_suite.ImpalaTestSuite' imported but unused -- To view, visit http://gerrit.cloudera.org:8080/11314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 Gerrit-Change-Number: 11314 Gerrit-PatchSet: 24 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 04:48:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11314 ) Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER .. Patch Set 24: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/612/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 Gerrit-Change-Number: 11314 Gerrit-PatchSet: 24 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 04:46:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER
Adam Holley has uploaded a new patch set (#24). ( http://gerrit.cloudera.org:8080/11314 ) Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER .. IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER This patch adds calls to automatically create or remove owner privileges in the catalog based on the statement. This is similar to the existing pattern where after privileges are granted in Sentry, they are created in the catalog directly instead of pulled from Sentry. The sentry-site.xml and hive-site.xml template files have been updated to enable usage of object ownership by default for easier development with a secured impala. When object ownership is enabled: CREATE DATABASE will grant the user OWNER privileges to that database. ALTER DATABASE SET OWNER will transfer the OWNER privileges to the new owner. DROP DATABASE will revoke the OWNER privileges from the owner. This will apply to DATABASE, TABLE, and VIEW. Example: If ownership is enabled, when a table is created, the creator is the owner, and Sentry will create owner privileges for the created table so the user can continue working with it without waiting for Sentry refresh. Inserts will be available immediately. Testing: - Created new custom cluster tests for object ownership Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 --- M bin/create-test-configuration.sh M bin/impala-config.sh M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableOrViewSetOwnerStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java M fe/src/main/java/org/apache/impala/analysis/DropDbStmt.java M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/SentryPolicyService.java M fe/src/main/java/org/apache/impala/util/SentryProxy.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M fe/src/test/resources/mysql-hive-site.xml.template M fe/src/test/resources/postgresql-hive-site.xml.template M fe/src/test/resources/sentry-site.xml.template A fe/src/test/resources/sentry-site_no_oo.xml.template A fe/src/test/resources/sentry-site_oo_nogrant.xml.template M testdata/bin/run-sentry-service.sh A tests/authorization/test_owner_privileges.py M tests/common/custom_cluster_test_suite.py M tests/common/impala_test_suite.py 31 files changed, 1,044 insertions(+), 41 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/11314/24 -- To view, visit http://gerrit.cloudera.org:8080/11314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 Gerrit-Change-Number: 11314 Gerrit-PatchSet: 24 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER
Adam Holley has posted comments on this change. ( http://gerrit.cloudera.org:8080/11314 ) Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER .. Patch Set 23: (16 comments) http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java: http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java@54 PS23, Line 54: authorization is enabled > I got confused by what's supposed to happen if its enabled, but isObjectOwn Done http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1002 PS23, Line 1002: updateDatabasePrivileges(newDb.getName(), null, params.server_name, null, null, > with so many args, it would be easier to read if the params were commented, Done http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1840 PS23, Line 1840: if (user == null) { > looks like there's a race here (two operations see that owner doesn't exist Done http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2878 PS23, Line 2878: .removePrivilege(privilege); > is there a race here as well? Duplicate removes should basically noop the second one. http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3196 PS23, Line 3196: isObjectOwnershipEnabled > after seeing this method, I'm wondering if this can be pulled up to analysi Moved to updateDatabasePrivileges http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/test/resources/sentry-site_no_oo.xml.template File fe/src/test/resources/sentry-site_no_oo.xml.template: http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/test/resources/sentry-site_no_oo.xml.template@76 PS23, Line 76: enable owner > in this case, disable? Done http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/test/resources/sentry-site_oo_nogrant.xml.template File fe/src/test/resources/sentry-site_oo_nogrant.xml.template: http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/test/resources/sentry-site_oo_nogrant.xml.template@22 PS23, Line 22: sentry.service.security.mode > add comments to highlight which of these settings pertain to "oo" and which Done http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py File tests/authorization/test_owner_privileges.py: http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@70 PS23, Line 70: ImpalaTestSuite > is this needed since CustomClusterTestSuite inherits already from ImpalaTes Done http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@81 PS23, Line 81: Using user 'root > I missed this.. where is 'root' explicitly used? Adding a comment at the top to address. http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@87 PS23, Line 87: self.execute_query("create role owner_priv_test_ROOT") > can things be simplified if you create a unique id per test in the test mat Adding a comment at the top to address. Basically need "root" because every system the tests run on should have a user root with group root. http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@109 PS23, Line 109: self.execute_query("create role owner_priv_admin") : > why are these added when cleanup is supposed to remove things? We create a role and grant it privileges so we can clean up otherwise we might not have access. We don't want to rely on roles being in a state at the end of tests. http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@156 PS23, Line 156: def test_owner_privileges_with_grant(self, vector): > if you use the unique_database fixture here, all objects (db, table, view) Can't use that since I want to create the database after the cluster starts up. The unique_database creates the database before the test so owner privileges aren't created. I tried adding 'if not exists' to the create and didn't work. http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@172 PS23, Line 172: self.root_impalad_client = self.create_impala_client() : s > how do these two differ? Adding a comment at the top to address. http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@248 PS23, Line 248: ser="root" > maybe the comment about 'root' up on L81 is relevant here? Addi
[Impala-ASF-CR] IMPALA-5031: Allow UBSAN to be set on codegen
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11380 ) Change subject: IMPALA-5031: Allow UBSAN to be set on codegen .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/611/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11380 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I054e78dd172ee140f2095a523595ff030494e560 Gerrit-Change-Number: 11380 Gerrit-PatchSet: 2 Gerrit-Owner: Jim Apple Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 07 Sep 2018 03:50:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5031: Allow UBSAN to be set on codegen
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11380 ) Change subject: IMPALA-5031: Allow UBSAN to be set on codegen .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/11380/2/be/src/codegen/llvm-codegen.h File be/src/codegen/llvm-codegen.h: http://gerrit.cloudera.org:8080/#/c/11380/2/be/src/codegen/llvm-codegen.h@77 PS2, Line 77: // find be/src -type f -execdir sed -i s/DCHECK_EQ\(replaced,\ /DCHECK_REPLACE_COUNT\(replaced,\ /g {} \; line too long (105 > 90) -- To view, visit http://gerrit.cloudera.org:8080/11380 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I054e78dd172ee140f2095a523595ff030494e560 Gerrit-Change-Number: 11380 Gerrit-PatchSet: 2 Gerrit-Owner: Jim Apple Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 07 Sep 2018 03:49:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7540. Intern most repetitive strings and network addresses in catalog
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/11158 ) Change subject: IMPALA-7540. Intern most repetitive strings and network addresses in catalog .. Patch Set 2: Forgot to add, can we add some fe tests to check these fields are getting interned. -- To view, visit http://gerrit.cloudera.org:8080/11158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib3121aefa4391bcb1477d9dba0a49440d7000d26 Gerrit-Change-Number: 11158 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 07 Sep 2018 03:26:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5031: Allow UBSAN to be set on codegen
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/11380 ) Change subject: IMPALA-5031: Allow UBSAN to be set on codegen .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/11380/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11380/1//COMMIT_MSG@12 PS1, Line 12: slows down testing significantly, it is not made the default when > Might be worth mentioning that it only works on C++ code cross-compiled to Done http://gerrit.cloudera.org:8080/#/c/11380/1/be/src/common/logging.h File be/src/common/logging.h: http://gerrit.cloudera.org:8080/#/c/11380/1/be/src/common/logging.h@49 PS1, Line 49: /// Define verbose logging levels. Per-row logging is more verbase than per-file / > This might belong better in a codegen-related header? It feels very specifi Done -- To view, visit http://gerrit.cloudera.org:8080/11380 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I054e78dd172ee140f2095a523595ff030494e560 Gerrit-Change-Number: 11380 Gerrit-PatchSet: 2 Gerrit-Owner: Jim Apple Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 07 Sep 2018 03:12:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5031: Allow UBSAN to be set on codegen
Hello Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11380 to look at the new patch set (#2). Change subject: IMPALA-5031: Allow UBSAN to be set on codegen .. IMPALA-5031: Allow UBSAN to be set on codegen This adds a new buildall options, -full_ubsan, that is stronger than just -ubsan, in that it forces code generated by cross compilation to LLVM IR to use the undefined behavior sanitizer as well. Because this slows down testing significantly, it is not made the default when using -ubsan. Change-Id: I054e78dd172ee140f2095a523595ff030494e560 --- M be/CMakeLists.txt M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scanner.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/select-node.cc M be/src/exec/topn-node.cc M be/src/exec/union-node.cc M bin/make_impala.sh M buildall.sh 13 files changed, 76 insertions(+), 42 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/11380/2 -- To view, visit http://gerrit.cloudera.org:8080/11380 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I054e78dd172ee140f2095a523595ff030494e560 Gerrit-Change-Number: 11380 Gerrit-PatchSet: 2 Gerrit-Owner: Jim Apple Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7540. Intern most repetitive strings and network addresses in catalog
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/11158 ) Change subject: IMPALA-7540. Intern most repetitive strings and network addresses in catalog .. Patch Set 2: (12 comments) http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java File fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java: http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@62 PS2, Line 62: public static void internFieldsInPlace( nit: for each method, should we list the set of fields that we selectively intern? (here and below internFieldsInPlace()? http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@88 PS2, Line 88: internFieldsInPlace(table.getClustering_columns()); null check for table http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@113 PS2, Line 113: if (sd.isSetCols()) { null check for sd. http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@131 PS2, Line 131: if (fs.isSetName()) { null check http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@155 PS2, Line 155: "-1".equals(val) || : "0".equals(val) || : val.isEmpty() || : "true".equalsIgnoreCase(val) || : "false".equalsIgnoreCase(val) || : "TASK".equals(val) || : val.startsWith("impala_")) Can we create a static HashSet out of these (lowercased) and use something like if (hashSet.contains(val.lower( Easier to extend I think http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@172 PS2, Line 172: e.getKey().intern() Maybe move everything to guava interner to be consistent? Native interning is slow anyway (as per the blog post) http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@174 PS2, Line 174: return ret; Preconditions.checkState(ret.size() == parameters.size())? http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@188 PS2, Line 188: internColumnFamily can we rename it to something like internString() that we can reuse for all Strings? http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@198 PS2, Line 198: // but it's likely someone would trip over one of these. This looks risky if someone updates the thrift def for TNetworkAddress :-) http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/HBaseColumn.java File fe/src/main/java/org/apache/impala/catalog/HBaseColumn.java: http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/HBaseColumn.java@35 PS2, Line 35: columnQualifier just curious, any reason why not this? http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@853 PS2, Line 853: hmsParameters_ = CatalogInterners.internParameters( : msPartition.getParameters()); Shouldn't we do it after we remove the incremental stats ? (or) blacklist incremental stats from getting interned. http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1203 PS2, Line 1203: intern just curious, why not intern everything via Guava? -- To view, visit http://gerrit.cloudera.org:8080/11158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib3121aefa4391bcb1477d9dba0a49440d7000d26 Gerrit-Change-Number: 11158 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 07 Sep 2018 03:11:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7047. Refreshing partitions should not make an RPC per file
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11227 ) Change subject: IMPALA-7047. Refreshing partitions should not make an RPC per file .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/610/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11227 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2051b96599206164aaa06ecbdf64374c46eda956 Gerrit-Change-Number: 11227 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 02:35:53 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@32 PS3, Line 32: breaking down by category? still hoping to get some feedback on the above http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@189 PS3, Line 189: private static final String DB_LIST_STATS_CATEGORY = "DatabaseList"; > Is it worth having some sort of enum for these keys? Maybe it will make the yea I think it'll add quite some boiler plate without any real gains in safety since the profiles store this stuff in a string map anyway http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java File fe/src/main/java/org/apache/impala/service/FrontendProfile.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151 PS3, Line 151: oldThreadLocalValue_ = THREAD_LOCAL.get(); > Can you explain what is going on here? Is there a case where oldThreadLocal Just felt like it's good practice to treat scopes as nestable and "restore" the prior state after the scope is popped out of. Currently we have a pretty strict 1:1 query:thread kind of mapping but maybe in the future that might not be the case. I can add a Preconditions.checkState that it starts as null prior to entering the scope if you think that's cleaner until we have some fancier use cases? -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 02:22:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7047. Refreshing partitions should not make an RPC per file
Hello Bharath Vissapragada, Impala Public Jenkins, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11227 to look at the new patch set (#2). Change subject: IMPALA-7047. Refreshing partitions should not make an RPC per file .. IMPALA-7047. Refreshing partitions should not make an RPC per file The code to handle REFRESH of a single partition was incorrectly ignoring the previously-known file descriptors. This meant that, instead of only calling 'getFileBlockLocations' on the files that had changed since the prior load, it was instead calling it on every file. In addition to refresh of single partitions this also affected refresh of unpartitioned tables (which is implemented as a refresh of its single "default" partition). This patch fixes the behavior by copying over the existing file descriptor list into the re-created partition before refreshing it. A new unit test uses FS statistics to verify the change. The new assertions act as a regression test and fail if I comment out the fix. I also tested this by pointing my dev box at a remote filesystem that was approximately 60ms away. The initial load of an unpartitioned table with approximately 45000 files takes around 23 seconds in this setup. Without the patch in place, REFRESH was taking upwards of 35 minutes (I got tired and gave up at this point). Multiplying the 60ms round trip by 45000 files estimates 45 minutes. With the fix in place, REFRESH of the same table took around 4.5 seconds. Clearly, in typical setups where catalogd and HDFS are on a shared local network, the gains won't be so dramatic. But, even with a 1ms round trip (plausible when including fixed RPC overhead and potentially congested datacenter networks) this would save 45 seconds on this example table with 45000 files. Change-Id: I2051b96599206164aaa06ecbdf64374c46eda956 --- M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java 2 files changed, 44 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/11227/2 -- To view, visit http://gerrit.cloudera.org:8080/11227 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2051b96599206164aaa06ecbdf64374c46eda956 Gerrit-Change-Number: 11227 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7047. Refreshing partitions should not make an RPC per file
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11227 ) Change subject: IMPALA-7047. Refreshing partitions should not make an RPC per file .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3128/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/11227 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2051b96599206164aaa06ecbdf64374c46eda956 Gerrit-Change-Number: 11227 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 02:06:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7047. Refreshing partitions should not make an RPC per file
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11227 ) Change subject: IMPALA-7047. Refreshing partitions should not make an RPC per file .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/11227/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/11227/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1256 PS1, Line 1256:* Updates the file metadata of an unpartitioned HdfsTable. > Maybe we should add that this is optimized for incremental load. (Full load Done http://gerrit.cloudera.org:8080/#/c/11227/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1264 PS1, Line 1264: resetPartitions(); : org.apache.hadoop.hive.metastore.api.Table msTbl = getMetaStoreTable(); : Preconditions.checkNotNull(msTbl); : setPrototypePartition(msTbl.getSd()); : HdfsPartition part = createPartition(msTbl.getSd(), null, new FsPermissionCache()); > ok. Can you add a comment for this? Done http://gerrit.cloudera.org:8080/#/c/11227/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1269 PS1, Line 1269: > ... Done http://gerrit.cloudera.org:8080/#/c/11227/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2068 PS1, Line 2068: refreshedPartition.setFileDescriptors(oldPartition.getFileDescriptors()); > same here, can't we directly call refreshpartitionFileMetadata(oldPartition added same comment as above -- To view, visit http://gerrit.cloudera.org:8080/11227 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2051b96599206164aaa06ecbdf64374c46eda956 Gerrit-Change-Number: 11227 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 01:58:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11224 ) Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd .. Patch Set 7: (8 comments) Just a few small notes remaining from me. http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java File fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java: http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@175 PS4, Line 175: icted because o > The call site of this function is guarded with a try clause so I think it's I don't think relying on catching NullPointerException is the right approach, though -- that would be very concerning for an operator to see "Unexpected exception: NullPointerException" in the logs. Under what circumstance does it return null? Perhaps we should return false with a warning if it does? http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java File fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java: http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@75 PS7, Line 75: AtomicLong triggerCount_ = new AtomicLong(); probably worth a javadoc on this because it wasn't entirely clear to me what it meant. When I first saw it, I thought it was the number of times a table had been invalidated due to this class, but actually it looks like it's just the number of times the background thread woke up and did a scan. Maybe 'scanCount_' would be better? http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@150 PS7, Line 150: if (!(gcbean instanceof NotificationListener)) { I think this is now misplaced -- shouldn't this be inside the loop after the check for 'Old'? (but above the setting of member vars, as Vuk caught) http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@225 PS7, Line 225: "Table " + table.getFullName() + " is invalidated nit: I think "Invalidated table foo.bar" is clearer than "foo.bar is invalidated" (or as you had it before with no "is") http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@273 PS7, Line 273: ly" + nit: missing space http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@274 PS7, Line 274: w nit: capitalize. http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java File fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java: http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java@55 PS7, Line 55: atalog_ : .invalidateTable(new TTableName(dbName, tblName), : tblWasRemoved, dbWasAdded); nit: can now reformat this code in a less strange way http://gerrit.cloudera.org:8080/#/c/11224/7/tests/custom_cluster/test_automatic_invalidation.py File tests/custom_cluster/test_automatic_invalidation.py: http://gerrit.cloudera.org:8080/#/c/11224/7/tests/custom_cluster/test_automatic_invalidation.py@45 PS7, Line 45: if "Table functional.alltypes is invalidated due to inactivity" in \ We have self.assert_impalad_log_contains which does this in a bit nicer way. Also, I think this loop should eventually give up rather than causing a test timeout if it never shows up. Check out test_restart_catalogd in test_local_catalog.py for an example. -- To view, visit http://gerrit.cloudera.org:8080/11224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89 Gerrit-Change-Number: 11224 Gerrit-PatchSet: 7 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 01:35:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11387 ) Change subject: IMPALA-6568 add missing Query Compilation section to profiles. .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 Gerrit-Change-Number: 11387 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 07 Sep 2018 01:00:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11224 ) Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd .. Patch Set 7: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/609/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/11224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89 Gerrit-Change-Number: 11224 Gerrit-PatchSet: 7 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 23:57:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd
Tianyi Wang has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/11224 ) Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd .. IMPALA-7448: Invalidate recently unused tables from catalogd This patch implements an automatic invalidation mechanism in catalogd. There are two invalidation strategies: 1. Periodically the HDFS tables that are not used in a configured period "invalidate_tables_timeout_s" is invalidated from catalogd. 2. If the old GC generation is almost full, a certain percentage of LRU tables are invalidated. This can be enabled by backend flag "invalidate_tables_on_memory_pressure". The table usage is reported by impalad to catalogd when the tables are used during planning. Tests on time-based invalidation are added. It is manually verified that the GC callback is called if strings are randomly stuffed into catalogd. Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89 --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-service-client-wrapper.h M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M be/src/common/global-flags.cc M be/src/exec/catalog-op-executor.cc M be/src/exec/catalog-op-executor.h M be/src/service/fe-support.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java A fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/View.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/FeSupport.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java A fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java A tests/custom_cluster/test_automatic_invalidation.py 28 files changed, 804 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/11224/7 -- To view, visit http://gerrit.cloudera.org:8080/11224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89 Gerrit-Change-Number: 11224 Gerrit-PatchSet: 7 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/11224 ) Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd .. Patch Set 4: (18 comments) http://gerrit.cloudera.org:8080/#/c/11224/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11224/4//COMMIT_MSG@19 PS4, Line 19: is > nit: are Done http://gerrit.cloudera.org:8080/#/c/11224/4//COMMIT_MSG@20 PS4, Line 20: randomly stuffed into catalogd > what does this mean? I wrote a thread that kept creating stringbuilders. The notification listener is called and the memory pool names are as expected. http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@210 PS4, Line 210: query > only queries? changed to SQL statements http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@211 PS4, Line 211: evict > just clarifying that this knob and the memory ones are independent. in othe Done http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@212 PS4, Line 212: unused > nit: remove Done http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@225 PS4, Line 225: invalidate_tables_fraction_on_memory_pressure > I understand this knob to establish a target lower bound for an eviction ev We don't know how much memory is released until a GC. http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@226 PS4, Line 226: the numbers of > remove Done http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/exec/catalog-op-executor.h File be/src/exec/catalog-op-executor.h: http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/exec/catalog-op-executor.h@72 PS4, Line 72: table names > nit: simplify to just tables Done http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2000 PS4, Line 2000:* Set the last used time of specified tables to now. > Because of the thrift spec, I was expecting to see usage count being used s Done http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java File fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java: http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@46 PS4, Line 46: unused_tbl_ttl_sec > stale? Done http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@62 PS4, Line 62: shrinking > eviction Done http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@89 PS4, Line 89: private long lastInvalidationTime; > add an "_". Done http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@116 PS4, Line 116: BackendConfig.INSTANCE.getInvalidateTablesGcOldGenFullThreshold(), : B > are these two validated anywhere (e.g., [0,1])? added a check http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@140 PS4, Line 140: poolName > is there somewhat stable site to link to where these names are given, perha I can't find any. http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@146 PS4, Line 146: if (!(gcbean instanceof NotificationListener)) { > why not do this check before setting various class members above? Done http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@175 PS4, Line 175: getLastGcInfo() > just starting here with the doc for this method, I see that null can be ret The call site of this function is guarded with a try clause so I think it's safe http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java File fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java: http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@94 PS4, Line 94: Random random = new Random(); > put this outside of the while loop. Done http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@656 PS4, Line 656: load > what rule are you using to pair the load with
[Impala-ASF-CR] IMPALA-5937: [DOCS] PARQUET READ STATISTICS and PARQUET DICTIONARY FILTERING
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11355 ) Change subject: IMPALA-5937: [DOCS] PARQUET_READ_STATISTICS and PARQUET_DICTIONARY_FILTERING .. Patch Set 5: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/68/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/11355 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I88fa8c4a64560711251076c50e1695f7f032f9c0 Gerrit-Change-Number: 11355 Gerrit-PatchSet: 5 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Thu, 06 Sep 2018 23:00:13 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@189 PS3, Line 189: private static final String DB_LIST_STATS_CATEGORY = "DatabaseList"; Is it worth having some sort of enum for these keys? Maybe it will make the code more untidy? http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java File fe/src/main/java/org/apache/impala/service/FrontendProfile.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151 PS3, Line 151: oldThreadLocalValue_ = THREAD_LOCAL.get(); Can you explain what is going on here? Is there a case where oldThreadLocalValue_ is not null? -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 22:54:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5937: [DOCS] PARQUET READ STATISTICS and PARQUET DICTIONARY FILTERING
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/11355 ) Change subject: IMPALA-5937: [DOCS] PARQUET_READ_STATISTICS and PARQUET_DICTIONARY_FILTERING .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_dictionary_filtering.xml File docs/topics/impala_parquet_dictionary_filtering.xml: http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_dictionary_filtering.xml@87 PS4, Line 87: The NumDictFilteredRowGroups field in the query profile output shows the > We should mention somewhere that row groups can be filtered out by parquet Done http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_read_statistics.xml File docs/topics/impala_parquet_read_statistics.xml: http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_read_statistics.xml@49 PS4, Line 49: Parquet stores min/max stats which can be used to skip reading blocks if they don't > The stats are (currently) used to skip row groups. "qualify a predicate" so Updated with a different wording http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_read_statistics.xml@61 PS4, Line 61: Of the numerical types: bool, integer, floating point > Impala also supports other data types, but there are restrictions around th Can I have the version info for the old and new statistics? -- To view, visit http://gerrit.cloudera.org:8080/11355 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I88fa8c4a64560711251076c50e1695f7f032f9c0 Gerrit-Change-Number: 11355 Gerrit-PatchSet: 4 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Thu, 06 Sep 2018 22:53:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5937: [DOCS] PARQUET READ STATISTICS and PARQUET DICTIONARY FILTERING
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11355 ) Change subject: IMPALA-5937: [DOCS] PARQUET_READ_STATISTICS and PARQUET_DICTIONARY_FILTERING .. Patch Set 5: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/68/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/11355 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I88fa8c4a64560711251076c50e1695f7f032f9c0 Gerrit-Change-Number: 11355 Gerrit-PatchSet: 5 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Thu, 06 Sep 2018 22:53:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5937: [DOCS] PARQUET READ STATISTICS and PARQUET DICTIONARY FILTERING
Hello Lars Volker, Fredy Wijaya, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11355 to look at the new patch set (#5). Change subject: IMPALA-5937: [DOCS] PARQUET_READ_STATISTICS and PARQUET_DICTIONARY_FILTERING .. IMPALA-5937: [DOCS] PARQUET_READ_STATISTICS and PARQUET_DICTIONARY_FILTERING Change-Id: I88fa8c4a64560711251076c50e1695f7f032f9c0 --- M docs/impala.ditamap M docs/topics/impala_parquet.xml A docs/topics/impala_parquet_dictionary_filtering.xml A docs/topics/impala_parquet_read_statistics.xml 4 files changed, 269 insertions(+), 47 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/11355/5 -- To view, visit http://gerrit.cloudera.org:8080/11355 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I88fa8c4a64560711251076c50e1695f7f032f9c0 Gerrit-Change-Number: 11355 Gerrit-PatchSet: 5 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-7541. Avoid initializing Metrics for IncompleteTables
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11393 ) Change subject: IMPALA-7541. Avoid initializing Metrics for IncompleteTables .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/11393/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/11393/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1194 PS1, Line 1194: initMetrics(); > Doesn't this reset the metrics even for incremental loads? Also doesn't it ah, indeed you're right, I think I need to find a different way to accomplish this patch. I didn't run tests before posting it for review :) -- To view, visit http://gerrit.cloudera.org:8080/11393 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id0bcaa9c45a8cf75266d4b7c16cc14f0fd669b92 Gerrit-Change-Number: 11393 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 06 Sep 2018 22:36:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7541. Avoid initializing Metrics for IncompleteTables
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/11393 ) Change subject: IMPALA-7541. Avoid initializing Metrics for IncompleteTables .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/11393/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/11393/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1194 PS1, Line 1194: initMetrics(); Doesn't this reset the metrics even for incremental loads? Also doesn't it hit the Preconditions check? -- To view, visit http://gerrit.cloudera.org:8080/11393 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id0bcaa9c45a8cf75266d4b7c16cc14f0fd669b92 Gerrit-Change-Number: 11393 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 06 Sep 2018 21:48:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11314 ) Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER .. Patch Set 23: (16 comments) http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java: http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java@54 PS23, Line 54: authorization is enabled I got confused by what's supposed to happen if its enabled, but isObjectOwnershipEnabled is false. What will be the end-state of the various systems under the four combinations here? http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1002 PS23, Line 1002: updateDatabasePrivileges(newDb.getName(), null, params.server_name, null, null, with so many args, it would be easier to read if the params were commented, e.g., (..., /*tableName*/null, ...), in particular for nulls and booleans. http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1840 PS23, Line 1840: if (user == null) { looks like there's a race here (two operations see that owner doesn't exist, one will go first and then the other will overwrite)... I see that addUser will overwrite so the perhaps the race is a no-op. however, it differs from the intent here so just wondering if anything unintended crops up here? for example, two operations for the same user will succeed so externally, will it look like the same user was added twice? if there is an assumption about locking from the caller, pls state it. http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2878 PS23, Line 2878: .removePrivilege(privilege); is there a race here as well? http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3196 PS23, Line 3196: isObjectOwnershipEnabled after seeing this method, I'm wondering if this can be pulled up to analysis? why let one of those alter commands get to this point when sentry object ownership is not enabled? same question for the method below as well. if the main explanation is that something is optional here-- is that spelled out somewhere? http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/test/resources/sentry-site_no_oo.xml.template File fe/src/test/resources/sentry-site_no_oo.xml.template: http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/test/resources/sentry-site_no_oo.xml.template@76 PS23, Line 76: enable owner in this case, disable? http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/test/resources/sentry-site_oo_nogrant.xml.template File fe/src/test/resources/sentry-site_oo_nogrant.xml.template: http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/test/resources/sentry-site_oo_nogrant.xml.template@22 PS23, Line 22: sentry.service.security.mode add comments to highlight which of these settings pertain to "oo" and which pertain to "nogrant". pls add the same to the just "oo" sentry file as well. I have no idea which of these specify the intent of the file. http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py File tests/authorization/test_owner_privileges.py: http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@70 PS23, Line 70: ImpalaTestSuite is this needed since CustomClusterTestSuite inherits already from ImpalaTestSuite? http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@81 PS23, Line 81: Using user 'root I missed this.. where is 'root' explicitly used? http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@87 PS23, Line 87: self.execute_query("create role owner_priv_test_ROOT") can things be simplified if you create a unique id per test in the test matrix? I see that all tests are marked to run serially, but in any event, I'm worried about leakage to other tests, or within this test. http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@109 PS23, Line 109: self.execute_query("create role owner_priv_admin") : why are these added when cleanup is supposed to remove things? http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@156 PS23, Line 156: def test_owner_privileges_with_grant(self, vector): if you use the unique_database fixture here, all objects (db, table, view) will be cleaned up. The name can be pa
[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11387 ) Change subject: IMPALA-6568 add missing Query Compilation section to profiles. .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3126/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/11387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 Gerrit-Change-Number: 11387 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 06 Sep 2018 21:41:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7540. Intern most repetitive strings and network addresses in catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11158 ) Change subject: IMPALA-7540. Intern most repetitive strings and network addresses in catalog .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib3121aefa4391bcb1477d9dba0a49440d7000d26 Gerrit-Change-Number: 11158 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 06 Sep 2018 21:11:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7335: Fix a race in HdfsScanNode
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11337 ) Change subject: IMPALA-7335: Fix a race in HdfsScanNode .. Patch Set 5: We should go ahead with the previous fix if the refactor is more complex than expected - sorry to send you down the wrong path there. -- To view, visit http://gerrit.cloudera.org:8080/11337 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id470818846a5c69ad28a6ff695069702982aa793 Gerrit-Change-Number: 11337 Gerrit-PatchSet: 5 Gerrit-Owner: Pooja Nilangekar Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 06 Sep 2018 20:40:48 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-7178: Add the possibility to reduce logging for common data errors
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/10793 ) Change subject: WIP IMPALA-7178: Add the possibility to reduce logging for common data errors .. Patch Set 1: Csaba, any updates here? -- To view, visit http://gerrit.cloudera.org:8080/10793 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie3b7c1fd020a7ba5e0d9c619e1b67236dce198aa Gerrit-Change-Number: 10793 Gerrit-PatchSet: 1 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 06 Sep 2018 20:21:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 12: (2 comments) http://gerrit.cloudera.org:8080/#/c/10855/12/tests/custom_cluster/test_rpc_timeout.py File tests/custom_cluster/test_rpc_timeout.py: http://gerrit.cloudera.org:8080/#/c/10855/12/tests/custom_cluster/test_rpc_timeout.py@42 PS12, Line 42: flake8: E251 unexpected spaces around keyword / parameter equals http://gerrit.cloudera.org:8080/#/c/10855/12/tests/custom_cluster/test_rpc_timeout.py@42 PS12, Line 42: flake8: E251 unexpected spaces around keyword / parameter equals -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 12 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 06 Sep 2018 18:26:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 12: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/608/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 12 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 06 Sep 2018 18:23:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7540. Intern most repetitive strings and network addresses in catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11158 ) Change subject: IMPALA-7540. Intern most repetitive strings and network addresses in catalog .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3125/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/11158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib3121aefa4391bcb1477d9dba0a49440d7000d26 Gerrit-Change-Number: 11158 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 06 Sep 2018 18:04:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7522: Fix overflow in milliseconds add()
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11390 ) Change subject: IMPALA-7522: Fix overflow in milliseconds_add() .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11390 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I491389257d6560873d942ac4f9a0fd836cc216da Gerrit-Change-Number: 11390 Gerrit-PatchSet: 1 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 06 Sep 2018 18:04:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7541. Avoid initializing Metrics for IncompleteTables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11393 ) Change subject: IMPALA-7541. Avoid initializing Metrics for IncompleteTables .. Patch Set 1: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3123/ -- To view, visit http://gerrit.cloudera.org:8080/11393 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id0bcaa9c45a8cf75266d4b7c16cc14f0fd669b92 Gerrit-Change-Number: 11393 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 06 Sep 2018 18:04:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5031: Allow UBSAN to be set on codegen
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11380 ) Change subject: IMPALA-5031: Allow UBSAN to be set on codegen .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/11380/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11380/1//COMMIT_MSG@12 PS1, Line 12: is not made the default when using -ubsan. Might be worth mentioning that it only works on C++ code cross-compiled to LLVM IR. I.e. not on IrBuilder generated IR. You can sort of infer this since IrBuilder code isn't C++ so the concept doesn't apply, but good to be explicit. http://gerrit.cloudera.org:8080/#/c/11380/1/be/src/common/logging.h File be/src/common/logging.h: http://gerrit.cloudera.org:8080/#/c/11380/1/be/src/common/logging.h@49 PS1, Line 49: // The number of function calls replaced is not knowable when UBSAN is enabled, since it This might belong better in a codegen-related header? It feels very specific to that use case. llvm-codegen.h? -- To view, visit http://gerrit.cloudera.org:8080/11380 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I054e78dd172ee140f2095a523595ff030494e560 Gerrit-Change-Number: 11380 Gerrit-PatchSet: 1 Gerrit-Owner: Jim Apple Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 06 Sep 2018 17:49:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 12: (14 comments) http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/exec/hdfs-parquet-table-writer.h File be/src/exec/hdfs-parquet-table-writer.h: http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/exec/hdfs-parquet-table-writer.h@199 PS10, Line 199: > nit: should #include the appropriate .pb.h here ("include-what-you-use") Done http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.h File be/src/runtime/coordinator-backend-state.h: http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.h@60 PS10, Line 60: const Coordinator& > I think this can probably change back to being const if you take the sugges Done http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.h@176 PS10, Line 176: last_report_ti > nit: I think the term "sequence number" is more usual here -- "version" to Done http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.h@220 PS10, Line 220: /// Backend exec params, owned by the QuerySchedule and has query lifetime. > This "back pointer" still seems error-prone to me. I think the object lifet Done http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.cc@267 PS10, Line 267: return num_remaining_instances_ == 0 || !status_.ok(); > I think a VLOG_QUERY about the skipped RPC is probably useful Done http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.cc@294 PS10, Line 294: DCHECK(!instance_stats->done_); > nit: why not: The ctor was marked explicit so not sure it's allowed: "explicit Status(const StatusPB& status);" http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc@287 PS10, Line 287: atus = report > Ah, I missed that we join on the reporter thread first. Good idea about using DFAKE_MUTEX(). Also switched to using a non-atomic. Also simplified the logic in Coordinator::BackendState::ApplyExecStatusReport() as we can rely purely on the sequence number as you suggested. http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc@375 PS10, Line 375: ReportExecStatusResponsePB resp; > should we have a failure injection point on the RPC itself? I only saw fail Please find the tests in test_rpc_timeout.py which: 1. inject delays in the RPC handler to induce timeout 2. run with a very short service queue to emulate a busy server. http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc@379 PS10, Line 379: reak; > should we backoff? I will refrain from changing the logic here too much. There will be a follow up patch after IMPALA-4063 which will change the retry logic. TODO added. http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/runtime-state.cc@202 PS10, Line 202: } > the method doc says that new_errors is cleared, but it's actually written i This was lost after refactoring this function. Fixed now. http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/service/control-service.cc File be/src/service/control-service.cc: PS10: > This is a general krpc-in-Impala question: I can't find where you set up au Very good point. This is definitely a bug and it's now fixed in this commit here (https://github.com/apache/impala/commit/5c541b960491ba91533712144599fb3b6d99521d) http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/util/uid-util.h File be/src/util/uid-util.h: http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/util/uid-util.h@79 PS10, Line 79: DCHECK(uid_pb.IsInitialized()); > worth DCHECKs here that the fields are set by calling uid_pb.IsInitialized( Done http://gerrit.cloudera.org:8080/#/c/10855/10/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/10855/10/bin/impala-config.sh@562 PS10, Line 562: export HBASE_CONF_DIR="$IMPALA_FE_DIR/src/test/resources" > why's this necessary? Can we change cmake to invoke it from the full path i FindProtobuf should have set PROTOBUF_PROTOC_EXECUTABLE. Not sure why I needed to set it before. http://gerrit.cloudera.org:8080/#/c/10855/10/tests/custom_cluster/test_rpc_exception.py File tests/custom_cluster/test_rpc_exception.py: http://gerrit.cloudera.org:8080/#/c/10855/10/tests/custom_cluster/test_rpc_exception.py@97 PS10, Line 97: > can we change this flag to be in millis instead of seconds? Or do we advert I don't
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10855 to look at the new patch set (#12). Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. This patch also fixes IMPALA-7241 which may lead to duplicated dml stats being applied. The fix is by adding a monotonically increasing version number for fragment instances' reports. The coordinator will ignore any report smaller than or equal to the version in the last report. Testing done: 1. Exhaustive build. 2. Added some targeted test cases for profile serialization failure and RPC retries/timeout. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/common/global-flags.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/rpc/jni-thrift-util.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/uid-util.h M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift M tests/custom_cluster/test_rpc_timeout.py 59 files changed, 1,155 insertions(+), 695 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/12 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 12 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-7516: Fix query location accounting
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11372 ) Change subject: IMPALA-7516: Fix query location accounting .. Patch Set 1: Code-Review+2 (1 comment) Thanks for fixing. Looks to to submit after the flake warning and my comment are addressed. http://gerrit.cloudera.org:8080/#/c/11372/1/tests/custom_cluster/test_admission_controller.py File tests/custom_cluster/test_admission_controller.py: http://gerrit.cloudera.org:8080/#/c/11372/1/tests/custom_cluster/test_admission_controller.py@674 PS1, Line 674: admission results that can affect it.""" Maybe reference the JIRA that this is a regression test? -- To view, visit http://gerrit.cloudera.org:8080/11372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I66c8e59299747df57c9f39db1cb1f46ff6bbab7e Gerrit-Change-Number: 11372 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 06 Sep 2018 17:40:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7519: Support elliptic curve ssl ciphers
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11376 ) Change subject: IMPALA-7519: Support elliptic curve ssl ciphers .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/607/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11376 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1666ceabec51b425e8a82be1cf519e2ac35fa5a6 Gerrit-Change-Number: 11376 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 06 Sep 2018 17:39:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11359 ) Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider .. Patch Set 8: Looks like the compile-only job didn't trigger for this. I'm confident it's just whitespace so I'll just commit. -- To view, visit http://gerrit.cloudera.org:8080/11359 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53 Gerrit-Change-Number: 11359 Gerrit-PatchSet: 8 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 17:27:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider
Todd Lipcon has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11359 ) Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider .. IMPALA-7203. Support UDFs in CatalogdMetaProvider This adds support to fetch the list of function names within a database, and then to fetch the list of overloads for a given function name. These items are cached on the coordinator and invalidated when minimal function objects are seen on the minimal catalog topic stream. Aside from the straight-forward plumbing of this new RPC, it's worth noting that this patch changes the MetaProvider interface to provide Impala Functions directly instead of HMS Function objects. This means that we will now fully support all the types of functions supported in legacy catalogs. As such, test_udfs now passes. Making this change simplified the code in LocalDb but also means that DirectMetaProvider no longer supports fetching functions. When we move to trying to eliminate catalogd altogether at some point, we can revive this code. For now, I just throw an exception for the function-related code in DirectMetaProvider, as it's unused. The one other notable thing here is that TFunction now has optional fields where it used to have required ones. This was necessary in order to send invalidations as TCatalogObjects. Given that TFunction is an internal-only construct, this shouldn't raise compatibility issues. Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53 Reviewed-on: http://gerrit.cloudera.org:8080/11359 Reviewed-by: Todd Lipcon Tested-by: Todd Lipcon --- M common/thrift/CatalogService.thrift M common/thrift/Types.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java M fe/src/main/java/org/apache/impala/catalog/Function.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java 9 files changed, 232 insertions(+), 170 deletions(-) Approvals: Todd Lipcon: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/11359 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53 Gerrit-Change-Number: 11359 Gerrit-PatchSet: 9 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7519: Support elliptic curve ssl ciphers
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11376 to look at the new patch set (#2). Change subject: IMPALA-7519: Support elliptic curve ssl ciphers .. IMPALA-7519: Support elliptic curve ssl ciphers Thrift's SSLSocketFactory class does not support setting ciphers that use ecdh. This patch modifies our existing subclass of SSLSocketFactory to override the ciphers() method and enable ECDH. The code for this was taken from be/src/kudu/security/tls_context.cc Testing: - Added a custom cluster test that verifies that a cluster with only ECDH ciphers enabled works. Change-Id: I1666ceabec51b425e8a82be1cf519e2ac35fa5a6 --- M be/src/rpc/thrift-server.cc M tests/custom_cluster/test_client_ssl.py 2 files changed, 49 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/11376/2 -- To view, visit http://gerrit.cloudera.org:8080/11376 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1666ceabec51b425e8a82be1cf519e2ac35fa5a6 Gerrit-Change-Number: 11376 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11359 ) Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider .. Patch Set 8: Verified+1 Code-Review+2 Checked diff between r7 and r8 and it's just docs changes and some whitespace/comment diff from a revised parent patch. I'll commit this once the build-only check comes back. -- To view, visit http://gerrit.cloudera.org:8080/11359 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53 Gerrit-Change-Number: 11359 Gerrit-PatchSet: 8 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 16:54:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11314 ) Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER .. Patch Set 22: (1 comment) http://gerrit.cloudera.org:8080/#/c/11314/22/tests/authorization/test_owner_privileges.py File tests/authorization/test_owner_privileges.py: http://gerrit.cloudera.org:8080/#/c/11314/22/tests/authorization/test_owner_privileges.py@249 PS22, Line 249: " flake8: E501 line too long (92 > 90 characters) -- To view, visit http://gerrit.cloudera.org:8080/11314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 Gerrit-Change-Number: 11314 Gerrit-PatchSet: 22 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 16:51:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7541. Avoid initializing Metrics for IncompleteTables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11393 ) Change subject: IMPALA-7541. Avoid initializing Metrics for IncompleteTables .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3123/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/11393 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id0bcaa9c45a8cf75266d4b7c16cc14f0fd669b92 Gerrit-Change-Number: 11393 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 06 Sep 2018 16:28:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11314 ) Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER .. Patch Set 22: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/606/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 Gerrit-Change-Number: 11314 Gerrit-PatchSet: 22 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 16:21:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11314 ) Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER .. Patch Set 23: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/605/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 Gerrit-Change-Number: 11314 Gerrit-PatchSet: 23 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 16:11:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7542: find-fragment-instances misses to find the "root threads"
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/11396 ) Change subject: IMPALA-7542: find-fragment-instances misses to find the "root threads" .. Patch Set 1: Code-Review+1 (3 comments) http://gerrit.cloudera.org:8080/#/c/11396/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11396/1//COMMIT_MSG@7 PS1, Line 7: IMPALA-7542: find-fragment-instances misses to find the "root threads" nit: Please make the short description about what this change does, not what the bug was, e.g. "fix find-fragment-instances to find all root threads". http://gerrit.cloudera.org:8080/#/c/11396/1//COMMIT_MSG@19 PS1, Line 19: I tested it locally on a core file. I think it would be great to have a test for this script, both to show how to use it and because it's somewhat hard to test. We don't need to add one in this change though, but I'll file a subsequent jira. http://gerrit.cloudera.org:8080/#/c/11396/1/lib/python/impala_py_lib/gdb/impala-gdb.py File lib/python/impala_py_lib/gdb/impala-gdb.py: http://gerrit.cloudera.org:8080/#/c/11396/1/lib/python/impala_py_lib/gdb/impala-gdb.py@47 PS1, Line 47: v nit: Can you rename this variable to something more descriptive, maybe tdi? I can't think of what "v" means. -- To view, visit http://gerrit.cloudera.org:8080/11396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I35ae1a6b384b002b343689469f02ceabd84af1b6 Gerrit-Change-Number: 11396 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Thu, 06 Sep 2018 16:03:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER
Adam Holley has uploaded a new patch set (#23). ( http://gerrit.cloudera.org:8080/11314 ) Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER .. IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER This patch adds calls to automatically create or remove owner privileges in the catalog based on the statement. This is similar to the existing pattern where after privileges are granted in Sentry, they are created in the catalog directly instead of pulled from Sentry. The sentry-site.xml and hive-site.xml template files have been updated to enable usage of object ownership by default for easier development with a secured impala. When object ownership is enabled: CREATE DATABASE will grant the user OWNER privileges to that database. ALTER DATABASE SET OWNER will transfer the OWNER privileges to the new owner. DROP DATABASE will revoke the OWNER privileges from the owner. This will apply to DATABASE, TABLE, and VIEW. Example: If ownership is enabled, when a table is created, the creator is the owner, and Sentry will create owner privileges for the created table so the user can continue working with it without waiting for Sentry refresh. Inserts will be available immediately. Testing: - Created new custom cluster tests for object ownership Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 --- M bin/create-test-configuration.sh M bin/impala-config.sh M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableOrViewSetOwnerStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java M fe/src/main/java/org/apache/impala/analysis/DropDbStmt.java M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/SentryPolicyService.java M fe/src/main/java/org/apache/impala/util/SentryProxy.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M fe/src/test/resources/mysql-hive-site.xml.template M fe/src/test/resources/postgresql-hive-site.xml.template M fe/src/test/resources/sentry-site.xml.template A fe/src/test/resources/sentry-site_no_oo.xml.template A fe/src/test/resources/sentry-site_oo_nogrant.xml.template M testdata/bin/run-sentry-service.sh A tests/authorization/test_owner_privileges.py M tests/common/custom_cluster_test_suite.py M tests/common/impala_test_suite.py 31 files changed, 1,038 insertions(+), 41 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/11314/23 -- To view, visit http://gerrit.cloudera.org:8080/11314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 Gerrit-Change-Number: 11314 Gerrit-PatchSet: 23 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER
Adam Holley has uploaded a new patch set (#22). ( http://gerrit.cloudera.org:8080/11314 ) Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER .. IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER This patch adds calls to automatically create or remove owner privileges in the catalog based on the statement. This is similar to the existing pattern where after privileges are granted in Sentry, they are created in the catalog directly instead of pulled from Sentry. The sentry-site.xml and hive-site.xml template files have been updated to enable usage of object ownership by default for easier development with a secured impala. When object ownership is enabled: CREATE DATABASE will grant the user OWNER privileges to that database. ALTER DATABASE SET OWNER will transfer the OWNER privileges to the new owner. DROP DATABASE will revoke the OWNER privileges from the owner. This will apply to DATABASE, TABLE, and VIEW. Example: If ownership is enabled, when a table is created, the creator is the owner, and Sentry will create owner privileges for the created table so the user can continue working with it without waiting for Sentry refresh. Inserts will be available immediately. Testing: - Created new custom cluster tests for object ownership Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 --- M bin/create-test-configuration.sh M bin/impala-config.sh M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableOrViewSetOwnerStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java M fe/src/main/java/org/apache/impala/analysis/DropDbStmt.java M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/SentryPolicyService.java M fe/src/main/java/org/apache/impala/util/SentryProxy.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M fe/src/test/resources/mysql-hive-site.xml.template M fe/src/test/resources/postgresql-hive-site.xml.template M fe/src/test/resources/sentry-site.xml.template A fe/src/test/resources/sentry-site_no_oo.xml.template A fe/src/test/resources/sentry-site_oo_nogrant.xml.template M testdata/bin/run-sentry-service.sh A tests/authorization/test_owner_privileges.py M tests/common/custom_cluster_test_suite.py M tests/common/impala_test_suite.py 31 files changed, 1,040 insertions(+), 41 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/11314/22 -- To view, visit http://gerrit.cloudera.org:8080/11314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 Gerrit-Change-Number: 11314 Gerrit-PatchSet: 22 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER
Adam Holley has posted comments on this change. ( http://gerrit.cloudera.org:8080/11314 ) Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER .. Patch Set 20: (9 comments) http://gerrit.cloudera.org:8080/#/c/11314/22/common/thrift/JniCatalog.thrift File common/thrift/JniCatalog.thrift: http://gerrit.cloudera.org:8080/#/c/11314/22/common/thrift/JniCatalog.thrift@77 PS22, Line 77: // The server name for security privileges when authorization enabled. > nit: is Done http://gerrit.cloudera.org:8080/#/c/11314/22/common/thrift/JniCatalog.thrift@131 PS22, Line 131: // The server name for security privileges when authorization enabled. > nit: is (several more below) Done http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2660 PS22, Line 2660: Checks for authorization being enabled and returning the appropriate value for :* server name. > simplify: Returns the server name if authorization is enabled. Returns null Done. Yes. http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1005 PS22, Line 1005: > nit: use /** comments. Done http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/main/java/org/apache/impala/service/Frontend.java@488 PS22, Line 488: == TPrincipalType.USER) { > pull up to prev. line Makes the line length 92. http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/main/java/org/apache/impala/util/SentryPolicyService.java File fe/src/main/java/org/apache/impala/util/SentryPolicyService.java: http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/main/java/org/apache/impala/util/SentryPolicyService.java@485 PS22, Line 485: entry > Sentry? .. what's 'entry'? Done http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/test/resources/mysql-hive-site.xml.template File fe/src/test/resources/mysql-hive-site.xml.template: http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/test/resources/mysql-hive-site.xml.template@a116 PS22, Line 116: > I was going to comment on why I don't see this file removed but I don't see It's been unused as far as I know in development. It appears in the documentation so may be used in production. http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/test/resources/mysql-hive-site.xml.template@115 PS22, Line 115: hive.sentry.conf.url > why this name change in this patch? Because hive.access.conf.url doesn't work. impalad.INFO:W0906 10:27:26.005672 25891 HiveConf.java:4043] HiveConf of name hive.access.conf.url does not exist http://gerrit.cloudera.org:8080/#/c/11314/20/tests/authorization/test_owner_privileges.py File tests/authorization/test_owner_privileges.py: http://gerrit.cloudera.org:8080/#/c/11314/20/tests/authorization/test_owner_privileges.py@250 PS20, Line 250: __root_query_fail > The change is quite different than what I had in mind. Done -- To view, visit http://gerrit.cloudera.org:8080/11314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 Gerrit-Change-Number: 11314 Gerrit-PatchSet: 20 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 15:33:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/11183 ) Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions .. Patch Set 16: (4 comments) http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc File be/src/benchmarks/convert-timestamp-benchmark.cc: http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc@592 PS16, Line 592: (uint64_t)MICROS_PER_SEC > Is there a reason to use static_cast for primitive numeric types? IMO C sty AFAIK we are following the google style guide here: https://google.github.io/styleguide/cppguide.html#Casting The problem with C-style casting is that it is too strong, e.g. it can cast pointers to integers. However, the google style guide allows brace-initialization, which is as brief as a C-style cast and only does conversion, not casting. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc File be/src/runtime/timestamp-test.cc: http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@241 PS16, Line 241: 1000 > Attila mentioned this too. I think that 1000 is easier to read than MILLIS_ It's usually better to use named constants because they express the intent better. Here it is trivial why we need 1000 so I don't feel strong about it. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.h File be/src/runtime/timestamp-value.h: http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.h@188 PS16, Line 188: (int64_t)1000*1000*1000 > I was not happy to write this here, but walltime.h constants are not availa nit: I see, btw you could use C++14 digit separators: 1'000'000'000 http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.h@318 PS16, Line 318: int64_t SplitTime(int64_t* ticks) > I see the point, and I trust the compiler that it will eliminate the variab Using different variables lets you use more names to express the calculation better, i.e. to make the code easier to read. E.g. in this example 'nanos' means something different before and after the function call. That's my view, but I'm OK if you don't agree with it and keep it as it is. -- To view, visit http://gerrit.cloudera.org:8080/11183 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa Gerrit-Change-Number: 11183 Gerrit-PatchSet: 16 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 06 Sep 2018 12:29:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6442: Misleading file offset reporting in error messages.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11379 ) Change subject: IMPALA-6442: Misleading file offset reporting in error messages. .. Patch Set 2: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3122/ -- To view, visit http://gerrit.cloudera.org:8080/11379 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I35235e99ea9ceb0d31961dd3b8069f7194f5a2de Gerrit-Change-Number: 11379 Gerrit-PatchSet: 2 Gerrit-Owner: Yongjun Zhang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yongjun Zhang Gerrit-Comment-Date: Thu, 06 Sep 2018 11:50:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7542: find-fragment-instances misses to find the "root threads"
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11396 ) Change subject: IMPALA-7542: find-fragment-instances misses to find the "root threads" .. Patch Set 1: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/604/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/11396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I35ae1a6b384b002b343689469f02ceabd84af1b6 Gerrit-Change-Number: 11396 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Thu, 06 Sep 2018 11:12:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7542: find-fragment-instances misses to find the "root threads"
Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11396 Change subject: IMPALA-7542: find-fragment-instances misses to find the "root threads" .. IMPALA-7542: find-fragment-instances misses to find the "root threads" find-fragment-instances didn't show all the threads that worked on some fragment instance. It was because it checked the fragment instance of the parent thread instead of the actual thread, therefore missing the "root threads" that started working on the fragment instance. I modified the get_fragment_instances() function to check the local ThreadDebugInfo object of the threads. I tested it locally on a core file. Change-Id: I35ae1a6b384b002b343689469f02ceabd84af1b6 --- M lib/python/impala_py_lib/gdb/impala-gdb.py 1 file changed, 4 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/11396/1 -- To view, visit http://gerrit.cloudera.org:8080/11396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I35ae1a6b384b002b343689469f02ceabd84af1b6 Gerrit-Change-Number: 11396 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-6442: Misleading file offset reporting in error messages.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11379 ) Change subject: IMPALA-6442: Misleading file offset reporting in error messages. .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/603/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11379 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I35235e99ea9ceb0d31961dd3b8069f7194f5a2de Gerrit-Change-Number: 11379 Gerrit-PatchSet: 2 Gerrit-Owner: Yongjun Zhang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yongjun Zhang Gerrit-Comment-Date: Thu, 06 Sep 2018 08:48:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6442: Misleading file offset reporting in error messages.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11379 ) Change subject: IMPALA-6442: Misleading file offset reporting in error messages. .. Patch Set 2: (6 comments) http://gerrit.cloudera.org:8080/#/c/11379/2/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/11379/2/be/src/exec/hdfs-parquet-scanner.cc@1215 PS2, Line 1215: int64_t metadata_start = file_len - sizeof(int32_t) - sizeof(PARQUET_VERSION_NUMBER) - line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/11379/2/tests/query_test/test_scanners.py File tests/query_test/test_scanners.py: http://gerrit.cloudera.org:8080/#/c/11379/2/tests/query_test/test_scanners.py@406 PS2, Line 406: flake8: W291 trailing whitespace http://gerrit.cloudera.org:8080/#/c/11379/2/tests/query_test/test_scanners.py@406 PS2, Line 406: """ line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/11379/2/tests/query_test/test_scanners.py@407 PS2, Line 407: ; flake8: E703 statement ends with a semicolon http://gerrit.cloudera.org:8080/#/c/11379/2/tests/query_test/test_scanners.py@408 PS2, Line 408: ; flake8: E703 statement ends with a semicolon http://gerrit.cloudera.org:8080/#/c/11379/2/tests/query_test/test_scanners.py@409 PS2, Line 409: ; flake8: E703 statement ends with a semicolon -- To view, visit http://gerrit.cloudera.org:8080/11379 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I35235e99ea9ceb0d31961dd3b8069f7194f5a2de Gerrit-Change-Number: 11379 Gerrit-PatchSet: 2 Gerrit-Owner: Yongjun Zhang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yongjun Zhang Gerrit-Comment-Date: Thu, 06 Sep 2018 08:45:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6442: Misleading file offset reporting in error messages.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11379 ) Change subject: IMPALA-6442: Misleading file offset reporting in error messages. .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3122/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/11379 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I35235e99ea9ceb0d31961dd3b8069f7194f5a2de Gerrit-Change-Number: 11379 Gerrit-PatchSet: 2 Gerrit-Owner: Yongjun Zhang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yongjun Zhang Gerrit-Comment-Date: Thu, 06 Sep 2018 08:19:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6442: Misleading file offset reporting in error messages.
Yongjun Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/11379 ) Change subject: IMPALA-6442: Misleading file offset reporting in error messages. .. Patch Set 2: (8 comments) http://gerrit.cloudera.org:8080/#/c/11379/1/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/11379/1/be/src/exec/hdfs-parquet-scanner.cc@1165 PS1, Line 1165: sca > Now that there are two variables storing some kind of length it would be go Done in new rev. http://gerrit.cloudera.org:8080/#/c/11379/1/be/src/exec/hdfs-parquet-scanner.cc@1214 PS1, Line 1214: // file_len - 4-byte footer length field - 4-byte version number field - metadata size : int64_t metadata_start = file_len - sizeof(int32_t) - > not your change but can you please fix the line wrapping while you're here? Done in new rev. http://gerrit.cloudera.org:8080/#/c/11379/1/be/src/exec/hdfs-parquet-scanner.cc@1228 PS1, Line 1228: const HdfsFileDesc* file_desc = scan_node_->GetFileDesc(partition_id, filename()); > If both are NULL we won't catch it now. Are we sure that cannot happen? Sho Thanks a lot for the review Lars. The assumption is that stream_->file_desc() is not null, if it's null, the call to get file length before this branch would have failed already. If we want, we could add a DCHECK in the beginning of the function. I added the DCHECK here for multiple reasons: 1. only the logic in the existing code here tries to find file_desc. 2. this branch is an "unlikely" branch. Code out side this branch has no need to find file_desc. 3. the partition_id variable is only used in this branch. Another version of my code did move these few lines to before this branch, but I chose to move it here because otherwise it's going to be some additional work for other code that doesn't need to reach this "unlikely" branch and doesn't use the partition_id variable. What do you think? http://gerrit.cloudera.org:8080/#/c/11379/1/be/src/exec/hdfs-parquet-scanner.cc@1267 PS1, Line 1267: "at file of > nit: wrap after file_length done in new rev. http://gerrit.cloudera.org:8080/#/c/11379/1/testdata/data/corrupt_footer_len_decr.parquet File testdata/data/corrupt_footer_len_decr.parquet: http://gerrit.cloudera.org:8080/#/c/11379/1/testdata/data/corrupt_footer_len_decr.parquet@1 PS1, Line 1: PAR1L Ò , &n%c^f&6&6 (Ò Ò , ,Hschema %c%$ &n%c^f&6&6 (Ò Ò , f (Oimpala version 2.11.0-SNAPSHOT (build 26b2cc85d2084e9e464669b2a67a8015ede173e1) ½ PAR1 > Add a description for both test files to the README file in the same folder done in new rev. http://gerrit.cloudera.org:8080/#/c/11379/1/tests/query_test/test_scanners.py File tests/query_test/test_scanners.py: http://gerrit.cloudera.org:8080/#/c/11379/1/tests/query_test/test_scanners.py@401 PS1, Line 401: def corrupt_footer_len_common(self, vector, unique_database, testname_postfix): > The function should have a comment. Please find a more descriptive name for Done in new rev. http://gerrit.cloudera.org:8080/#/c/11379/1/tests/query_test/test_scanners.py@412 PS1, Line 412: create_table_and_copy_files(self.client, > nit: single line Done in new rev. http://gerrit.cloudera.org:8080/#/c/11379/1/tests/query_test/test_scanners.py@420 PS1, Line 420: > nit: capitalization Done in new rev. -- To view, visit http://gerrit.cloudera.org:8080/11379 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I35235e99ea9ceb0d31961dd3b8069f7194f5a2de Gerrit-Change-Number: 11379 Gerrit-PatchSet: 2 Gerrit-Owner: Yongjun Zhang Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yongjun Zhang Gerrit-Comment-Date: Thu, 06 Sep 2018 08:11:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6442: Misleading file offset reporting in error messages.
Yongjun Zhang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11379 Change subject: IMPALA-6442: Misleading file offset reporting in error messages. .. IMPALA-6442: Misleading file offset reporting in error messages. The error message described in IMPALA-6442 incorrectly reported the file offset where the Parquet footer starts, as if the offset is counted from the file end instead of from the file beginning. The fix changed the reported file offset to be counted from the beginning of the Parquet file. Testing: Create a small table that contains one row of data with a single column that's bigint and store it as Parquet. Manually changed the footer size field to be 1) smaller than the original footer size by 1, to trigger the error message fixed by this jira to be printed, to verify that the fix functions correctly; 2) bigger than the file size, thus to trigger another related error message to be printed. Change-Id: I35235e99ea9ceb0d31961dd3b8069f7194f5a2de --- M be/src/exec/hdfs-parquet-scanner.cc M testdata/data/README A testdata/data/corrupt_footer_len_decr.parquet A testdata/data/corrupt_footer_len_incr.parquet M tests/query_test/test_scanners.py 5 files changed, 65 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/11379/2 -- To view, visit http://gerrit.cloudera.org:8080/11379 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I35235e99ea9ceb0d31961dd3b8069f7194f5a2de Gerrit-Change-Number: 11379 Gerrit-PatchSet: 2 Gerrit-Owner: Yongjun Zhang Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong